Slideshow:
Each sub-sub-relation must fit in ≤ M-1 buffers
Size of each sub-sub-relation = B(R)/(M-1)^2 blocks
Therefore: B(R)/(M-1)^2 ≤ M-1 blocks
Or: B(R) ≤ (M-1)^3 blocks
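The size limit can be checked numerically. This is a minimal sketch; the function name and the sample value M = 101 are illustrative assumptions, not part of the slides.

```python
def fits_three_pass(b_r: int, m: int) -> bool:
    """Return True if a relation of b_r blocks satisfies B(R) <= (M-1)^3."""
    return b_r <= (m - 1) ** 3

# Example (assumed numbers): M = 101 buffers gives a limit of 100^3 = 1,000,000 blocks.
print(fits_three_pass(1_000_000, 101))  # True
print(fits_three_pass(1_000_001, 101))  # False
```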
|
Pass 1: read R + write M-1 sub-relations = 2 B(R)
Pass 2: read M-1 sub-relations + write (M-1)^2 sub-sub-relations = 2 B(R)
Pass 3: run one-pass algorithm = B(R)
Total # disk IOs = 5 B(R)
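The per-pass accounting can be written out as a short sketch. It assumes, as the slide does, that hashing keeps the total data size at about B(R) blocks (partially filled blocks are ignored); the function name is hypothetical.

```python
def three_pass_io(b_r: int) -> int:
    """Disk IOs of the 3-pass hash algorithm, not counting the final output."""
    pass1 = 2 * b_r  # read R, write the M-1 sub-relations
    pass2 = 2 * b_r  # read the sub-relations, write the (M-1)^2 sub-sub-relations
    pass3 = b_r      # read every sub-sub-relation once for the one-pass algorithm
    return pass1 + pass2 + pass3

print(three_pass_io(1000))  # 5000
```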
|
Pass 1: Hash input relation(s) into M-1 buckets
(Same as pass 1 of the 2-pass algorithm)
Cost: 2 B(R) disk IOs
Size of each sub-relation at end: B(R)/(M-1) blocks
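The partitioning step might look like the following in-memory sketch. It is only an illustration: a real implementation would keep one output buffer per bucket and flush full buffers to disk, and the function name, the `key` parameter, and the sample data are assumptions.

```python
def hash_partition(tuples, m, key=lambda t: t):
    """Distribute tuples over M-1 buckets by hashing the chosen key."""
    buckets = [[] for _ in range(m - 1)]
    for t in tuples:
        buckets[hash(key(t)) % (m - 1)].append(t)
    return buckets

# With M = 5 buffers, 20 integer tuples are spread over 4 buckets.
buckets = hash_partition(range(20), m=5)
print([len(b) for b in buckets])  # [5, 5, 5, 5]
```

With a hash function that spreads tuples evenly, each bucket holds about 1/(M-1) of the input, which is where the B(R)/(M-1) size comes from.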
|
# disk IOs = 2(k-1) B(R) + B(R) = (2k-1) B(R) (if we don't count final output IO)
= 2 k B(R) (if we include final output IO)
Max file size ≤ (M-1)^k blocks
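The general formulas can be sketched as follows. The function names are hypothetical, and counting the final output as B(R) blocks is the assumption implied by the 2 k B(R) figure.

```python
def k_pass_io(b_r: int, k: int, count_output: bool = False) -> int:
    """Disk IOs of a k-pass hash algorithm on a relation of b_r blocks."""
    io = 2 * (k - 1) * b_r + b_r  # k-1 read+write passes, then one read-only pass
    if count_output:
        io += b_r                 # assumes the final output is B(R) blocks
    return io

def min_passes(b_r: int, m: int) -> int:
    """Smallest k such that B(R) <= (M-1)^k."""
    if m < 3:
        raise ValueError("need at least 3 buffers so that M-1 >= 2")
    k = 1
    while (m - 1) ** k < b_r:
        k += 1
    return k

print(k_pass_io(1000, 3))                     # 5000
print(k_pass_io(1000, 3, count_output=True))  # 6000
print(min_passes(1_000_000, 101))             # 3
print(min_passes(1_000_001, 101))             # 4
```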
|