Slideshow:
because:
Graphically:
Each sub-sub-relation must fit in ≤ M-1 buffers

   Size of each sub-sub-relation = B(R)/(M-1)^2 blocks

   Therefore:   B(R)/(M-1)^2 ≤ M-1 blocks

   Or:          B(R) ≤ (M-1)^3 blocks
Pass 1: read R + write M-1 sub-relations                             = 2 B(R)
Pass 2: read M-1 sub-relations + write (M-1)^2 sub-sub-relations     = 2 B(R)
Pass 3: run one-pass algorithm                                       =   B(R)

Total # disk IOs                                                     = 5 B(R)
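To make the three-pass bounds concrete, here is a small sketch that checks the size condition B(R) ≤ (M-1)^3 and tallies the disk IOs pass by pass. The function name `three_pass_summary` and the sample values (M = 101 buffers, B(R) = 500,000 blocks) are illustrative assumptions, not taken from the slides, and the sketch assumes hashing spreads the blocks evenly over the buckets.

```python
import math

def three_pass_summary(b_r: int, m: int) -> dict:
    """Check the 3-pass size condition and tally disk IOs.

    b_r: B(R), the number of blocks in the relation (assumed value).
    m:   M, the number of available memory buffers (assumed value).
    Assumes hashing distributes blocks evenly over the M-1 buckets.
    """
    fanout = m - 1                                  # M-1 buckets per partitioning pass
    sub_size = math.ceil(b_r / fanout)              # size of each sub-relation after pass 1
    sub_sub_size = math.ceil(b_r / fanout ** 2)     # size of each sub-sub-relation after pass 2
    return {
        "sub_relation_blocks": sub_size,
        "sub_sub_relation_blocks": sub_sub_size,
        "fits": b_r <= fanout ** 3,                 # B(R) <= (M-1)^3
        "disk_ios": 2 * b_r + 2 * b_r + b_r,        # passes 1, 2 and 3 = 5 B(R)
    }

summary = three_pass_summary(500_000, 101)
print(summary)
```

With these sample numbers, each sub-sub-relation comes to 50 blocks, which fits comfortably in the 100 buffers available for the one-pass step.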
Pass 1: Hash input relation(s) into M-1 buckets
        (Same as pass 1 of the 2-pass algorithm)

   Cost: 2 B(R) disk IOs
   Size of each sub-relation at end: B(R)/(M-1) blocks
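The partitioning step of pass 1 can be sketched in memory as follows. This is only a stand-in for the real thing: an actual implementation would flush each bucket's buffer to disk whenever it fills, whereas the hypothetical helper `hash_partition` below just collects tuples in Python lists. The `key` parameter and the sample input are assumptions made for illustration.

```python
from collections import defaultdict

def hash_partition(tuples, m, key=lambda t: t):
    """Split tuples into at most M-1 buckets by hashing the key.

    In-memory stand-in for pass 1: a real system would write each
    bucket to disk as its buffer fills instead of keeping lists.
    """
    buckets = defaultdict(list)
    for t in tuples:
        bucket_no = hash(key(t)) % (m - 1)   # one of the M-1 buckets
        buckets[bucket_no].append(t)
    return dict(buckets)

# With M = 5 buffers there are M-1 = 4 buckets.
parts = hash_partition(range(20), m=5)
for bucket_no in sorted(parts):
    print(bucket_no, parts[bucket_no])
```

Tuples with equal keys always land in the same bucket, which is what lets later passes process each bucket independently.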
# disk IOs = 2(k-1) B(R) + B(R) = (2k-1) B(R)   (if we don't count final output IO)
# disk IOs = 2 k B(R)                           (if we include final output IO)

Max file size ≤ (M-1)^k blocks
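The general formulas above can be turned into a short calculator. This is a sketch under stated assumptions: the function names `k_pass_disk_ios` and `min_passes` are made up for illustration, the final output is taken to cost one extra B(R) when it is included, and hashing is assumed to split the data evenly.

```python
def k_pass_disk_ios(b_r: int, k: int, include_output: bool = False) -> int:
    """Disk IOs for a k-pass algorithm on a relation of b_r blocks."""
    ios = 2 * (k - 1) * b_r + b_r      # k-1 partitioning passes, then one read pass
    if include_output:
        ios += b_r                     # assumed cost of writing the final output
    return ios

def min_passes(b_r: int, m: int) -> int:
    """Smallest k such that b_r <= (M-1)^k, given m memory buffers."""
    if m < 3:
        raise ValueError("need at least 3 buffers so that M-1 >= 2")
    k = 1
    while (m - 1) ** k < b_r:
        k += 1
    return k

k = min_passes(500_000, 101)                              # 3 passes
print(k, k_pass_disk_ios(500_000, k))                     # 3 2500000
print(k_pass_disk_ios(500_000, k, include_output=True))   # 3000000
```

For k = 3 the first formula gives 5 B(R), which matches the three-pass total.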