M << B(S) + 1 (<< means: much smaller than)
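Therefore: S does not fit in the M−1 buffers available to hold it, so the one-pass algorithm cannot be used directly.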
Possible solution: run the one-pass cartesian product algorithm multiple times, each run using the next (M−1) blocks of the S relation.
Let M = # available buffers;

/* -------------------------------------------
   Outer loop: read M-1 blocks of S
   ------------------------------------------- */
while ( S has more data blocks )
{
    read the next M-1 data blocks of S (fewer if S runs out);  // B(S) > M-1 !!!

    Rewind R;    // Set read pointer back to the start of R

    /* -------------------------------------------
       Inner loop: read through R and compute ×
       ------------------------------------------- */
    while ( R has more data blocks )
    {
        read 1 data block of R;

        for ( each tuple s ∈ the M-1 blocks of S and
              each tuple t ∈ the 1 block of R ) do
        {
            output (s, t);
        }
    }
}
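The pseudocode can be tried out with a small in-memory simulation. This is only a sketch under stated assumptions: a relation is modeled as a Python list of blocks, a block is a list of tuples, and "reading a block" is simulated by incrementing a counter. The names `nested_loop_product` and `io_count` are hypothetical and not part of the original notes.

```python
from typing import Any, List, Tuple

Block = List[Any]  # assumption: one disk block = a list of tuples


def nested_loop_product(R: List[Block], S: List[Block],
                        M: int) -> Tuple[List[Tuple[Any, Any]], int]:
    """Simulate the nested-loop cartesian product with M buffers.

    Returns the list of output pairs (s, t) and the number of
    simulated block reads.
    """
    if M < 2:
        raise ValueError("need at least 2 buffers: 1 for R, M-1 for S")

    io_count = 0
    output = []
    chunk = M - 1  # buffers reserved for S

    # Outer loop: read the next M-1 blocks of S (fewer at the end).
    for start in range(0, len(S), chunk):
        s_blocks = S[start:start + chunk]
        io_count += len(s_blocks)
        s_tuples = [s for block in s_blocks for s in block]

        # Inner loop: "rewind" R and scan it one block at a time.
        for r_block in R:
            io_count += 1
            for s in s_tuples:
                for t in r_block:
                    output.append((s, t))

    return output, io_count


# Tiny example: B(R) = 2, B(S) = 3, M = 3, so S is read in 2 chunks.
R = [["r1", "r2"], ["r3"]]
S = [["s1"], ["s2"], ["s3"]]
pairs, ios = nested_loop_product(R, S, M=3)
print(len(pairs), ios)
```

In this toy run S is read once (3 blocks) and R is scanned twice (2 × 2 blocks), which gives 7 simulated block reads for 9 output pairs.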
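The algorithm reads S once: # disk I/Os = B(S)
The algorithm scans R once for every M−1 blocks of S: # disk I/Os = ⌈B(S)/(M−1)⌉ × B(R)
Total: # disk I/Os = B(S) + ⌈B(S)/(M−1)⌉ × B(R)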
The cost of the nested-loop cartesian product algorithm is not symmetric:
Cost(R × S) ≠ Cost(S × R)
(Cost = number of disk blocks read)
Example: B(R) = 10,000, B(S) = 5,000, M = 101

Configuration 1:
(1) Use 100 buffers to hold tuples of S
(2) Use 1 buffer to scan R

Configuration 2:
(1) Use 100 buffers to hold tuples of R
(2) Use 1 buffer to scan S
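To compare the two configurations above (S held in 100 buffers, then R held in 100 buffers), the sketch below counts block reads. It assumes the relation held in the M−1 buffers is read once, and the other relation is scanned once per chunk of M−1 blocks. The function name `nested_loop_cost` is hypothetical.

```python
import math


def nested_loop_cost(b_outer: int, b_inner: int, m: int) -> int:
    """Block reads when the outer relation is held in m-1 buffers.

    b_outer: number of blocks of the relation read in chunks of m-1
    b_inner: number of blocks of the relation scanned once per chunk
    """
    chunks = math.ceil(b_outer / (m - 1))
    return b_outer + chunks * b_inner


B_R, B_S, M = 10_000, 5_000, 101

# 100 buffers hold S, 1 buffer scans R
cost_s_outer = nested_loop_cost(B_S, B_R, M)
# 100 buffers hold R, 1 buffer scans S
cost_r_outer = nested_loop_cost(B_R, B_S, M)

print(cost_s_outer)  # 5,000 + 50 * 10,000 = 505,000
print(cost_r_outer)  # 10,000 + 100 * 5,000 = 510,000
```

Under these assumptions, holding the smaller relation (S) in the 100 buffers is slightly cheaper, because the scan term is the same in both cases (50 × 10,000 = 100 × 5,000) and only the single read of the buffered relation differs.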