Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Problem description:
Find the best algorithm to
execute
⋈1 and
⋈2
for M = 101 -
click to pull out
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Simplified Problem :
Solution method:
brute force search for the
min. cost algorithms that
can operate with M = 101
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 1:
check if
we can use a 1-pass algorithm
for ⋈1:
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 2:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈1:
Next,
we find the
best algorithm for
⋈2
(considering the buffer utilization of
⋈1 !)
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Determining the
buffer utilization by
the ⋈1
execution:
(⋈2 is
inactive and
will not use any
buffers)
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Determining the
buffer utilization by
the ⋈1
execution:
Next:
we run pass 2 of the
2-pass (hashing-based) algorithm
(= a 1-pass algorithm on
each Ri and Si)
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Determining the
buffer utilization by
the ⋈1
execution:
Note:
pass 2 is run
for every chunk
Ri and Si and
pass 2 will
output tuples to
⋈2
(→ becomes active !)
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Cost (so far):
Click on image to
pull out (keep running cost)
Next:
determine the best suitable
join algorithm for
⋈2
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Prelude to considering the
implementation algorithm for
⋈2:
⋈1 is
actively using its buffers
to produce tuples for
⋈2:
I.e.:
we must find the
best algorithm for
⋈2
using M = 101 − 51 =
50 buffers !!
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 3:
check if
we can use a 1-pass
algorithm
for ⋈2:
Therefore:
we cannot use
the 1-pass join algorithm
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
Note:
B(R⋈1S) > 5000.
Therfore:
B(R⋈1S)/50 > 100
(each chunk of the
first relation of
⋈2
> 100 blks)
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
Therefore:
we cannot use
2-pass algorithm to execute
⋈2
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 5:
check if
we can use a 3-pass (hashing based)
algorithm
for ⋈2:
Note:
each "sub-sub-relation chunk is
about B(R⋈S)/(502) blks
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 5:
check if
we can use a 3-pass (hashing based)
algorithm
for ⋈2:
Next, we
hash the
2nd input relation T in the
same manner...
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 5:
check if
we can use a 3-pass (hashing based)
algorithm
for ⋈2:
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 5:
check if
we can use a 3-pass (hashing based)
algorithm
for ⋈2:
Notice that:
size of each sub-sub-relation of
R⋈1S is
> 5000/2500 = 2 blks
size of each sub-sub-relation of
T is
= 10000/2500 = 4 blks
So
pass 3
may
use Tij (chunks of T) as
build relation if
R⋈1S
> 10000 !
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Step 5:
check if
we can use a 3-pass (hashing based)
algorithm
for ⋈2:
Therefore:
we can
always use
the 3-pass algorithm for
bowtie;2 when
B(R⋈1S) > 5000
Example on selection implementation algorithm for a query plan -
B(R⋈S) large
Cost analysis:
❮
❯