Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Problem description:
Find the best algorithm to
execute
⋈1 and
⋈2
for M = 101 -
click to pull out
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Simplified Problem :
Solution method:
brute force search for the
min. cost algorithms that
can operate with M = 101
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 1:
check if
we can use a 1-pass algorithm
for ⋈1:
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 2:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈1:
Next,
we find the
best algorithm for
⋈2
(considering the buffer utilization of
⋈1 !)
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Determining the
buffer utilization by
the ⋈1
execution:
(⋈2 is
inactive and
will not use any
buffers)
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Determining the
buffer utilization by
the ⋈1
execution:
Next:
we run pass 2 of the
2-pass (hashing-based) algorithm
(= a 1-pass algorithm on
each Ri and Si)
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Determining the
buffer utilization by
the ⋈1
execution:
Note:
pass 2 is run
for every chunk
Ri and Si and
pass 2 will
output tuples to
⋈2
(→ becomes active !)
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Cost (so far):
Click on image to
pull out (keep running cost)
Next:
determine the best suitable
join algorithm for
⋈2
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Prelude to considering the
implementation algorithm for
⋈2:
⋈1 is
actively using its buffers
to produce tuples for
⋈2:
I.e.:
we must find the
best algorithm for
⋈2
using M = 101 − 51 =
50 buffers !!
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 3:
check if
we can use a 1-pass
algorithm
for ⋈2:
Therefore:
we cannot use
the 1-pass join algorithm
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
Note:
B(R⋈1S) ≤ 5000.
Therfore:
B(R⋈1S)/50 ≤ 100
(each chunk of the
first relation of
⋈2
≤ 100 blks)
Comment:
due to buffer recycling we
cannot test
for remaining 50 buffer ≥
sqrt( B(R⋈2S) )
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
(This is the buffer recylcing step
that I referred to in the
last slide....)
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
(Because otherwise, the
tuples with the
same join key will
be hashed into
a different bucket....)
Notice that:
size of each sub-relation of
R⋈1S is
≤ 5000/50 = 100 blks
size of each sub-relation of
T is
= 200 blks
So
pass 2 will
use chunks of R⋈1S as
build relation !
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Step 4:
check if
we can use a 2-pass (hashing based)
algorithm
for ⋈2:
Example on selection implementation algorithm for a query plan -
B(R⋈S) medium
Cost analysis:
❮
❯