Tweaking the performance of the Zig-zag join algorithm
Observation:
Tweak:
We will only access the data blocks (containing the joining tuples) when the current index entries of R and S have the same search key value.
Assumption:
Accessing the (smaller) index files requires a negligible number of disk IO operations.
(We will ignore these operations in the total number of disk IOs.)
The Zig-zag Join Algorithm
Read the index entries of R with the next smallest key r ;
Read the index entries of S with the next smallest key s ;

while ( R ≠ empty and S ≠ empty ) do
{
   if ( search key r < search key s )
   {
      // r has no partner in S: advance in R's index only
      Read the index entries of R with the next smallest key r ;
   }
   else if ( search key s < search key r )
   {
      // s has no partner in R: advance in S's index only
      Read the index entries of S with the next smallest key s ;
   }
   else   // search key r = search key s
   {
      Use M-1 buffers to read in all tuples of R with search key r ;
      Use 1 buffer to scan in all tuples of S with the same join value ;
      Join the tuples in Buf(R) and Buf(S) ;

      Read the index entries of R with the next smallest key r ;
      Read the index entries of S with the next smallest key s ;
   }
}
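The loop above can be sketched in Python. This is only an illustration under stated assumptions: each index is modelled as a list of (key, tuple ids) pairs sorted by key, the data files are modelled as dictionaries, and the names `zigzag_join`, `r_index`, `r_data`, `s_index` and `s_data` are hypothetical. Buffer limits are not modelled; the counter only records how many data tuples were fetched.

```python
def zigzag_join(r_index, s_index, r_data, s_data):
    """Zig-zag over two sorted indexes and fetch data only on a key match.

    r_index, s_index: lists of (key, [tuple ids]) sorted by key (stand-ins for index files).
    r_data, s_data:   dicts mapping tuple id -> tuple (stand-ins for data blocks).
    Returns (list of joined pairs, number of data tuples fetched).
    """
    i = j = 0
    fetched = 0
    result = []
    while i < len(r_index) and j < len(s_index):
        r_key, r_ids = r_index[i]
        s_key, s_ids = s_index[j]
        if r_key < s_key:
            i += 1                      # advance R's index; no data access
        elif s_key < r_key:
            j += 1                      # advance S's index; no data access
        else:
            # Keys match: only now are the data files touched.
            r_tuples = [r_data[t] for t in r_ids]   # plays the role of the M-1 buffers
            fetched += len(r_ids)
            for t in s_ids:                         # plays the role of the 1 buffer for S
                s_tuple = s_data[t]
                fetched += 1
                result.extend((r, s_tuple) for r in r_tuples)
            i += 1
            j += 1
    return result, fetched
```

With keys 1, 3, 4, 4 in R and keys 2, 4 in S, only the three tuples with key 4 are fetched; the tuples with keys 1, 2 and 3 are never read from the data files.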
The Zig-zag Join Algorithm - Example (using a non-clustering index)
Read the index files and find the next smallest join values:
(r = 1) < (s = 2) → read the next smallest join value from R's index file
Read the index files and find the next smallest join values:
(r = 3) > (s = 2) → read the next smallest join value from S's index file
Read the index files and find the next smallest join values:
(r = 3) < (s = 4) → read the next smallest join value from R's index file
Read the index files and find the next smallest join values:
(r = 4) = (s = 4) → join the tuples from relations R and S
(r = 4) = (s = 4) → use M−1 buffers to store all the joining tuples from R
(r = 4) = (s = 4) → use 1 buffer to scan in all the joining tuples from S
Notice that:
you must use 1 buffer to access the data blocks from S (the other M−1 buffers hold the joining tuples from R)
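The sequence of comparisons in this example can be reproduced with a small Python sketch. It is a hedged illustration: the key lists contain only the values that appear in the example (1, 3, 4 for R and 2, 4 for S), the real index files may hold more keys, and the name `zigzag_trace` is hypothetical.

```python
def zigzag_trace(r_keys, s_keys):
    """Return the comparison steps made while zig-zagging over two sorted key lists."""
    i = j = 0
    steps = []
    while i < len(r_keys) and j < len(s_keys):
        r, s = r_keys[i], s_keys[j]
        if r < s:
            steps.append(f"(r = {r}) < (s = {s}): advance R")
            i += 1
        elif r > s:
            steps.append(f"(r = {r}) > (s = {s}): advance S")
            j += 1
        else:
            steps.append(f"(r = {r}) = (s = {s}): join")
            i += 1
            j += 1
    return steps


for step in zigzag_trace([1, 3, 4], [2, 4]):
    print(step)
# (r = 1) < (s = 2): advance R
# (r = 3) > (s = 2): advance S
# (r = 3) < (s = 4): advance R
# (r = 4) = (s = 4): join
```

Only the last step touches the data files; the first three steps read index entries only.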
IO cost of the Zig-zag Join Algorithm using a non-clustering index
Observation:
When a join value is found, we access R's tuples:
Worst case: we will access T(R) blocks
(if accessing 1 tuple results in 1 disk IO operation)
When a join value is found, we access S's tuples:
Worst case: we will access T(S) blocks
(if accessing 1 tuple results in 1 disk IO operation)
Total IO cost (worst case):
T(R) + T(S) disk IO operations (index accesses are ignored, per the assumption above)
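The count can be checked with a rough Python sketch. It assumes, as above, that index accesses are free and that every fetched tuple costs one disk IO (the worst case for a non-clustering index); the function name `worst_case_data_ios` and the key lists are hypothetical.

```python
def worst_case_data_ios(r_keys, s_keys):
    """Count data-block reads when every fetched tuple costs one disk IO.

    r_keys, s_keys: the join-key value of every tuple in R and S (one entry per tuple).
    Only tuples whose key occurs in both relations are ever fetched.
    """
    common = set(r_keys) & set(s_keys)
    r_ios = sum(1 for k in r_keys if k in common)
    s_ios = sum(1 for k in s_keys if k in common)
    return r_ios + s_ios


# Only the tuples with key 4 are fetched: 2 from R and 1 from S.
print(worst_case_data_ios([1, 3, 4, 4], [2, 4]))   # 3
# Every tuple finds a partner: the count reaches T(R) + T(S) = 3 + 2.
print(worst_case_data_ios([4, 4, 5], [4, 5]))      # 5
```

The bound T(R) + T(S) is reached only when every tuple of both relations takes part in the join; when few keys match, far fewer data blocks are read.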