Slideshow:
R ⋈cond S

   R = R( name, dno )
   S = S( dnumber, dname )

   R = { (john, 1), (jane, 4) }
   S = { (1, Research), (4, Payroll) }

   R ⋈dno=dnumber S = { (john, 1, 1, Research), (jane, 4, 4, Payroll) }
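To make the definition concrete, here is a minimal sketch that computes the join by testing the condition on every pair of tuples. It illustrates only what the join means and is not an efficient algorithm. The function name `theta_join` and the representation of tuples as Python tuples are assumptions made for this illustration.

```python
def theta_join(R, S, cond):
    """Return the concatenation r + s of every pair (r, s) for which cond(r, s) holds."""
    return [r + s for r in R for s in S if cond(r, s)]


R = [("john", 1), ("jane", 4)]          # R(name, dno)
S = [(1, "Research"), (4, "Payroll")]   # S(dnumber, dname)

# Join condition: dno = dnumber (position 1 in R, position 0 in S)
result = theta_join(R, S, lambda r, s: r[1] == s[0])
print(result)
```

The result contains the same two tuples as the example above: (john, 1, 1, Research) and (jane, 4, 4, Payroll).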
Assumption:
   S is the smaller relation, and the search structure H built on S fits in main memory.
Algorithm: (the classic hash-join algorithm)
   initialize a search structure H on JOIN attributes of S;

   /* ===========================================================
      Phase 1: Use 1 buffer and scan the SMALLER relation first.
               Build a search structure on the SMALLER relation
               to help speed up finding common elements.
      =========================================================== */
   while ( S has more data blocks )
   {
      read 1 data block in buffer b;

      for ( each tuple t ∈ b )
      {
         insert t in H;       // Build search structure
                              // (hash table or search tree)
      }
   }

   /* ========================================================
      Phase 2: Output only those tuples in R that have join
               attrs equal to some tuple in S

               We use the search structure H to implement
               the test t(join attrs) ∈ H efficiently !!!

               For H, we can use a hash table or some binary search tree
      ========================================================= */
   while ( R has more data blocks )
   {
      read 1 data block in buffer b;

      for ( each tuple t ∈ b )
      {
         if ( t(join attrs) ∈ H (search structure) )
         {
            for ( each s ∈ Bucket[ t(join attrs) ] )
               output (t, s);       // successful join
         }
      }
   }
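The two phases above can be sketched in Python as follows. This is a minimal illustration and not a definitive implementation. It assumes that each relation is given as a list of blocks (each block a list of tuples), that a Python dictionary plays the role of the search structure H, and that all of S fits in memory. The names `one_pass_hash_join`, `r_key`, and `s_key` are hypothetical.

```python
from collections import defaultdict


def one_pass_hash_join(R_blocks, S_blocks, r_key, s_key):
    """Join R and S on r_key(t) == s_key(s), where S is the smaller relation."""
    # Phase 1: scan S one block at a time and build the hash table H,
    # keyed on the join attribute of S.
    H = defaultdict(list)
    for block in S_blocks:
        for s in block:
            H[s_key(s)].append(s)

    # Phase 2: scan R one block at a time and probe H with each tuple.
    out = []
    for block in R_blocks:
        for t in block:
            # H.get does not create an entry for a missing key.
            for s in H.get(r_key(t), []):
                out.append(t + s)   # successful join
    return out


# The example relations, split into blocks by hand.
R_blocks = [[("john", 1)], [("jane", 4)]]        # R(name, dno)
S_blocks = [[(1, "Research"), (4, "Payroll")]]   # S(dnumber, dname)

joined = one_pass_hash_join(
    R_blocks, S_blocks,
    r_key=lambda t: t[1],   # dno
    s_key=lambda s: s[0],   # dnumber
)
print(joined)
```

Each tuple of R costs one dictionary lookup instead of a comparison against every tuple of S, which is the purpose of building H in Phase 1.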
Buffer utilization when there are M buffers available:
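   1 buffer is used to read the data blocks (first of S, then of R); the remaining M − 1 buffers hold the search structure H built on S.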