The set intersection (∩_S) operator

Note: the search (hash) structure need only record the presence of a key (see the teaching note on this webpage)
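A minimal sketch of this note in Python, assuming tuples are hashable values. The names `build_presence_index` and `s_tuples` are illustrative and do not come from the lecture. A built-in `set` stores keys only, which is all the membership test needs:

```python
def build_presence_index(s_tuples):
    """Return a hash structure that records only which keys occur in S."""
    index = set()
    for t in s_tuples:
        index.add(t)  # no count or payload is stored, only the key itself
    return index


index = build_presence_index(["b", "c", "d"])
print("b" in index)  # True
print("a" in index)  # False
```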

However, we can also use the bag intersection (∩_B) algorithm to compute ∩_S, because a set is a bag in which every key occurs exactly once.

The set intersection (∩_S) operator - assumptions

In the slide presentation, I will use the ∩_B algorithm to compute ∩_S...

The one-pass set intersection (∩_S) algorithm - Example

{a,b,c,e} ∩_S {b,c,d}:

Phase 1: read S and build the search index with (key, #occurrences)

Phase 2: read R; for each key found in the search structure with count > 0, output the key and decrement its count

Phase 2: a ∉ search structure → discard a

Phase 2: count(b) > 0 → output b (decrementing count(b) is optional because b is unique in a set)
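The two phases of the example can be sketched in Python as follows. This is an illustration rather than the lecture's own code: the function name `intersect_with_counts` is hypothetical, relations are assumed to fit in plain lists, and `collections.Counter` stands in for the (key, #occurrences) search index.

```python
from collections import Counter


def intersect_with_counts(r_tuples, s_tuples):
    """Bag-intersection procedure; on sets it yields the set intersection."""
    # Phase 1: read S and record (key, #occurrences).
    counts = Counter(s_tuples)

    # Phase 2: read R; output a key only while its count is still positive.
    result = []
    for t in r_tuples:
        if counts[t] > 0:      # a missing key has count 0, so it is discarded
            result.append(t)
            counts[t] -= 1     # needed for bags; harmless for sets
    return result


print(intersect_with_counts(["a", "b", "c", "e"], ["b", "c", "d"]))  # ['b', 'c']
```

With set inputs every count is 1, so the decrement never changes which keys are output. It matters only when the same procedure is run on bags.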

initialize a search structure H on all attributes of S;

/* ===========================================================
   Phase 1: Use 1 buffer and scan the SMALLER relation first.
            Build a search structure on the SMALLER relation
            to help speed up finding common elements.
   =========================================================== */
while ( S has more data blocks )
{
    read 1 data block in buffer b;

    for ( each tuple t ∈ b )
    {
        insert t in H;    // Build search structure
                          // (hash table or search tree)
    }
}

/* ========================================================
   Phase 2: Output only those tuples in R that are also in S

            We use the search structure H to implement
            the test t ∈ H efficiently !!!

            For H, we can use hash table or some bin. search tree
   ========================================================= */
while ( R has more data blocks )
{
    read 1 data block in buffer b;

    for ( each tuple t ∈ b )
    {
        if ( t ∈ H )
        {
            output t;     // t in R and S
        }
    }
}

One-pass Algorithm for ∩_S (very similar to ∪_S)
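For readers who want to run the algorithm, here is a hedged Python sketch of the same two phases. It assumes relations are in-memory lists and simulates block reads with a hypothetical helper `read_blocks`. The `block_size` parameter and the function names are illustrative, and a Python `set` plays the role of the search structure H.

```python
def read_blocks(relation, block_size):
    """Yield the relation one block (a list of tuples) at a time."""
    for i in range(0, len(relation), block_size):
        yield relation[i:i + block_size]


def one_pass_set_intersection(r, s, block_size=2):
    """Return the tuples of R that also occur in S (S is the smaller relation)."""
    # Phase 1: scan S block by block and build the search structure H.
    h = set()
    for b in read_blocks(s, block_size):
        for t in b:
            h.add(t)

    # Phase 2: scan R block by block and output tuples found in H.
    out = []
    for b in read_blocks(r, block_size):
        for t in b:
            if t in h:
                out.append(t)  # t is in both R and S
    return out


print(one_pass_set_intersection(["a", "b", "c", "e"], ["b", "c", "d"]))  # ['b', 'c']
```

Only H and one block are held at any time in this sketch, which mirrors the single-buffer scan in the pseudocode. A real system would read blocks from disk rather than slice a list.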