Slideshow:
δ(R) removes duplicate tuples from R:
Example: R = {a, b, a, c, a, b}  ⇒  δ(R) = {a, b, c}
Important observation:
initialize a search structure H on all attributes of R;

while ( R has more data blocks )
{
   read 1 data block into buffer b;

   for ( each tuple t ∈ b )
   {
      /* =====================================================
         We need a search structure H to implement
         the test  t ∈ H  efficiently.
         We can use a hash table or a binary search tree.
         ===================================================== */
      if ( t ∈ H )
      {
         discard t;           // duplicate
      }
      else
      {
         insert t in H;       // helps find later duplicates
         move t to output;    // this is the first occurrence
      }
   }
}
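The one-pass algorithm above can be sketched in Python as follows. This is only an illustrative sketch, not the slides' own code: a Python set plays the role of the hash-based search structure H, and read_blocks is a hypothetical stand-in for reading R from disk one data block at a time (here it simply slices an in-memory list).

```python
from typing import Hashable, Iterable, Iterator, List


def read_blocks(tuples: List[Hashable], block_size: int) -> Iterator[List[Hashable]]:
    """Hypothetical block reader: yields R in chunks of block_size tuples."""
    for start in range(0, len(tuples), block_size):
        yield tuples[start:start + block_size]


def duplicate_elimination(blocks: Iterable[List[Hashable]]) -> Iterator[Hashable]:
    """One-pass delta(R): output each distinct tuple at its first occurrence."""
    seen = set()  # plays the role of the search structure H
    for block in blocks:          # while R has more data blocks
        for t in block:           # for each tuple t in buffer b
            if t in seen:
                continue          # duplicate: discard t
            seen.add(t)           # remember t so later copies are detected
            yield t               # first occurrence: move t to output


R = ["a", "b", "a", "c", "a", "b"]
result = list(duplicate_elimination(read_blocks(R, block_size=2)))
print(result)  # ['a', 'b', 'c']
```

Note that the set ends up holding one copy of every distinct tuple, so this sketch assumes that the distinct tuples of R fit in memory.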
Recall: