   i     n               i = ⌈log2(n)⌉
   ----------------------------------------------------------
   2     3 or 4          ⌈log2(n)⌉ = 2 for n = 3 or 4
   3     5, 6, 7 or 8    ⌈log2(n)⌉ = 3 for n = 5, 6, 7 or 8
How to compute the parameter i in Linear hashing:
i = ⌈ log2( n ) ⌉ = # bits needed to represent the value n−1
(And [0..n−1] = the range of real logical bucket numbers)
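The rule above can be checked with a few lines of Python. This is only a sketch: the function name `num_index_bits` is made up for illustration, and it relies on Python's `int.bit_length()` to count the bits of n−1 instead of calling a floating-point logarithm.

```python
def num_index_bits(n: int) -> int:
    """Return i = ceil(log2(n)) = number of bits needed to write n-1 in binary."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (n-1).bit_length() counts the binary digits of n-1 (0 has 0 digits).
    return (n - 1).bit_length()


# Print i for the first few bucket counts.
for n in range(1, 10):
    print("n =", n, " i =", num_index_bits(n))
```

Working on integers avoids rounding problems that ⌈log2(n)⌉ can have in floating point for large n.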
(The use of overflow blocks is necessary in the Linear Hashing method because adding a new hash bucket may not offload an overflowed hash bucket)
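To see why a new bucket may not help the bucket that overflowed: buckets are always added in the order 1, 2, 3, ..., and the new bucket n only receives keys from the one bucket whose number is n with its leading 1 bit cleared. The sketch below computes that bucket; the name `split_source` is hypothetical and not part of the notes.

```python
def split_source(n: int) -> int:
    """Number of the bucket whose keys are re-hashed when bucket n (n >= 1) becomes real."""
    if n < 1:
        raise ValueError("bucket 0 always exists; n must be at least 1")
    j = n.bit_length()            # number of binary digits needed to write n
    return n - (1 << (j - 1))     # clear the leading 1 bit of n


# With 5 buckets in use (0..4), the next bucket added is bucket 5 = 101,
# which only takes keys from bucket 1 = 001.
print(split_source(5))
```

So if bucket 0 or bucket 3 is the one that is full when bucket 5 is added, it keeps all of its keys and still needs an overflow block.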
Hash index:
Example:
Parameter:

   n = current number of buckets in use
   i = ⌈ log2(n) ⌉

Lookup( x )
{
   k = hash(x);      // k = hash function value
   m = k % 2^i;      // Hash index (last i bits of k)

   /* =============================================
      Check if m is a real or virtual hash bucket
      ============================================= */
   if ( m ≤ n−1 )
   {
      /* =========================================
         m is a "real" bucket
         ========================================= */
      Read Bucket[ m ] from disk;
      Search (x, recordPtr(x)) in Bucket[m] (including the overflow blocks);
   }
   else
   {
      /* =========================================
         m is a "virtual" bucket
         Use Suffix-1(m) to map !!
         ========================================= */
      m' = Suffix-1(m);   // I.e.: m  = 1xxxxxxxxxx
                          //       m' = 0xxxxxxxxxx
                          // Note: m' is a "real" bucket !!!

      Read Bucket[ m' ] from disk;
      Search (x, recordPtr(x)) in Bucket[m'] (including the overflow blocks);
   }
}
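The lookup procedure can be tried out in memory with a short Python sketch. Several things here are assumptions made for illustration, not part of the notes: the names `bucket_for` and `lookup`, the use of a Python list of lists in place of disk blocks (each inner list stands for a bucket together with its overflow blocks), and the `hash_fn` parameter.

```python
def bucket_for(k: int, n: int) -> int:
    """Map hash value k to a real bucket number in [0..n-1]."""
    i = (n - 1).bit_length()      # i = ceil(log2(n))
    m = k % (1 << i)              # last i bits of k
    if m <= n - 1:
        return m                  # m is a "real" bucket
    # m is a "virtual" bucket: its leading bit is 1, so clear it (Suffix-1).
    return m - (1 << (i - 1))


def lookup(buckets, x, hash_fn=hash):
    """Return the record pointer stored for key x, or None if x is absent."""
    m = bucket_for(hash_fn(x), len(buckets))
    for key, record_ptr in buckets[m]:
        if key == x:
            return record_ptr
    return None


# 5 buckets in use, so i = 3. Hash value 6 = 110 is virtual and maps to 010 = 2.
print(bucket_for(0b110, 5))
```

The virtual case always lands on a real bucket because n−1 needs i bits, so n−1 ≥ 2^(i−1), while m with its leading bit cleared is below 2^(i−1).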
if ( Avg occupancy of buckets > τ )
{
n++; // Increase # physical buckets
}
Example:
if ( r / (n × γ) > τ )
{
   n++;
}
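As a small numeric check of this criterion, the sketch below evaluates the same test. It assumes that r is the number of search keys currently stored and γ is the number of keys that fit in one bucket block, so that r/(n × γ) is the average occupancy; these meanings and the function name `needs_new_bucket` are assumptions for illustration.

```python
def needs_new_bucket(r: int, n: int, gamma: int, tau: float) -> bool:
    """True when the average occupancy r / (n * gamma) exceeds the threshold tau."""
    return r / (n * gamma) > tau


# 2 buckets that hold 4 keys each, threshold 0.85:
print(needs_new_bucket(6, 2, 4, 0.85))   # occupancy 6/8 = 0.75
print(needs_new_bucket(7, 2, 4, 0.85))   # occupancy 7/8 = 0.875
```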
Parameter:

   n = current number of buckets in use

Insert( x , recordPtr(x) )
{
   i = ⌈ log2(n) ⌉;   // Using last i bits in hash value
   k = h(x);          // h(x) = RandomNumGen(x)
   m = k % 2^i;       // Last i bits = Linear hash function value

   /* ---------------------------------------------------
      Insert search key (x, recordPtr(x)) in "bucket m"
      --------------------------------------------------- */
   if ( m ≤ n−1 )
   {
      /* =========================================
         m is a "real" bucket
         ========================================= */
      Read disk block Bucket[ m ];
      Insert (x, recordPtr(x)) into Bucket[m]
           (If overflow, use an overflow block)
      Write disk block Bucket[ m ];
   }
   else
   {
      /* =========================================
         m is a "virtual" bucket
         ========================================= */
      m' = Suffix-1(m);   // I.e.: m  = 1xxxxxxxxxx
                          //       m' = 0xxxxxxxxxx
                          // Note: m' is for sure a "real" bucket !!!

      Read disk block Bucket[ m' ];
      Insert (x, recordPtr(x)) into Bucket[m']
           (If overflow, use an overflow block)
      Write disk block Bucket[ m' ];
   }

   /* =============================================
      Check if we need to add a new bucket
      ============================================= */
   if ( r/(n*γ) > τ )     // Average occupancy > threshold
   {
      /* ===========================================
         Create a new physical hash bucket
         =========================================== */
      Allocate a new disk block (for hash bucket);
      Bucket[n] = new disk block;   // Bucket[n] is now real

      n' = Suffix-1(n);   // Bucket[n'] was used to store
                          // search keys belonging to Bucket[n]

      /* ================================================
         Re-hash all keys in Bucket[n'] into:
              Bucket[n'] and Bucket[n]
         ================================================ */
      j = ⌈ log2(n+1) ⌉;  // The range of hash bucket index is now [0..n]
                          // j = number of binary digits to express n

      for ( every search key y ∈ Bucket[ n' ] ) do
      {
         if ( (last j bits of h(y)) == n )
         {
            move search key y into the new Bucket[n];
                 (Allocate overflow blocks if necessary)
         }
      }

      n++;    // One more "real" hash bucket
   }
}
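The whole insert procedure, including the bucket split, can be exercised with an in-memory Python sketch. It is a simplified model under stated assumptions, not the actual disk-based implementation: the class name `LinearHashIndex` and its methods are hypothetical, each bucket is an unbounded Python list that stands for a bucket block plus its overflow blocks, γ (`gamma`) is taken to be the number of keys per block, and r is taken to be the number of stored keys.

```python
class LinearHashIndex:
    """In-memory model of a linear hash index (illustrative sketch only)."""

    def __init__(self, gamma=2, tau=0.85, hash_fn=hash):
        self.gamma = gamma        # assumed: keys per bucket block
        self.tau = tau            # occupancy threshold
        self.hash_fn = hash_fn
        self.buckets = [[]]       # start with one real bucket, so n = 1
        self.r = 0                # assumed: number of keys stored

    def _bucket_for(self, k):
        n = len(self.buckets)
        i = (n - 1).bit_length()              # i = ceil(log2(n))
        m = k % (1 << i)                      # last i bits of k
        return m if m <= n - 1 else m - (1 << (i - 1))   # Suffix-1 if virtual

    def insert(self, x, record_ptr):
        m = self._bucket_for(self.hash_fn(x))
        self.buckets[m].append((x, record_ptr))
        self.r += 1
        n = len(self.buckets)
        if self.r / (n * self.gamma) > self.tau:   # average occupancy > threshold
            self._add_bucket()

    def _add_bucket(self):
        n = len(self.buckets)                 # number of the new bucket
        j = n.bit_length()                    # j = ceil(log2(n+1))
        src = n - (1 << (j - 1))              # n' = Suffix-1(n)
        stay, move = [], []
        for key, ptr in self.buckets[src]:
            # Keys whose last j hash bits equal n belong to the new bucket.
            if self.hash_fn(key) % (1 << j) == n:
                move.append((key, ptr))
            else:
                stay.append((key, ptr))
        self.buckets[src] = stay
        self.buckets.append(move)             # n++ : bucket n is now real

    def lookup(self, x):
        for key, ptr in self.buckets[self._bucket_for(self.hash_fn(x))]:
            if key == x:
                return ptr
        return None


# Usage: integer keys with the identity function as hash, to keep bit patterns visible.
index = LinearHashIndex(gamma=2, tau=0.85, hash_fn=lambda v: v)
for key in range(10):
    index.insert(key, "rec%d" % key)
for number, bucket in enumerate(index.buckets):
    print(format(number, "b"), [key for key, _ in bucket])
```

As in the pseudocode, at most one bucket is added per insert, so a bucket can temporarily hold more than γ keys; on disk those extra keys would sit in overflow blocks.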