Prelude to increasing/decreasing the hash table size

Review:

  • Hash function H( ):   maps a key k to an integer in the range [0..(M-1)]

        H(k) = integer in the range [0..(M-1)]
    

  • Hash value h = the value returned by the hash function H( )

        h = H(k)
    

  • Bucket = the array element used to store an entry of the dictionary

  • Commonly used hash function:

        h = H2( H1( k ) )
    
        H1(k) = k.hashCode()
        H2(x) = Math.abs( a*x + b ) % p % M    M = array size 
    

  • Always store the entry (k, v) at index h in the array

  • Example: how to store a map (dictionary) using hashing
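A minimal sketch of the two-step hash function and of storing an entry at index h. The constants A, B and P are hypothetical values chosen only for illustration (the slides do not give them), the class and method names are made up, and collisions are ignored:

```java
public class HashIndexSketch {
    // Hypothetical constants, for illustration only
    static final long A = 31, B = 7;
    static final long P = 2147483647L;   // 2^31 - 1, a prime larger than any array size M

    // H1: map the key to an integer
    static int h1(Object k) { return k.hashCode(); }

    // H2: compress the integer into the range [0..(M-1)]
    // (long arithmetic avoids int overflow in a*x + b)
    static int h2(int x, int M) {
        return (int) (Math.abs(A * x + B) % P % M);
    }

    // h = H2( H1( k ) )
    static int hash(Object k, int M) { return h2(h1(k), M); }

    public static void main(String[] args) {
        int M = 5;                        // array size
        String[] keys = new String[M];
        int[] values = new int[M];

        String k = "apple";
        int h = hash(k, M);               // always store the entry (k, v) at index h
        keys[h] = k;
        values[h] = 42;
        System.out.println("(" + k + ", 42) stored at index " + h);
    }
}
```

Note that M is an argument of h2: the index computed for a key depends on the array size.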

Consequence of increasing/decreasing the hash table size

  • Due to the dependency of the hash function on the array size M:

        h = H2( H1( k ) )
    
        H1(k) = k.hashCode()
        H2(x) = Math.abs( a*x + b ) % p % M    M = array size 
    

    we have the following unfortunate consequence:

    • Changing the array size will also change the hash function

  • What this means:

    • Entries stored using the old hash function generally cannot be found using the new hash function

  • In other words:

    • When we increase/decrease the hash table size, we must rehash all the entries using the new hash function
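To get a feel for how many entries are affected, here is a small sketch that assumes a simplified compression function H2(x) = x % M (not the full formula above) and counts how many hash codes land on a different index after a resize. The class and method names are hypothetical:

```java
public class IndexChangeSketch {
    // Simplified compression function (an assumption): H2(x) = x % M, for x >= 0
    static int h2(int x, int M) { return x % M; }

    // Count how many x in [0..(n-1)] get a different index when M changes
    static int countMoved(int n, int oldM, int newM) {
        int moved = 0;
        for (int x = 0; x < n; x++) {
            if (h2(x, oldM) != h2(x, newM))
                moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        // Prints "10 of 20 hash codes change index"
        System.out.println(countMoved(20, 5, 10) + " of 20 hash codes change index");
    }
}
```

Some keys happen to keep their index, but we cannot tell which ones without recomputing the hash value, so every entry must be rehashed.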

Example of error when changing the hash table size

  • Suppose we used a hash table size = 5 originally:

                +---+---+---+---+---+
     entry[] =  |   |   |   |   |   |
                +---+---+---+---+---+
    

    For simplicity's sake, we will use the following hash function:

        h = H2( H1( k ) )
    
        H1(k) = k.hashCode()
        H2(x) = x % M             M = 5 
    

  • Suppose the key A has H1(A) = 17

     A ---> hashValue = 17 % 5 = 2
    
                  0   1   2   3   4
                +---+---+---+---+---+
     entry[] =  |   |   | A |   |   |
                +---+---+---+---+---+
    

Example of error when changing the hash table size

  • Suppose we increase the table size to 10:

                +---+---+---+---+---+---+---+---+---+---+
     entry[] =  |   |   |   |   |   |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+
    

    Notice the hash function will also change:

        h = H2( H1( k ) )
    
        H1(k) = k.hashCode()
        H2(x) = x % M             M = 10 
    

  • Result: if A stays at its old index 2, we cannot find the key A in the new hash table, because the lookup now goes to index 7:

     A ---> hashValue = 17 % 10 = 7
    
                  0   1   2   3   4   5   6   7   8   9
                +---+---+---+---+---+---+---+---+---+---+
     entry[] =  |   |   | A |   |   |   |   | ? |   |   |
                +---+---+---+---+---+---+---+---+---+---+
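The example can be reproduced with a short sketch. It uses the simplified hash function h = H1(k) % M from the example and the value H1(A) = 17; the class and method names are hypothetical, and the lookup probes only the home index (no collision handling):

```java
import java.util.Arrays;

public class LostKeySketch {
    // Simplified hash function from the example: index = hashCode % M
    static int index(int hashCode, int M) { return hashCode % M; }

    // Look up a key by checking only its home index
    static boolean contains(String[] entry, String key, int hashCode) {
        return key.equals(entry[index(hashCode, entry.length)]);
    }

    public static void main(String[] args) {
        int hashOfA = 17;                        // H1(A) = 17, as in the example

        String[] entry = new String[5];
        entry[index(hashOfA, 5)] = "A";          // 17 % 5 = 2
        System.out.println(contains(entry, "A", hashOfA));     // true

        // Grow the array WITHOUT rehashing: A stays at index 2
        String[] bigger = Arrays.copyOf(entry, 10);
        System.out.println(contains(bigger, "A", hashOfA));    // false: index 17 % 10 = 7 is empty

        // Rehash: store A at its new home index
        String[] rehashed = new String[10];
        rehashed[index(hashOfA, 10)] = "A";
        System.out.println(contains(rehashed, "A", hashOfA));  // true
    }
}
```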
    

Naive way to increase/decrease the hash table size

  • Because the hash function changes with the hash table size:

    • We must rehash all the keys and insert them into the new hash table

  • A naive algorithm to double the hash table:

        public void doubleHashTable()
        {
            Entry[] oldBucket = bucket;
    
            // Double the size of the bucket array
            bucket = new Entry[2*oldBucket.length];
            capacity = 2*oldBucket.length;
    
            // Rehash all entries by inserting them into the new hash table
            // (put() now uses the new capacity, so it uses the new hash function)
            for ( int i = 0; i < oldBucket.length; i++ )
            {
                // Skip empty buckets and buckets marked AVAILABLE
                if ( oldBucket[i] != null && oldBucket[i] != AVAILABLE )
                    this.put( oldBucket[i].key, oldBucket[i].value );
            }
        }
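For readers who want something runnable, here is a self-contained sketch of the same idea. It is not the course's HashTableLinProbe.java: the class LinProbeSketch, its fields, and the "at most half full" growth rule are assumptions made for illustration. It uses String keys and int values, the simplified hash function hashCode mod M, and has no deletion (so no AVAILABLE marker):

```java
public class LinProbeSketch {
    private String[] keys;     // keys[i] == null means bucket i is empty
    private int[] values;
    private int size = 0;      // number of stored entries

    public LinProbeSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        keys = new String[capacity];
        values = new int[capacity];
    }

    // Simplified hash function: depends on the current array length M
    private int hash(String k) {
        return Math.floorMod(k.hashCode(), keys.length);
    }

    public void put(String k, int v) {
        // Assumed growth rule: keep the table at most half full
        if (2 * (size + 1) > keys.length) doubleHashTable();
        int i = hash(k);
        while (keys[i] != null && !keys[i].equals(k))
            i = (i + 1) % keys.length;           // linear probing
        if (keys[i] == null) size++;             // new key: count it
        keys[i] = k;
        values[i] = v;
    }

    // Returns the value stored for k, or null if k is not in the table
    public Integer get(String k) {
        int i = hash(k);
        while (keys[i] != null) {
            if (keys[i].equals(k)) return values[i];
            i = (i + 1) % keys.length;
        }
        return null;
    }

    private void doubleHashTable() {
        String[] oldKeys = keys;
        int[] oldValues = values;
        keys = new String[2 * oldKeys.length];
        values = new int[2 * oldKeys.length];
        size = 0;                                // put() counts the entries again
        // Rehash every entry: put() now hashes with the new array length
        for (int i = 0; i < oldKeys.length; i++)
            if (oldKeys[i] != null)
                put(oldKeys[i], oldValues[i]);
    }

    public int capacity() { return keys.length; }
    public int size() { return size; }

    public static void main(String[] args) {
        LinProbeSketch t = new LinProbeSketch(4);
        for (int i = 0; i < 10; i++) t.put("k" + i, i);
        System.out.println("size = " + t.size() + ", capacity = " + t.capacity());
        System.out.println("k7 -> " + t.get("k7"));
    }
}
```

One design point worth noting: because this put() increments the entry counter, the sketch resets the counter to 0 before re-inserting, otherwise every entry would be counted twice.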
    

DEMO: 15-hashing/30-dyn-hashing/Demo.java + HashTableLinProbe.java

Dynamic hashing techniques and conclusion

  • There are sophisticated dynamic hashing techniques:

    1. Extendible hashing

    2. Linear (extendible) hashing

  • These are very advanced topics taught in CS554 Advanced Database Systems

    They are outside the scope of CS171


     

    • This is the end of the CS171 course material

    • Hope you have learned a lot in the course.

    • Please do the course evaluation