Re-writing the remove( ) algorithm of a hash table using AVAILABLE entries

  • We revisit the remove(k) algorithm:

        public V remove(K k)  // Return the value associated with key k
        {
            int hashIdx = hashValue(k);
            int i = hashIdx;
    
            do
            {
                if ( bucket[i] == null ) // Is bucket empty ?
                {
                    return null;       // Key k is not in hash table
                }
                else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
                {
                    V retVal = bucket[i].value;
                    bucket[i] = null;         // Delete the entry 
                    return retVal;            // Return value
                }
                i = (i + 1)%capacity;  // Check in next hash table bucket
    
            } while ( i != hashIdx ); // All entries searched !
    
            return null;  // Not found
        }
    

    We will modify it to use the special AVAILABLE entry

Re-writing the remove( ) algorithm of a hash table using AVAILABLE entries

  • If we find the entry containing key k, we replace it with AVAILABLE:

        public V remove(K k)  // Return the value associated with key k
        {
            int hashIdx = hashValue(k);
            int i = hashIdx;
    
            do
            {
                if ( bucket[i] == null ) // Is bucket empty ?
                {
                    return null;       // Key k is not in hash table
                }
                else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
                {
                    V retVal = bucket[i].value;
                    bucket[i] = AVAILABLE;    // Delete the entry 
                    return retVal;            // Return value
                }
                i = (i + 1)%capacity;  // Check in next hash table bucket
    
            } while ( i != hashIdx ); // All entries searched !
    
            return null;  // Not found
        }
    

     

Re-writing the remove( ) algorithm of a hash table using AVAILABLE entries

  • Then, we must also update the search algorithm when bucket[i] == AVAILABLE:

        public V remove(K k)  // Return the value associated with key k
        {
            int hashIdx = hashValue(k);
            int i = hashIdx;
    
            do
            {
                if ( bucket[i] == null ) // Is bucket empty ?
                {
                    return null;       // Key k is not in hash table
                }
                else if ( bucket[i] == AVAILABLE )
                {
                    // What should we do when bucket[i] == AVAILABLE ???
                    // DO NOT TEST bucket[i].key !!! But we need to continue...
                }
                else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
                {
                    V retVal = bucket[i].value;
                    bucket[i] = AVAILABLE;    // Delete the entry 
                    return retVal;            // Return value
                }
                i = (i + 1)%capacity;  // Check in next hash table bucket
    
            } while ( i != hashIdx ); // All entries searched !
    
            return null;  // Not found
        }
    

Re-writing the remove( ) algorithm of a hash table using AVAILABLE entries

  • The search algorithm must skip a bucket when bucket[i] == AVAILABLE and continue:

        public V remove(K k)  // Return the value associated with key k
        {
            int hashIdx = hashValue(k);
            int i = hashIdx;
    
            do
            {
                if ( bucket[i] == null ) // Is bucket empty ?
                {
                    return null;       // Key k is not in hash table
                }
                else if ( bucket[i] == AVAILABLE )
                {
                    // Skip this bucket: do not test bucket[i].key, just continue
                }
                else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
                {
                    V retVal = bucket[i].value;
                    bucket[i] = AVAILABLE;    // Delete the entry 
                    return retVal;            // Return value
                }
                i = (i + 1)%capacity;  // Check in next hash table bucket
    
            } while ( i != hashIdx ); // All entries searched !
    
            return null;  // Not found
        }
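
The effect of the AVAILABLE marker can be checked with a small stand-alone sketch. It assumes, for illustration only, that keys R, V and S all hash to index 4 in a 12-bucket table. The class name TombstoneDemo, the contains() helper and the String-based sentinel are hypothetical and are not part of the course code.

```java
// Sketch: why remove() marks a bucket AVAILABLE instead of setting it to null.
// TombstoneDemo and contains() are hypothetical names used only for this example.
class TombstoneDemo
{
    // A unique object, so "== AVAILABLE" can never match a real key
    static final String AVAILABLE = new String("<available>");

    // Linear-probing lookup over a table of String keys.
    // hashIdx is passed in so the example controls the collisions directly.
    static boolean contains(String[] bucket, String k, int hashIdx)
    {
        int i = hashIdx;
        do
        {
            if ( bucket[i] == null )
                return false;             // Empty bucket ends the search
            else if ( bucket[i] == AVAILABLE )
            {
                // Deleted bucket: skip it and keep probing
            }
            else if ( bucket[i].equals(k) )
                return true;              // Found key k
            i = (i + 1) % bucket.length;  // Check the next bucket
        } while ( i != hashIdx );
        return false;
    }

    public static void main(String[] args)
    {
        // Assumption: R, V and S all hash to index 4 and occupy buckets 4, 5, 6
        String[] withNull = new String[12];
        withNull[4] = "R"; withNull[5] = "V"; withNull[6] = "S";
        String[] withMark = withNull.clone();

        withNull[5] = null;       // Delete V by emptying its bucket
        withMark[5] = AVAILABLE;  // Delete V by marking its bucket

        System.out.println(contains(withNull, "S", 4)); // false: search stops at bucket 5
        System.out.println(contains(withMark, "S", 4)); // true: search skips bucket 5
    }
}
```

With null, key S becomes unreachable although it is still stored in bucket 6. With the marker, the probe sequence stays intact.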
    

Revising the get algorithm in Open Addressing using AVAILABLE entries

  • The get(k) algorithm of Linear Probing before using the special AVAILABLE entry:

       public V get(K k)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;

           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   return null;   // NOT found
               }
               else if ( bucket[i].key.equals(k) )  // FOUND
               {
                   return bucket[i].value;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           return null;  // NOT found
       }
    

Revising the get algorithm in Open Addressing using AVAILABLE entries

  • get( ) must also skip a bucket when bucket[i] == AVAILABLE and continue the search:

       public V get(K k)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;

           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   return null;   // NOT found
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   // Skip this bucket: do not test bucket[i].key, just continue
               }
               else if ( bucket[i].key.equals(k) )  // FOUND
               {
                   return bucket[i].value;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           return null;  // NOT found
       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • The put(k,v) algorithm of Linear Probing before using the special AVAILABLE entry:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;

           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   bucket[i] = new Entry<>(k,v);
                   return;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           System.out.println("Full");
       }
    

  • The put( ) algorithm is a bit more complicated to modify....

Caveat:   put(k, v) can mean (1) insert or (2) update !  

  • Consider the operation put(P, ..) in the following hash table content:

                         Hash value
        put(P)		  4 
                                      P is not in the hash table 
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | V | S |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
    

  • The operation put(P, ..) will insert a new entry into the hash table:

                         Hash value
        put(P)		  4 
    
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | V | S | P |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
                                              ^
    					  |
    				    insert (P, ..)
    

Caveat:   put(k, v) can mean (1) insert or (2) update !

  • On the other hand, when P is found in the hash table:

                         Hash value
        put(P)		  4 
                                      P is in the hash table 
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | V | S | P |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
    

  • The operation put(P, ..) will update the corresponding value of the entry P:

                         Hash value
        put(P)		  4 
    
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | V | S | P |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
                                              ^
    					  |
    				    update value in bucket
    

Another caveat:   the AVAILABLE entry

  • Consider the operation put(P, ..) in the following hash table content:

                         Hash value
        put(P)		  4 
                                      P is not in the hash table 
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | • | S |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
    
                            • = AVAILABLE
    

  • Although it is possible to insert P in an empty bucket:

                         Hash value
        put(P)		  4 
    
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | • | S | P |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
                                              ^
    					  |
    	                     Insert (P, ..) in empty bucket
    

Another caveat:   the AVAILABLE entry

  • Consider the operation put(P, ..) in the following hash table content:

                         Hash value
        put(P)		  4 
                                      P is not in the hash table 
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | • | S |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
    
                            • = AVAILABLE
    

  • It is preferable to insert P in the AVAILABLE bucket: (more efficient, later searches for P need fewer probes)

                         Hash value
        put(P)		  4 
    
                  0   1   2   3   4   5   6   7   8   9  10  11
                +---+---+---+---+---+---+---+---+---+---+---+---+
     bucket[] = |   |   |   |   | R | P | S |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+
                                      ^
    			          |
    	        Insert (P, ..) in AVAILABLE bucket
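
The efficiency gain can be illustrated by counting probes. The sketch below assumes that P hashes to index 4, as in the diagrams above. ProbeCountDemo and probes() are hypothetical names, and the string "<AVAILABLE>" merely stands in for the AVAILABLE marker.

```java
// Sketch: compare the number of buckets examined when P is stored in the
// first empty bucket (7) versus the AVAILABLE bucket (5).
// ProbeCountDemo and probes() are hypothetical names used only for this example.
class ProbeCountDemo
{
    // Count the buckets examined to find key k by linear probing from hashIdx.
    // Returns -1 if k is not found. A marker string never equals a real key,
    // so a marked bucket is simply probed past.
    static int probes(String[] bucket, String k, int hashIdx)
    {
        int i = hashIdx;
        int count = 0;
        do
        {
            count++;
            if ( bucket[i] == null )     return -1;     // Empty bucket: k is absent
            if ( bucket[i].equals(k) )   return count;  // Found k
            i = (i + 1) % bucket.length;
        } while ( i != hashIdx );
        return -1;
    }

    public static void main(String[] args)
    {
        String[] inEmpty = new String[12];   // P inserted in empty bucket 7
        inEmpty[4] = "R"; inEmpty[5] = "<AVAILABLE>"; inEmpty[6] = "S"; inEmpty[7] = "P";

        String[] inAvail = new String[12];   // P inserted in AVAILABLE bucket 5
        inAvail[4] = "R"; inAvail[5] = "P"; inAvail[6] = "S";

        System.out.println(probes(inEmpty, "P", 4)); // 4 (buckets 4, 5, 6, 7)
        System.out.println(probes(inAvail, "P", 4)); // 2 (buckets 4, 5)
    }
}
```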
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • The original put(k,v) algorithm:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;

           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   bucket[i] = new Entry<>(k,v);
                   return;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           System.out.println("Full");
       }
    

  • We will revise put( ) so it makes use of AVAILABLE entries

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • First, we add blank lines at the places where new code will be inserted:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;


           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   bucket[i] = new Entry<>(k,v);



                   return;
               }




               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched


           System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • put( ) must remember the first AVAILABLE entry; we use the firstAvail variable:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   bucket[i] = new Entry<>(k,v);



                   return;
               }




               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched


           System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • When the search for the key k finds an AVAILABLE entry, we remember its index:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do     // Search for key k in the hash table
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   bucket[i] = new Entry<>(k,v);



                   return;
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   if ( firstAvail == -1 )  firstAvail = i;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched


           System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • When the search for the key k ends with null and no AVAILABLE bucket was found:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do     // Search for key k in the hash table
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   if ( firstAvail == -1 )  // No AVAILABLE bucket found
                       bucket[i] = new Entry<>(k,v);
                        // Insert (k,v) in this empty bucket

                   return;
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   if ( firstAvail == -1 )  firstAvail = i;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched


           System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • If we found an AVAILABLE bucket during the search for the key k:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do     // Search for key k in the hash table
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   if ( firstAvail == -1 )  // No AVAILABLE bucket found
                       bucket[i] = new Entry<>(k,v);
                   else // An AVAILABLE bucket found
                       bucket[firstAvail] = new Entry<>(k,v);
                   return;
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   if ( firstAvail == -1 )  firstAvail = i;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched


           System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • When the search ends with i == hashIdx and no AVAILABLE bucket was found, the table is full:

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do     // Search for key k in the hash table
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   if ( firstAvail == -1 )  // No AVAILABLE bucket found
                       bucket[i] = new Entry<>(k,v);
                   else // An AVAILABLE bucket found
                       bucket[firstAvail] = new Entry<>(k,v);
                   return;
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   if ( firstAvail == -1 )  firstAvail = i;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           if ( firstAvail == -1 )
               System.out.println("Full");


       }
    

Revising the put algorithm in Open Addressing using AVAILABLE entries

  • If we found an AVAILABLE bucket during the search for the key k, we can insert (k,v):

       public void put(K k, V v)
       {
           int hashIdx = H(k);  // Find the hash index for key k
           int i = hashIdx;
           int firstAvail = -1; // -1 means: no AVAILABLE entry found (yet)

           do     // Search for key k in the hash table
           {
               if ( bucket[i] == null ) // Is bucket empty ?
               {
                   if ( firstAvail == -1 )  // No AVAILABLE bucket found
                       bucket[i] = new Entry<>(k,v);
                   else // An AVAILABLE bucket found
                       bucket[firstAvail] = new Entry<>(k,v);
                   return;
               }
               else if ( bucket[i] == AVAILABLE )
               {
                   if ( firstAvail == -1 )  firstAvail = i;
               }
               else if ( bucket[i].key.equals(k) ) // Does bucket contain key k ?
               {
                   bucket[i].value = v;
                   return;
               }
               i = (i + 1)%M;  // Check in next hash table bucket

           } while ( i != hashIdx ); // Loop until all buckets are searched

           if ( firstAvail == -1 )
               System.out.println("Full");  // No empty and no AVAILABLE bucket
           else
               bucket[firstAvail] = new Entry<>(k,v);  // Re-use the AVAILABLE bucket
       }
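
Putting the three revised operations together, a compact stand-alone version might look like the sketch below. The class name LinProbeSketch, the private find() helper, the boolean return value of put() and the use of hashCode() as hash function are assumptions made for this example; the course's HashTableLinProbe class may be organized differently.

```java
// Sketch of a linear-probing hash table that uses a shared AVAILABLE marker.
// LinProbeSketch, find() and the boolean result of put() are hypothetical.
class LinProbeSketch<K, V>
{
    private static class Entry<K, V>
    {
        K key; V value;
        Entry(K k, V v) { key = k; value = v; }
    }

    private final Entry<K, V>[] bucket;
    private final Entry<K, V> AVAILABLE = new Entry<>(null, null); // Marks a deleted bucket

    @SuppressWarnings("unchecked")
    LinProbeSketch(int capacity)
    {
        bucket = (Entry<K, V>[]) new Entry[capacity];
    }

    private int hashValue(K k)
    {
        return Math.floorMod(k.hashCode(), bucket.length); // Assumed hash function
    }

    // Returns the index of the bucket holding key k, or -1 if k is absent
    private int find(K k)
    {
        int hashIdx = hashValue(k);
        int i = hashIdx;
        do
        {
            if ( bucket[i] == null ) return -1;   // Empty bucket ends the search
            if ( bucket[i] != AVAILABLE && bucket[i].key.equals(k) ) return i;
            i = (i + 1) % bucket.length;          // AVAILABLE or other key: keep probing
        } while ( i != hashIdx );
        return -1;
    }

    public V get(K k)
    {
        int i = find(k);
        return (i == -1) ? null : bucket[i].value;
    }

    public V remove(K k)
    {
        int i = find(k);
        if ( i == -1 ) return null;
        V retVal = bucket[i].value;
        bucket[i] = AVAILABLE;   // Leave a marker so later searches probe past it
        return retVal;
    }

    // Returns false only when the table is full and k is not already present
    public boolean put(K k, V v)
    {
        int hashIdx = hashValue(k);
        int i = hashIdx;
        int firstAvail = -1;     // -1 means: no AVAILABLE bucket found (yet)
        do
        {
            if ( bucket[i] == null )              // k is not in the table: insert
            {
                bucket[(firstAvail == -1) ? i : firstAvail] = new Entry<>(k, v);
                return true;
            }
            else if ( bucket[i] == AVAILABLE )
            {
                if ( firstAvail == -1 ) firstAvail = i;
            }
            else if ( bucket[i].key.equals(k) )   // k is in the table: update
            {
                bucket[i].value = v;
                return true;
            }
            i = (i + 1) % bucket.length;
        } while ( i != hashIdx );

        if ( firstAvail == -1 ) return false;     // Full
        bucket[firstAvail] = new Entry<>(k, v);   // Re-use the AVAILABLE bucket
        return true;
    }

    public static void main(String[] args)
    {
        LinProbeSketch<String, String> h = new LinProbeSketch<>(5);
        h.put("ice", "cold");
        h.put("fire", "hot");
        h.remove("ice");
        h.put("sun", "bright");
        System.out.println(h.get("fire")); // hot
        System.out.println(h.get("ice"));  // null
    }
}
```

Note that put( ) keeps searching past an AVAILABLE bucket before inserting. Otherwise a key that is stored further along the probe sequence would end up in the table twice.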
    

Demo program - shows re-using AVAILABLE buckets

    public static void main(String[] args)
    {
       Dictionary H = new HashTableLinProbe<>(5);


       H.put("ice", "cold");
       H.put("fire", "hot");
       H.put("rock", "hard");
       H.put("wool", "soft");
       H.put("sun", "hot");

       System.out.println("\n*** Test remove(): ***");
       System.out.println("-- rock:" + H.remove("rock"));
       System.out.println("-- ice:" + H.remove("ice"));
       System.out.println("-- wool:" + H.remove("wool"));

       System.out.println("\n\n*** Test get(): ***"); // Search skips AVAILABLE  
       System.out.println("get(sun): " + H.get("sun"));
       System.out.println("get(fire): " + H.get("fire"));
       System.out.println("get(abc): " + H.get("abc"));

       System.out.println("\nTest put():");  // Re-use AVAILABLE
       H.put("ice", "** cold **");
       H.put("sun", "** bright **");
       System.out.println("get(sun): " + H.get("sun"));
    }

DEMO: 15-hashing/21-open-address+delete/Demo2.java

Postscript: clustering in Linear Probing

  • Suppose the hash table currently stores the entries as follows:

                  0   1   2   3   4   5   6   7   8   9  10  11  12  13
                +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     entry[] =  |   | A | B | C | D | E | F | G | H |   |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    

  • Then a key k that hashes to a hash value in the range [1..9] will be stored in bucket 9:

      Key k     Hash value:   1 ≤ H(k) ≤ 9
    
                  0   1   2   3   4   5   6   7   8   9  10  11  12  13
                +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     entry[] =  |   | A | B | C | D | E | F | G | H | k |   |   |   |   |
                +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    

    This phenomenon is called search key clustering

  • To alleviate clustering, other probing methods can be used:

    • Quadratic Probing
    • Double hashing
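
To see how these methods spread keys out, the sketch below computes the j-th bucket probed under each scheme, using the common textbook formulas. The class name ProbeSequences, the method names and the step value 5 (standing in for the result of a second hash function) are assumptions made for this example.

```java
// Sketch: the j-th bucket examined (j = 0, 1, 2, ...) for a key with hash
// value h in a table of M buckets. All names here are hypothetical.
class ProbeSequences
{
    static int linear(int h, int j, int M)     { return (h + j) % M; }

    static int quadratic(int h, int j, int M)  { return (h + j * j) % M; }

    // step is the value of a second hash function for the key; it must be
    // non-zero, otherwise the probe sequence never leaves bucket h
    static int doubleHash(int h, int step, int j, int M) { return (h + j * step) % M; }

    public static void main(String[] args)
    {
        int M = 14;  // Table size used in the clustering diagram
        int h = 1;   // A key that hashes into the start of the cluster

        for ( int j = 0; j < 4; j++ )
        {
            System.out.println(linear(h, j, M) + " "
                             + quadratic(h, j, M) + " "
                             + doubleHash(h, 5, j, M));
        }
        // Linear probing visits    1, 2, 3, 4
        // Quadratic probing visits 1, 2, 5, 10
        // Double hashing visits    1, 6, 11, 2  (with step 5)
    }
}
```

Linear probing walks through the cluster one bucket at a time, while the other two sequences jump out of it after a few probes.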