Terminology
- Hash function H( ): maps a key k to an integer in the range [0..(M-1)]
      H(k) = integer in the range [0..(M-1)]
- Hash value h = the value returned by the hash function H( )
- Bucket = the array element used to store an entry of the dictionary
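The three terms above can be tied together in a small sketch. The polynomial hash and the multiplier 31 are illustrative assumptions, not something the slides prescribe; any function that maps a key into [0..(M-1)] would serve as H( ).

```python
def H(key: str, M: int) -> int:
    """Hypothetical hash function: map a string key to an index in [0..(M-1)]."""
    code = 0
    for ch in key:
        # Polynomial accumulation; 31 is an arbitrary illustrative multiplier.
        code = code * 31 + ord(ch)
    return code % M          # reduce to the range [0..(M-1)]


M = 11
table = [None] * M           # each array element is one bucket
h = H("apple", M)            # h is the hash value of the key "apple"
table[h] = ("apple", 3)      # the dictionary entry is stored in bucket h
```

Here `table[h]` is the bucket for "apple", and `h` is its hash value.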
|
Likelihood (probability) of a collision in hashing with hash table size M
- Question:
    - If there are n entries in a hash table of size M, how likely is it that 2 entries hash into the same bucket?
|
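Answer (assuming each entry is equally likely to hash into any of the M buckets, and n <= M):
$$
\text{Prob[ all } n \text{ entries use different buckets ]}
  = \frac{M \times (M-1) \times \cdots \times (M-n+1)}{M \times M \times \cdots \times M}
  = \frac{M!}{M^n \times (M-n)!}
$$
Therefore:
$$
\text{Prob[ at least 2 entries use the same bucket ]}
  = 1 - \frac{M!}{M^n \times (M-n)!}
$$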
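To get a feel for the numbers, the formula can be evaluated directly. This is a minimal sketch that assumes uniform, independent hashing; the function name `prob_collision` is made up for illustration. It multiplies the n factors (M-i)/M one at a time instead of computing the factorials, which avoids very large intermediate values.

```python
def prob_collision(n: int, M: int) -> float:
    """Probability that at least 2 of n entries share a bucket in a table of size M.

    Assumes every entry hashes uniformly and independently into [0..(M-1)].
    """
    if n > M:
        return 1.0                      # more entries than buckets: a collision is certain
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (M - i) / M       # entry i must avoid the i buckets already used
    return 1.0 - p_distinct


# With M = 365 buckets, 23 entries already make a collision
# slightly more likely than not (the "birthday paradox").
p = prob_collision(23, 365)
```

Even a sparsely filled table is therefore likely to see collisions, which is why a collision-handling technique is needed.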
|
|
Handling collisions in hashing
- There are 2 techniques to handle collisions in hashing:
    - (1) Closed addressing (a.k.a. separate chaining)
        - Entries are always stored in their hash bucket
        - Each bucket of the hash table is organized as a linked list
|
Example:
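A minimal sketch of separate chaining, under some stated assumptions: Python's built-in `hash` stands in for H( ), the class and method names are hypothetical, and the table never resizes. Each bucket holds the head of a singly linked list, so colliding entries simply share a chain.

```python
class Node:
    """One entry in a bucket's linked list."""
    def __init__(self, key, value, next=None):
        self.key, self.value, self.next = key, value, next


class ChainedHashTable:
    def __init__(self, M=8):
        self.M = M
        self.buckets = [None] * M        # each bucket is the head of a linked list

    def _h(self, key):
        return hash(key) % self.M        # built-in hash stands in for H( )

    def put(self, key, value):
        h = self._h(key)
        node = self.buckets[h]
        while node is not None:          # update in place if the key already exists
            if node.key == key:
                node.value = value
                return
            node = node.next
        # Otherwise prepend a new node to this bucket's chain.
        self.buckets[h] = Node(key, value, self.buckets[h])

    def get(self, key):
        node = self.buckets[self._h(key)]
        while node is not None:          # walk only the chain of the hash bucket
            if node.key == key:
                return node.value
            node = node.next
        return None


t = ChainedHashTable(M=8)
for k, v in [(1, "a"), (9, "b"), (17, "c")]:   # 1, 9 and 17 all hash to bucket 1
    t.put(k, v)
```

All three entries end up in the chain of bucket 1, and a lookup only has to search that one chain.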
|
|
    - (2) Open addressing
        - Entries can be stored in a different bucket than their hash bucket
        - A rehash algorithm is used to find an empty bucket
|
Example:
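A minimal sketch of open addressing. The slides do not say which rehash algorithm is used, so this sketch assumes linear probing (try bucket h, then h+1, h+2, ... modulo M). The class and method names are hypothetical, Python's built-in `hash` stands in for H( ), and deletion is left out to keep the sketch short.

```python
class OpenAddressingTable:
    def __init__(self, M=8):
        self.M = M
        self.slots = [None] * M          # each bucket holds at most one (key, value) entry

    def _h(self, key):
        return hash(key) % self.M        # built-in hash stands in for H( )

    def _rehash(self, h, i):
        # Assumed rehash algorithm: linear probing, i-th attempt starting from h.
        return (h + i) % self.M

    def put(self, key, value):
        h = self._h(key)
        for i in range(self.M):
            j = self._rehash(h, i)
            if self.slots[j] is None or self.slots[j][0] == key:
                self.slots[j] = (key, value)   # empty bucket found, or key updated
                return
        raise RuntimeError("hash table is full")

    def get(self, key):
        h = self._h(key)
        for i in range(self.M):
            j = self._rehash(h, i)
            if self.slots[j] is None:    # an empty bucket ends the probe sequence
                return None
            if self.slots[j][0] == key:
                return self.slots[j][1]
        return None


t = OpenAddressingTable(M=8)
for k, v in [(1, "a"), (9, "b"), (17, "c")]:   # all three keys have hash bucket 1
    t.put(k, v)
```

Key 1 lands in its hash bucket 1, while 9 and 17 are pushed to buckets 2 and 3 by the rehash step.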
|
|