1131. Data Structure - HashMap
Hash, Load Factor, and Rehashing


Implement HashMap with array and linked list.

1. Concepts in HashMap(or Hash Table)

1.1 Hash Code and Compressor

Hash code is an Integer number (random or nonrandom). In Java every Object has its own hash code. We will use the hash code generated by JVM in our hash function and to compress the hash code we modulo(%) the hash code by size of the hash table. So modulo operator is compressor in our implementation.

1.2 Hash Function

Hash function hashes (converts) a number in a large range into a number in a smaller range. This smaller range corresponds to the index numbers in an array.

private int hashFunc(K key) {
    int hashCode = key.hashCode();
    int index = hashCode % numBuckets;
    return index;
}

1.3 Collision

If multiple keys has same hashCode, then collision occurs. Approaches to solve collision:

  • Open Addressing: move to an empty cell. clustering issue may happen.
  • Separate Chaining: store values in linked list instead of themselves.

1.4 Open Addressing Vs Separate Chaining

If open addressing is to be used, double hashing seems to be the preferred system by a small margin over quadratic probing. The exception is the situation in which plenty of memory is available and the data won’t expand after the table is created; in this case linear probing is somewhat simpler to implement and, if load factors below 0.5 are used, causes little performance penalty.
If the number of items that will be inserted in a hash table is unknown when the table is created, separate chaining is preferable to open addressing. Increasing the load factor causes major performance penalties in open addressing, but performance degrades only linearly in separate chaining. When in doubt, use separate chaining. Its drawback is the need for a linked list class, but the payoff is that adding more data than you anticipated won’t cause performance to slow to a crawl.

1.5 Load Factor and Rehashing

Load Factor is the ratio of the number of items in a hash table to its size. If the total number of buckets is 10 and 7 of them got filled now, the load factor is 7/10=0.7.

In our implementation whenever we add a key value pair to the Hash Table we check the load factor if it is greater than 0.7 we double the size of our hash table.

2. Implementing HashMap

2.1 Structure of HashMap

An array list contains Hash Nodes. Each node can have none or multiple descendant nodes. They have the same index, but contains different hashcode. image

2.2 Common Operations for HashMap

  • get(key): returns the value corresponding to the key if the key is present in HashMap
  • add(key, value): adds new valid key, value pair to the HashMap, if already present updates the value
  • remove(key): removes the key, value pair
  • size(): return the size of the HashMap
  • isEmpty(): returns true if size is zero

2.3 Time Complexity

  • get: $O(1)$
  • add: $O(1)$
  • remove: $O(1)$

2.4 HashNode

HashNode is a storage unit for storing date. It has the next attribute pointing to the next hashnode, behaves like a linked list.

public class HashNode<K, V> {
    public K key;
    public V val;
    public HashNode<K, V> next;
    public HashNode(K key, V val) {
        this.key = key;
        this.val = val;
        this.next = null;
    }
}

2.5 Implementation of HashMap

Generic HashMap.

public class HashMap<K, V> {
    // bucketArray is used to store array of chains
    private ArrayList<HashNode<K, V>> bucketList;
    // Current capacity of array list
    private int numBuckets;
    // Current size of array list
    private int size;

    // Constructor (Initializes capacity, size and empty chains.
    public HashMap() {
        bucketList = new ArrayList<>();
        numBuckets = 10;
        size = 0;

        // Create empty chains
        for (int i = 0; i < numBuckets; i++) {
            bucketList.add(null);
        }
    }

    public int size() {
        return size;
    }

    public boolean isEmpty() {
        return size() == 0;
    }

    // Returns value for a key
    public V get(K key) {
        // Find head of chain for given key
        int bucketIndex = hashFunc(key);
        HashNode<K, V> head = bucketList.get(bucketIndex);

        // Search key in chain
        while (head != null) {
            if (head.key.equals(key)) {
                return head.val;
            }
            head = head.next;
        }

        // If key not found
        return null;
    }

    // Adds a key value pair to hash
    public void add(K key, V value) {
        // Find head of chain for given key
        int bucketIndex = hashFunc(key);
        HashNode<K, V> head = bucketList.get(bucketIndex);

        // Check if key is already present
        while (head != null) {
            if (head.key.equals(key)) {
                head.val = value;
                return;
            }
            head = head.next;
        }

        // Insert key in chain
        size++;
        head = bucketList.get(bucketIndex);
        HashNode<K, V> newNode = new HashNode<K, V>(key, value);
        newNode.next = head; // add to header
        bucketList.set(bucketIndex, newNode);

        // If load factor goes beyond threshold, then double hash table size
        if ((1.0*size)/numBuckets >= 0.7) {
            ArrayList<HashNode<K, V>> tempList = bucketList;
            bucketList = new ArrayList<>();
            numBuckets = 2 * numBuckets; // double the capacity
            size = 0;
            for (int i = 0; i < numBuckets; i++) {
                bucketList.add(null);
            }

            for (HashNode<K, V> headNode : tempList) { // traverse array
                while (headNode != null) { // traverse each linked list
                    add(headNode.key, headNode.val);
                    headNode = headNode.next;
                }
            }
        }
    }

    // Method to remove a given key
    public V remove(K key) {
        // Apply hash function to find index for given key
        int bucketIndex = hashFunc(key);

        // Get head of chain
        HashNode<K, V> head = bucketList.get(bucketIndex);

        // Search for key in its chain
        HashNode<K, V> prev = null;
        while (head != null) {
            // If Key found
            if (head.key.equals(key)) {
                break;
            }

            // Else keep moving in chain
            prev = head;
            head = head.next;
        }

        // If key was not there
        if (head == null) {
            return null;
        }

        // Reduce size
        size--;

        // Remove key
        if (prev != null) {
            prev.next = head.next;
        } else {
            bucketList.set(bucketIndex, head.next);
        }

        return head.val;
    }

    // hash function
    private int hashFunc(K key) {
        int hashCode = key.hashCode();
        int index = hashCode % numBuckets;
        return index;
    }
}

Integer type, implemented with node.

class MyHashMap {
    class HashNode {
        int key, val;
        HashNode next;

        HashNode(int key, int val) {
            this.key = key;
            this.val = val;
        }
    }
    HashNode[] bucket;
    /** Initialize your data structure here. */
    public MyHashMap() {
        bucket = new HashNode[1000000];
    }

    /** value will always be non-negative. */
    public void put(int key, int value) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            bucket[hash] = new HashNode(key, value);
        } else {
            HashNode header = bucket[hash];
            while (header != null && header.next != null) {
                if (header.key == key) {
                    header.val = value;
                    return;
                }
                header = header.next;
            }
            if (header.key == key) {
                header.val = value;
            } else {
                header.next = new HashNode(key, value);
            }
        }
    }

    /** Returns the value to which the specified key is mapped, or -1 if this map contains no mapping for the key */
    public int get(int key) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            return -1;
        } else {
            HashNode header = bucket[hash];
            while (header != null) {
                if (header.key == key) {
                    return header.val;
                }
                header = header.next;
            }
            return -1;
        }
    }

    /** Removes the mapping of the specified value key if this map contains a mapping for the key */
    public void remove(int key) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            return;
        } else {
            HashNode dummy = new HashNode(0, 0);
            dummy.next = bucket[hash];
            HashNode prev = dummy;
            HashNode curr = dummy.next;
            while (curr != null) {
                if (curr.key == key) {
                    prev.next = curr.next;
                    break;
                }
                curr = curr.next;
                prev = prev.next;
            }
            bucket[hash] = dummy.next;
        }
    }

    private int hashFunc(int key) {
        return key % 1000000;
    }
}

2.6 Implementation of HashSet

Implemented with node.

public class MyHashSet {
    class HashNode {
        int key;
        HashNode next;

        HashNode(int key) {
            this.key = key;
        }
    }

    HashNode[] bucket;
    public MyHashSet() {
        bucket = new HashNode[1000000];
    }

    public void add(int key) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            bucket[hash] = new HashNode(key);
        } else {
            HashNode header = bucket[hash];
            while (header != null) {
                if (header.key == key) {
                    return;
                }
                if (header.next == null) {
                    header.next = new HashNode(key);
                } else {
                    header = header.next;
                }
            }
        }
    }

    public void remove(int key) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            return;
        } else {
            HashNode dummy = new HashNode(0);
            dummy.next = bucket[hash];
            HashNode prev = dummy;
            HashNode curr = dummy.next;
            while (curr != null) {
                if (curr.key == key) {
                    prev.next = curr.next;
                    break;
                }
                curr = curr.next;
                prev = prev.next;
            }
            bucket[hash] = dummy.next;
        }
    }

    public boolean contains(int key) {
        int hash = hashFunc(key);
        if (bucket[hash] == null) {
            return false;
        } else {
            HashNode header = bucket[hash];
            while (header != null) {
                if (header.key == key) {
                    return true;
                }
                header = header.next;
            }
            return false;
        }
    }

    private int hashFunc(int key) {
        return key % 1000000;
    }
}

Implemented with array.

public class MyHashSet {
    int[] arr;
    private int capacity = 1000000;

    // Initialize your data structure here.
    public MyHashSet() {
        arr = new int[capacity];
        // Create empty chains
        for (int i = 0; i < capacity; i++) {
            arr[i] = Integer.MIN_VALUE;
        }
    }

    public void add(int key) {
        int hash = hashFunc(key);
        arr[hash] = key;
    }

    public void remove(int key) {
        int hash = hashFunc(key);
        if (arr[hash] == key) {
            arr[hash] = Integer.MIN_VALUE;
        }
    }

    public boolean contains(int key) {
        int hash = hashFunc(key);
        return arr[hash] != Integer.MIN_VALUE;
    }

    // hash function
    private int hashFunc(int key) {
        return key % capacity;
    }
}

3. Classic Problems

4. Source Files

5. Reference