1131. Data Structure - HashMapHash, Load Factor, and Rehashing

Implement HashMap with array and linked list.

1. Concepts in HashMap(or Hash Table)

1.1 Hash Code and Compressor

Hash code is an Integer number (random or nonrandom). In Java every Object has its own hash code. We will use the hash code generated by JVM in our hash function and to compress the hash code we modulo(%) the hash code by size of the hash table. So modulo operator is compressor in our implementation.

1.2 Hash Function

Hash function hashes (converts) a number in a large range into a number in a smaller range. This smaller range corresponds to the index numbers in an array.

private int hashFunc(K key) {
int hashCode = key.hashCode();
int index = hashCode % numBuckets;
return index;
}


1.3 Collision

If multiple keys has same hashCode, then collision occurs. Approaches to solve collision:

• Open Addressing: move to an empty cell. clustering issue may happen.

1.4 Open Addressing Vs Separate Chaining

If open addressing is to be used, double hashing seems to be the preferred system by a small margin over quadratic probing. The exception is the situation in which plenty of memory is available and the data won’t expand after the table is created; in this case linear probing is somewhat simpler to implement and, if load factors below 0.5 are used, causes little performance penalty.
If the number of items that will be inserted in a hash table is unknown when the table is created, separate chaining is preferable to open addressing. Increasing the load factor causes major performance penalties in open addressing, but performance degrades only linearly in separate chaining. When in doubt, use separate chaining. Its drawback is the need for a linked list class, but the payoff is that adding more data than you anticipated won’t cause performance to slow to a crawl.

Load Factor is the ratio of the number of items in a hash table to its size. If the total number of buckets is 10 and 7 of them got filled now, the load factor is 7/10=0.7.

In our implementation whenever we add a key value pair to the Hash Table we check the load factor if it is greater than 0.7 we double the size of our hash table.

2. Implementing HashMap

2.1 Structure of HashMap

An array list contains Hash Nodes. Each node can have none or multiple descendant nodes. They have the same index, but contains different hashcode.

2.2 Common Operations for HashMap

• get(key): returns the value corresponding to the key if the key is present in HashMap
• remove(key): removes the key, value pair
• size(): return the size of the HashMap
• isEmpty(): returns true if size is zero

2.3 Time Complexity

• get: $O(1)$
• add: $O(1)$
• remove: $O(1)$

2.4 HashNode

HashNode is a storage unit for storing date. It has the next attribute pointing to the next hashnode, behaves like a linked list.

public class HashNode<K, V> {
public K key;
public V val;
public HashNode<K, V> next;
public HashNode(K key, V val) {
this.key = key;
this.val = val;
this.next = null;
}
}


2.5 Implementation of HashMap

Generic HashMap.

public class HashMap<K, V> {
// bucketArray is used to store array of chains
private ArrayList<HashNode<K, V>> bucketList;
// Current capacity of array list
private int numBuckets;
// Current size of array list
private int size;

// Constructor (Initializes capacity, size and empty chains.
public HashMap() {
bucketList = new ArrayList<>();
numBuckets = 10;
size = 0;

// Create empty chains
for (int i = 0; i < numBuckets; i++) {
}
}

public int size() {
return size;
}

public boolean isEmpty() {
return size() == 0;
}

// Returns value for a key
public V get(K key) {
// Find head of chain for given key
int bucketIndex = hashFunc(key);

// Search key in chain
}
}

return null;
}

// Adds a key value pair to hash
public void add(K key, V value) {
// Find head of chain for given key
int bucketIndex = hashFunc(key);

// Check if key is already present
return;
}
}

// Insert key in chain
size++;
HashNode<K, V> newNode = new HashNode<K, V>(key, value);
bucketList.set(bucketIndex, newNode);

// If load factor goes beyond threshold, then double hash table size
if ((1.0*size)/numBuckets >= 0.7) {
ArrayList<HashNode<K, V>> tempList = bucketList;
bucketList = new ArrayList<>();
numBuckets = 2 * numBuckets; // double the capacity
size = 0;
for (int i = 0; i < numBuckets; i++) {
}

for (HashNode<K, V> headNode : tempList) { // traverse array
}
}
}
}

// Method to remove a given key
public V remove(K key) {
// Apply hash function to find index for given key
int bucketIndex = hashFunc(key);

// Search for key in its chain
HashNode<K, V> prev = null;
// If Key found
break;
}

// Else keep moving in chain
}

// If key was not there
return null;
}

// Reduce size
size--;

// Remove key
if (prev != null) {
} else {
}

}

// hash function
private int hashFunc(K key) {
int hashCode = key.hashCode();
int index = hashCode % numBuckets;
return index;
}
}


Integer type, implemented with node.

class MyHashMap {
class HashNode {
int key, val;
HashNode next;

HashNode(int key, int val) {
this.key = key;
this.val = val;
}
}
HashNode[] bucket;
/** Initialize your data structure here. */
public MyHashMap() {
bucket = new HashNode[1000000];
}

/** value will always be non-negative. */
public void put(int key, int value) {
int hash = hashFunc(key);
if (bucket[hash] == null) {
bucket[hash] = new HashNode(key, value);
} else {
return;
}
}
} else {
}
}
}

/** Returns the value to which the specified key is mapped, or -1 if this map contains no mapping for the key */
public int get(int key) {
int hash = hashFunc(key);
if (bucket[hash] == null) {
return -1;
} else {
}
}
return -1;
}
}

/** Removes the mapping of the specified value key if this map contains a mapping for the key */
public void remove(int key) {
int hash = hashFunc(key);
if (bucket[hash] == null) {
return;
} else {
HashNode dummy = new HashNode(0, 0);
dummy.next = bucket[hash];
HashNode prev = dummy;
HashNode curr = dummy.next;
while (curr != null) {
if (curr.key == key) {
prev.next = curr.next;
break;
}
curr = curr.next;
prev = prev.next;
}
bucket[hash] = dummy.next;
}
}

private int hashFunc(int key) {
return key % 1000000;
}
}


2.6 Implementation of HashSet

Implemented with node.

public class MyHashSet {
class HashNode {
int key;
HashNode next;

HashNode(int key) {
this.key = key;
}
}

HashNode[] bucket;
public MyHashSet() {
bucket = new HashNode[1000000];
}

int hash = hashFunc(key);
if (bucket[hash] == null) {
bucket[hash] = new HashNode(key);
} else {
return;
}
} else {
}
}
}
}

public void remove(int key) {
int hash = hashFunc(key);
if (bucket[hash] == null) {
return;
} else {
HashNode dummy = new HashNode(0);
dummy.next = bucket[hash];
HashNode prev = dummy;
HashNode curr = dummy.next;
while (curr != null) {
if (curr.key == key) {
prev.next = curr.next;
break;
}
curr = curr.next;
prev = prev.next;
}
bucket[hash] = dummy.next;
}
}

public boolean contains(int key) {
int hash = hashFunc(key);
if (bucket[hash] == null) {
return false;
} else {
return true;
}
}
return false;
}
}

private int hashFunc(int key) {
return key % 1000000;
}
}


Implemented with array.

public class MyHashSet {
int[] arr;
private int capacity = 1000000;

// Initialize your data structure here.
public MyHashSet() {
arr = new int[capacity];
// Create empty chains
for (int i = 0; i < capacity; i++) {
arr[i] = Integer.MIN_VALUE;
}
}

int hash = hashFunc(key);
arr[hash] = key;
}

public void remove(int key) {
int hash = hashFunc(key);
if (arr[hash] == key) {
arr[hash] = Integer.MIN_VALUE;
}
}

public boolean contains(int key) {
int hash = hashFunc(key);
return arr[hash] != Integer.MIN_VALUE;
}

// hash function
private int hashFunc(int key) {
return key % capacity;
}
}