HashMaps & Dictionaries, Explained Simply
Channel: Nic Barker
Data Structure • easy
Hash maps store key-value pairs and are one of the most important tools for reducing nested loops to single-pass solutions.
They are powered by hash functions that map keys to buckets. Collisions happen and are resolved internally with chaining or probing.
In practice, hash maps are everywhere: caching, joins, deduplication, indexing, rate limits, and identity maps.
Typical Complexity Baseline
| Metric | Value |
|---|---|
| Insert/find average | O(1) |
| Insert/find worst case | O(n) |
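The nested-loop reduction mentioned above is easiest to see on a concrete problem like Two Sum: the brute-force version compares every pair, while a hash map remembers what it has already seen and finishes in one pass. A minimal sketch (function names are illustrative):

```typescript
// Brute force: check every pair of indices -> O(n^2).
function twoSumBrute(nums: number[], target: number): [number, number] | null {
  for (let i = 0; i < nums.length; i++) {
    for (let j = i + 1; j < nums.length; j++) {
      if (nums[i] + nums[j] === target) return [i, j]
    }
  }
  return null
}

// Hash map: remember each value's index, single pass -> O(n) average.
function twoSumHash(nums: number[], target: number): [number, number] | null {
  const indexByValue = new Map<number, number>()
  for (let i = 0; i < nums.length; i++) {
    const complement = target - nums[i]
    const j = indexByValue.get(complement)
    if (j !== undefined) return [j, i] // complement seen earlier: done
    indexByValue.set(nums[i], i)
  }
  return null
}
```

Both return the same answer; the hash-map version trades O(n) extra memory for dropping the inner loop.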
See It
A small animated SVG that illustrates the core idea: the moving parts you need to keep in your head when you read the rest of the guide.
The key is hashed and reduced modulo the bucket count. Collisions chain inside one bucket.
Watch First
A quick visual walkthrough before the detailed sections. Pick the format that helps you learn fastest.
Mechanics
The building blocks and terminology you need before everything else clicks. Skim or deep-read.
What it is A hash function turns a key (like "user:42") into a deterministic integer. The table then maps that integer to a bucket index, typically by taking it modulo the bucket count.
Why it matters Distribution quality controls performance. If hashes spread keys evenly, bucket scans stay short and lookup remains close to O(1) average. Good hash functions are fast and distribute keys uniformly. They do not need to be cryptographic for hash map usage, but they must reduce clustering.
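The hash-then-modulo step can be sketched with a common non-cryptographic hash, 32-bit FNV-1a. The key point is determinism: the same key always produces the same integer, so it always lands in the same bucket.

```typescript
// 32-bit FNV-1a: fast, deterministic, reasonably uniform for short strings.
function fnv1a(key: string): number {
  let h = 2166136261 // FNV offset basis
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i)
    h = Math.imul(h, 16777619) // FNV prime, 32-bit overflow-safe multiply
  }
  return h >>> 0 // force unsigned 32-bit
}

const bucketCount = 8
const index = fnv1a("user:42") % bucketCount // a valid bucket index in [0, 8)
```

`Math.imul` keeps the multiplication inside 32-bit integer arithmetic, which plain `*` on JavaScript numbers would not.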
What it is Buckets are storage slots in the table. Each bucket holds zero or more entries, usually as a linked list or dynamic array.
Why it matters Bucket count determines collision pressure. Too few buckets means long chains and slower operations. More buckets means better speed but higher memory use.
What it is A collision happens when two different keys land in the same bucket index. Common resolution strategies are separate chaining and open addressing.
Why it matters Collisions are normal; correctness comes from storing full key-value pairs and checking key equality during lookup, not from hoping hashes are unique. Without proper collision handling, inserts overwrite each other and lookups return wrong results.
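A minimal separate-chaining sketch makes the key-equality check concrete. It uses a deliberately bad toy hash (key length) so that "bob" and "eve" collide on purpose; the bucket stores full key-value pairs and compares whole keys on lookup:

```typescript
// Toy hash: key length modulo bucket count. "bob" and "eve" (both length 3) collide.
const BUCKETS = 4
const table: Array<Array<[string, number]>> = Array.from({ length: BUCKETS }, () => [])

function put(key: string, value: number): void {
  const bucket = table[key.length % BUCKETS]
  const entry = bucket.find(([k]) => k === key) // full-key equality, not hash equality
  if (entry) entry[1] = value // existing key: update in place
  else bucket.push([key, value]) // new key: chain onto the same bucket
}

function get(key: string): number | undefined {
  const bucket = table[key.length % BUCKETS]
  return bucket.find(([k]) => k === key)?.[1]
}

put("bob", 1)
put("eve", 2) // same bucket as "bob", but both entries survive
```

Without the `find` on the full key, the second insert would clobber the first, which is exactly the failure mode described above.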
What it is Load factor is entryCount / bucketCount (n / m). It measures how full the table is.
Why it matters As load factor rises, collisions increase and average lookup cost climbs. High load means more entries per bucket, so most implementations resize before the load factor gets too high; it is the control metric for balancing speed against memory.
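The load-factor check itself is a one-liner. A common policy, and the one used in the reference implementation in this guide, is to resize once the ratio reaches about 0.75 (the exact threshold varies by library):

```typescript
// Load factor: entries per bucket on average.
function loadFactor(entryCount: number, bucketCount: number): number {
  return entryCount / bucketCount
}

// Typical policy: grow the table once load factor reaches ~0.75.
function shouldResize(entryCount: number, bucketCount: number, maxLoad = 0.75): boolean {
  return loadFactor(entryCount, bucketCount) >= maxLoad
}
```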
What it is Rehashing allocates a larger bucket table and reinserts every existing entry so each key lands in its new bucket index.
Why it matters Rehashing keeps long-term performance stable. A single resize is O(n), but amortized over many inserts it preserves near-O(1) average insertion and lookup, preventing every operation from getting slower as data grows.
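The amortized claim can be checked with arithmetic. If the table doubles each time it fills (a simplified resize-at-load-1.0 policy, chosen here to keep the sum clean), the total rehashing work over n inserts is 1 + 2 + 4 + ... which stays below 2n:

```typescript
// Count total rehash work while inserting n entries into a doubling table.
function totalResizeWork(n: number): number {
  let capacity = 1
  let entries = 0
  let work = 0
  for (let i = 0; i < n; i++) {
    if (entries === capacity) { // table full: every existing entry is reinserted
      work += entries
      capacity *= 2
    }
    entries += 1
  }
  return work
}
```

For n = 1024 the resize work is 1 + 2 + 4 + ... + 512 = 1023, i.e. less than one extra move per insert on average, which is why the O(n) resize events do not break the O(1) amortized bound.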
Walkthrough
Every frame of the concept diagram, laid out top-to-bottom so you can scroll through the algorithm at your own pace.
Empty map with 8 buckets. We'll insert several keys, force a collision, then look one up.
Insert 'alice'. hash('alice') % 8 = 3.
Place 'alice' in bucket 3.
Insert 'bob'. hash('bob') % 8 = 6.
Place 'bob' in bucket 6.
Insert 'carol'. hash('carol') % 8 = 3, a collision with 'alice'.
Chain 'carol' onto bucket 3 (separate chaining).
Insert 'dave'. hash % 8 = 0.
Place 'dave' in bucket 0.
Insert 'eve'. hash % 8 = 5.
Place 'eve' in bucket 5.
Insert 'frank'. hash('frank') % 8 = 3, another collision in bucket 3.
Bucket 3's chain is now 3 entries long. As the load factor climbs, the map will resize.
Lookup 'carol'. hash % 8 = 3.
Walk the chain in bucket 3: alice ≠ carol → carol = carol → found. O(1) average; O(chain length) worst case.
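The frames above can be replayed as code. The bucket indices here are the hypothetical hash results from the walkthrough, hard-coded in a lookup table rather than computed by a real hash function:

```typescript
// Hypothetical bucket indices matching the walkthrough frames (not real hash outputs).
const fakeIndex: Record<string, number> = {
  alice: 3, bob: 6, carol: 3, dave: 0, eve: 5, frank: 3,
}

// 8 empty buckets; each bucket is a chain of keys.
const buckets: Array<Array<string>> = Array.from({ length: 8 }, () => [])
for (const key of ["alice", "bob", "carol", "dave", "eve", "frank"]) {
  buckets[fakeIndex[key]].push(key) // collisions simply share a bucket
}

// Lookup 'carol': jump to bucket 3, then compare full keys along the chain.
const found = buckets[3].find((k) => k === "carol")
```

After the inserts, bucket 3 holds the three colliding keys and the lookup still succeeds by walking the chain, which is the O(chain length) worst case the walkthrough ends on.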
Tradeoffs
When this technique wins, when it loses, and what to reach for instead.
Choose this when Choose hash maps for key-based lookup not tied to numeric index.
Tradeoff Memory overhead is higher than compact arrays.
Choose this when Choose hash maps for exact-key retrieval.
Tradeoff Trie performs better for prefix-heavy workloads.
Choose this when Choose hash maps for fastest average lookup.
Tradeoff Tree maps keep keys sorted and support range queries.
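The sorted-structure tradeoff is worth seeing concretely. A hash map has no key order, so a range query must scan every entry; a sorted array can binary-search the lower bound and read off the matches. A sketch (sorted array standing in for a tree map):

```typescript
// Range query over a hash map: no order, so scan all keys -> O(n).
function rangeFromMap(m: Map<number, string>, lo: number, hi: number): number[] {
  return [...m.keys()].filter((k) => k >= lo && k <= hi).sort((a, b) => a - b)
}

// Range query over a sorted array: binary search the lower bound -> O(log n + k).
function rangeFromSorted(sorted: number[], lo: number, hi: number): number[] {
  let left = 0
  let right = sorted.length
  while (left < right) { // find first index whose value is >= lo
    const mid = (left + right) >> 1
    if (sorted[mid] < lo) left = mid + 1
    else right = mid
  }
  const out: number[] = []
  for (let i = left; i < sorted.length && sorted[i] <= hi; i++) out.push(sorted[i])
  return out
}
```

Both return the same answer; the difference is that the hash-map version touches every entry no matter how small the range is.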
In Production
Where this shows up at real companies and what the on-call engineer learned the hard way.
Redis dictionaries are hash-table based and power core key lookup behavior.
Takeaway Fast key indexing is foundational for low-latency caching.
Order and session services use hash-style key lookup for idempotency and dedup checks.
Takeaway Hash maps help enforce correctness in high-throughput commerce workflows.
Rate-limiting and bot protection paths rely on quick key counters by IP/token signatures.
Takeaway Hash maps turn high-cardinality counting into practical real-time control.
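The rate-limiting pattern reduces to a hash map of counters keyed by client identity and time window. A minimal fixed-window sketch (class and method names are illustrative, not any particular library's API):

```typescript
// Fixed-window rate limiter: one counter per (client, window) key.
class FixedWindowLimiter {
  private counts = new Map<string, number>()

  constructor(private limit: number, private windowMs: number) {}

  allow(clientId: string, nowMs: number): boolean {
    const window = Math.floor(nowMs / this.windowMs)
    const key = `${clientId}:${window}` // high-cardinality key, O(1) average lookup
    const used = this.counts.get(key) ?? 0
    if (used >= this.limit) return false
    this.counts.set(key, used + 1)
    return true
  }
}
```

A production version would also evict expired window keys; the core idea is simply that per-key counting stays O(1) average even across millions of distinct clients.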
Code
A clean reference implementation in TypeScript.
Build your own bucket table, hash function, collision chain, and resize flow so the structure is explicit rather than hidden behind built-in map classes.
Complexity: Average insert/get O(1), resize O(n), space O(n)
Hash map implementation from scratch (chaining)
type HashNode = { key: string; value: number; next: HashNode | null }

class SimpleHashMap {
  private buckets: Array<HashNode | null>
  private entryCount = 0

  constructor(private capacity = 16) {
    this.buckets = new Array<HashNode | null>(capacity).fill(null)
  }

  // FNV-1a hash of the key, reduced modulo the current bucket count.
  private hashKey(key: string): number {
    let hashValue = 2166136261
    for (let index = 0; index < key.length; index += 1) {
      hashValue ^= key.charCodeAt(index)
      hashValue = Math.imul(hashValue, 16777619)
    }
    return (hashValue >>> 0) % this.capacity
  }

  // Double the table and reinsert every entry once load factor reaches 0.75.
  private resizeIfNeeded(): void {
    const loadFactor = this.entryCount / this.capacity
    if (loadFactor < 0.75) {
      return
    }
    const oldBuckets = this.buckets
    this.capacity *= 2
    this.buckets = new Array<HashNode | null>(this.capacity).fill(null)
    this.entryCount = 0
    for (const bucketHead of oldBuckets) {
      let currentNode = bucketHead
      while (currentNode) {
        this.set(currentNode.key, currentNode.value)
        currentNode = currentNode.next
      }
    }
  }

  set(key: string, value: number): void {
    this.resizeIfNeeded()
    const bucketIndex = this.hashKey(key)
    // Update in place if the key already exists in this bucket's chain.
    let currentNode = this.buckets[bucketIndex]
    while (currentNode) {
      if (currentNode.key === key) {
        currentNode.value = value
        return
      }
      currentNode = currentNode.next
    }
    // Otherwise prepend a new node to the chain (separate chaining).
    this.buckets[bucketIndex] = { key, value, next: this.buckets[bucketIndex] }
    this.entryCount += 1
  }

  get(key: string): number | null {
    const bucketIndex = this.hashKey(key)
    // Walk the chain comparing full keys, not just hashes.
    let currentNode = this.buckets[bucketIndex]
    while (currentNode) {
      if (currentNode.key === key) {
        return currentNode.value
      }
      currentNode = currentNode.next
    }
    return null
  }
}

// Example use: return the first value that appears twice, or null.
function firstDuplicateUsingCustomHashMap(values: number[]): number | null {
  const countsByValue = new SimpleHashMap()
  for (const value of values) {
    const key = String(value)
    const previousCount = countsByValue.get(key) ?? 0
    const nextCount = previousCount + 1
    countsByValue.set(key, nextCount)
    if (nextCount > 1) {
      return value
    }
  }
  return null
}
Watch Out
The bugs and edge cases that bite engineers most often. Skim before you ship.
Average O(1) is not guaranteed O(1): adversarial keys that collide on purpose can degrade lookups toward O(n).
Pro Tips
Small habits that pay off every time you reach for this pattern.
Pattern Recognition
Concrete signals that tell you this is the right pattern both at work and on LeetCode.
O(1) lookup/insert/delete for key-based access patterns.
Practice
A curated easy-to-hard problem ladder. Each one reinforces a specific aspect of the pattern.
| # | Problem | Difficulty | Typical Complexity |
|---|---|---|---|
| 1 | Contains Duplicate | Easy | O(n) |
| 2 | Valid Anagram | Easy | O(n) |
| 3 | Two Sum | Easy | O(n) |
| 4 | Group Anagrams | Medium | O(n*k log k) |
| 5 | Top K Frequent Elements | Medium | O(n log k) |
| 6 | Subarray Sum Equals K | Medium | O(n) |
| 7 | Longest Consecutive Sequence | Medium | O(n) |
| 8 | 4Sum II | Medium | O(n^2) |
| 9 | Minimum Window Substring | Hard | O(n) |
| 10 | LFU Cache | Hard | O(1) amortized |