Disjoint Set Union (Union-Find) Complete Guide

Data Structure • extra hard

Union-Find (Disjoint Set Union) tracks which items are in the same connected component.

It is extremely efficient for repeated connectivity checks with dynamic unions.

Path compression + union by rank/size makes operations nearly constant in practice.

Typical Complexity Baseline

Metric	Value
Complexity Note 1	Near `O(1)` amortized

See It

Concept Diagram

A small animated SVG that illustrates the core idea the moving parts you need to keep in your head when you read the rest of the guide.

Union-find merge two sets

find returns each set's root. union attaches the smaller-rank root under the larger one.

Step 1/157 elements, all in their own singleton set. parent[i] = i.

Speed

Watch First

Video Explainer

A quick visual walkthrough before the detailed sections pick the format that helps you learn fastest.

Union Find in 5 minutes Data Structures & Algorithms

Channel: Potato Coders

Watch on YouTube →

Mechanics

Core Concepts

The building blocks and terminology you need before everything else clicks. Skim or deep-read.

Parent array

What it is Points each node to representative parent.

Why it matters Parent array is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Find

What it is Resolve component representative.

Why it matters Find is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Union

What it is Merge two components.

Why it matters Union is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Rank/size

What it is Heuristic for shallow trees.

Why it matters Rank/size is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs.

Path compression

What it is Flattens trees during find operations. Rewire node directly to root during find.

Why it matters Path compression is a required building block for understanding how Disjoint Set Union (Union-Find) stays correct and performant on large inputs. Rewire node directly to root during find. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Representative

What it is Canonical root id for component.

Why it matters Canonical root id for component. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Union by rank

What it is Attach smaller-depth tree under larger-depth root.

Why it matters Attach smaller-depth tree under larger-depth root. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Amortized complexity

What it is Average cost across operation sequence.

Why it matters Average cost across operation sequence. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Disjoint sets

What it is Non-overlapping groups of elements.

Why it matters Non-overlapping groups of elements. In Disjoint Set Union (Union-Find), this definition helps you reason about correctness and complexity when inputs scale.

Walkthrough

Putting It All Together

Every frame of the concept diagram, laid out top-to-bottom so you can scroll through the algorithm at your own pace.

Step 1/15

7 elements, all in their own singleton set. parent[i] = i.

Step 2/15

union(1, 2): roots are 1, 2. Different → attach 2 under 1.

Step 3/15

Set {1, 2} formed.

Step 4/15

union(3, 4): attach 4 under 3.

Step 5/15

Three groups now: {1,2}, {3,4}, plus singletons {5}, {6}, {7}.

Step 6/15

union(5, 6): attach 6 under 5.

Step 7/15

Groups: {1,2}, {3,4}, {5,6}, {7}.

Step 8/15

union(2, 4): first find(2). Walk to root → 1. (Then find(4) → 3.) Highlighting find(4) walk.

Step 9/15

Roots 1 ≠ 3 → merge. Attach 3 under 1 (so the {3,4} subtree dangles under 1).

Step 10/15

Set {1, 2, 3, 4} formed in one tree rooted at 1.

Step 11/15

union(1, 5): roots 1 and 5. Attach 5 under 1.

Step 12/15

Now {1, 2, 3, 4, 5, 6} all share root 1.

Step 13/15

union(7, 6): find(7) → 7 (still singleton). find(6) → 5 → 1.

Step 14/15

Attach 7 under 1.

Step 15/15

All 7 elements in one connected component. Each find/union is amortized O(α(n)) with path compression + union by rank.

Tradeoffs

How It Compares

When this technique wins, when it loses, and what to reach for instead.

vs. Graph traversal per query

Choose this when Choose Union-Find for many connectivity checks over many unions.

Tradeoff Traversal better when full path details are required.

vs. Adjacency matrix

Choose this when Choose Union-Find for sparse dynamic edge streams.

Tradeoff Matrix offers O(1) direct edge existence lookup but high memory.

vs. Topological sort

Choose this when Choose Union-Find for undirected connectivity grouping.

Tradeoff Topological sort is for directed acyclic dependency ordering.

In Production

Real-World Stories

Where this shows up at real companies and what the on-call engineer learned the hard way.

Networking platforms

Union-Find style component grouping helps reason about cluster and segment connectivity.

Takeaway Connectivity grouping is a common infrastructure primitive.

GIS tools

Spatial region merging and connectivity analyses use disjoint-set logic.

Takeaway Union-Find is practical for geometric grouping workflows.

Data quality pipelines

Entity resolution systems merge records into clusters of same real-world entity.

Takeaway Union operations model transitive equivalence elegantly.

Code

Implementation

A clean reference implementation in TypeScript, Python, and Go. Switch languages with the toggle.

Maintain parent + rank arrays and compress path on every find.

Complexity: Amortized almost O(1)

Union-Find with path compression

Language

class UnionFind {
  private readonly parentByNode: number[]
  private readonly rankByNode: number[]

  constructor(nodeCount: number) {
    this.parentByNode = Array.from({ length: nodeCount }, (_, index) => index)
    this.rankByNode = new Array<number>(nodeCount).fill(0)
  }

  find(nodeId: number): number {
    if (this.parentByNode[nodeId] !== nodeId) {
      this.parentByNode[nodeId] = this.find(this.parentByNode[nodeId])
    }
    return this.parentByNode[nodeId]
  }
}

Watch Out

Common Problems and Failure Modes

The bugs and edge cases that bite engineers most often. Skim before you ship.

Skipping path compression and losing near-constant behavior.
Not handling out-of-range node ids.
Using recursion for find without stack consideration in some languages.
Confusing rank and size semantics during union.
Expecting Union-Find to return explicit path, not just connectivity.

Pro Tips

Tips and Tricks

Small habits that pay off every time you reach for this pattern.

Use union-find when repeated connectivity checks occur in dynamic grouping problems.
Always implement path compression and union by rank/size for near-constant performance.
Initialize parent pointers clearly for each node/index domain.
Great for cycle detection in undirected graph edge streams.

Pattern Recognition

When to Use

Concrete signals that tell you this is the right pattern both at work and on LeetCode.

Real-system usage signals

Fast connectivity queries in evolving graphs.
Simple implementation with strong practical performance.
Core building block for Kruskal MST and cycle detection.

LeetCode-specific tips (pattern-identification signals)

Identification signal: repeated connectivity queries interleaved with merge/union operations.
Identification signal: you need to know whether two nodes belong to same component quickly many times.
Dynamic graph connectivity with incremental edges is a classic Union-Find trigger.
For Disjoint Set Union (Union-Find) questions, start by naming the core invariant before writing code.
Use the constraint section to set time/space target first, then pick the data structure/algorithm.
Solve one tiny example by hand and map each transition to your variables before implementing.
Run adversarial cases: empty input, duplicates, max-size input, and sorted/reverse patterns when relevant.
During interviews, explain why your approach is the right pattern for this prompt, not just why the code works.

Practice

LeetCode Progression

A curated easy-to-hard problem ladder. Each one reinforces a specific aspect of the pattern.

#	Problem	Difficulty	Typical Complexity
1	Number of Provinces	Medium	`O(n^2)`
2	Redundant Connection	Medium	`O(n alpha(n))`
3	Accounts Merge	Medium	`O(n alpha(n))`
4	Satisfiability of Equality Equations	Medium	`O(n alpha(n))`
5	Number of Connected Components in an Undirected Graph	Medium	`O(V+E)`
6	Most Stones Removed with Same Row or Column	Medium	`O(n alpha(n))`
7	Graph Valid Tree	Medium	`O(V+E)`
8	Min Cost to Connect All Points	Medium	`O(n^2 log n)`
9	Number of Islands II	Hard	`O(k alpha(n))`
10	Swim in Rising Water	Hard	`O(n^2 log n)`