DFS Algorithm Study Guide for Beginners
What is DFS (Depth-First Search)?
DFS is a graph traversal algorithm that explores as far as possible along each branch before backtracking.
Think of it like exploring a maze by always taking the first available path and going as deep as possible
before turning back.
How the Code Works - Step by Step
1. Data Structures Used
Graph: Stored as an adjacency list using a HashMap<String, List<String>>
Color: Tracks vertex states (WHITE=unvisited, GREY=processing, BLACK=finished)
Discovery Time: When we first visit a vertex
Finish Time: When we complete processing a vertex
Predecessor: Parent vertex in the DFS tree
Time: Global counter that increments with each discovery/finish
2. Three-Color System
WHITE (0): Vertex hasn't been discovered yet
GREY (1): Vertex is discovered but still being processed (we're exploring its neighbors)
BLACK (2): Vertex and all its descendants are completely processed
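The data structures and color codes above could be declared roughly as follows. This is only a sketch: the class name DfsState, the field names, and the addEdge() helper are assumptions made for illustration, not names taken from the code this guide describes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical container for the DFS bookkeeping listed above.
class DfsState {
    // Color codes, using the numeric values from the three-color system.
    static final int WHITE = 0; // not discovered yet
    static final int GREY = 1;  // discovered, neighbors still being explored
    static final int BLACK = 2; // vertex and all its descendants finished

    // Adjacency list: vertex name -> list of neighbor names.
    final Map<String, List<String>> graph = new HashMap<>();

    // Per-vertex bookkeeping, keyed by vertex name.
    final Map<String, Integer> color = new HashMap<>();
    final Map<String, Integer> discovery = new HashMap<>();
    final Map<String, Integer> finish = new HashMap<>();
    final Map<String, String> predecessor = new HashMap<>();

    // Global counter, incremented on every discovery and every finish.
    int time = 0;

    // Adds a directed edge and makes sure both endpoints have an entry.
    void addEdge(String from, String to) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        graph.computeIfAbsent(to, k -> new ArrayList<>());
    }
}
```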
3. Algorithm Flow
Initialization Phase (dfs() method):
1. Set all vertices to WHITE (unvisited)
2. Set all predecessors to null
3. Set all discovery and finish times to -1
4. Reset global time to 0
Main Loop:
1. Go through each vertex in a fixed order (S first, then alphabetical, in the example in section 5)
2. If a vertex is WHITE (unvisited), call dfsVisit() on it
3. This creates a DFS forest (multiple trees if graph is disconnected)
DFS Visit Phase (dfsVisit() method):
1. Mark GREY: Color the vertex grey (currently processing)
2. Record Discovery: Increment time and record when we discovered this vertex
3. Explore Neighbors: Look at all adjacent vertices
If neighbor is WHITE: Set current vertex as its predecessor and recursively visit it
If neighbor is GREY or BLACK: Skip it (still being processed or already finished)
4. Mark BLACK: After exploring all neighbors, color the vertex black
5. Record Finish: Increment time and record when we finished with this vertex
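The initialization, main loop, and visit phases above can be sketched in Java as shown below. The method names dfs() and dfsVisit() come from this guide; the class name SimpleDfs, the field names, and the explicit order list are assumptions. The sketch also assumes that every vertex, including every neighbor, appears in order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the three-color DFS described above (names are illustrative).
class SimpleDfs {
    static final int WHITE = 0, GREY = 1, BLACK = 2;

    final Map<String, List<String>> graph;  // adjacency list
    final List<String> order;               // order in which the main loop scans vertices
    final Map<String, Integer> color = new HashMap<>();
    final Map<String, Integer> discovery = new HashMap<>();
    final Map<String, Integer> finish = new HashMap<>();
    final Map<String, String> predecessor = new HashMap<>();
    int time;

    SimpleDfs(Map<String, List<String>> graph, List<String> order) {
        this.graph = graph;
        this.order = order;
    }

    void dfs() {
        // Initialization phase: everything unvisited, no parents, no times.
        for (String v : order) {
            color.put(v, WHITE);
            predecessor.put(v, null);
            discovery.put(v, -1);
            finish.put(v, -1);
        }
        time = 0;

        // Main loop: each WHITE vertex found here becomes the root of a new tree.
        for (String v : order) {
            if (color.get(v) == WHITE) {
                dfsVisit(v);
            }
        }
    }

    void dfsVisit(String u) {
        color.put(u, GREY);            // 1. mark GREY
        time++;
        discovery.put(u, time);        // 2. record discovery

        for (String v : graph.getOrDefault(u, List.of())) {  // 3. explore neighbors
            if (color.get(v) == WHITE) {
                predecessor.put(v, u);
                dfsVisit(v);
            }
            // GREY or BLACK neighbors are skipped.
        }

        color.put(u, BLACK);           // 4. mark BLACK
        time++;
        finish.put(u, time);           // 5. record finish
    }
}
```

After dfs() returns, every vertex is BLACK and time equals twice the number of vertices, because each vertex consumes one tick on discovery and one on finish.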
4. Why This Works
Recursive Nature:
DFS naturally uses recursion (or an explicit stack)
When we find an unvisited neighbor, we immediately explore it completely
Only after finishing with a neighbor do we continue with other neighbors
Backtracking:
When we reach a vertex with no unvisited neighbors, we "backtrack"
This happens automatically when recursive calls return
Time Tracking:
Discovery time: When we first encounter a vertex
Finish time: When we're done with a vertex and all its descendants
These times help us understand the traversal order and detect cycles
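To see that recursion is just an implicit stack, here is a hedged sketch of the same traversal written with an explicit stack. It is not the implementation this guide describes: the class name IterativeDfs, the method times(), and the int[] {discovery, finish} result format are all assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Illustrative stack-based DFS; "backtracking" is simply popping the stack.
class IterativeDfs {
    // Returns vertex -> {discoveryTime, finishTime}.
    static Map<String, int[]> times(Map<String, List<String>> graph, List<String> order) {
        Map<String, int[]> times = new HashMap<>();           // absent key = WHITE
        Map<String, Iterator<String>> pending = new HashMap<>();
        Deque<String> stack = new ArrayDeque<>();
        int time = 0;

        for (String root : order) {
            if (times.containsKey(root)) continue;             // already visited
            times.put(root, new int[] {++time, -1});           // discover root
            pending.put(root, graph.getOrDefault(root, List.of()).iterator());
            stack.push(root);

            while (!stack.isEmpty()) {
                String u = stack.peek();
                Iterator<String> it = pending.get(u);
                if (it.hasNext()) {
                    String v = it.next();
                    if (!times.containsKey(v)) {               // WHITE neighbor: go deeper
                        times.put(v, new int[] {++time, -1});
                        pending.put(v, graph.getOrDefault(v, List.of()).iterator());
                        stack.push(v);
                    }
                } else {
                    stack.pop();                               // no neighbors left: backtrack
                    times.get(u)[1] = ++time;                  // record finish
                }
            }
        }
        return times;
    }
}
```

Keeping one iterator per vertex lets the loop resume a vertex's neighbor list exactly where it left off, which is what a returning recursive call does automatically.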
5. Execution Example with Our Graph
Starting with vertices in order: S, A, B, C, D, E, F, G
1. Start with S (first vertex):
S discovered (time 1)
Visit S's neighbors: A, B, C, D (in order)
A discovered (time 2)
Visit A's neighbors: B, C
B discovered (time 3)
Visit B's neighbor: S (already GREY, skip)
B finished (time 4)
C discovered (time 5)
Visit C's neighbor: B (already BLACK, skip)
C finished (time 6)
A finished (time 7)
Continue with S's remaining neighbors: B(BLACK), C(BLACK), D
D discovered (time 8)
Visit D's neighbors: C(BLACK), E
E discovered (time 9)
Visit E's neighbor: C (already BLACK, skip)
E finished (time 10)
D finished (time 11)
S finished (time 12)
2. Continue with F (next WHITE vertex):
F discovered (time 13)
Visit F's neighbors: D(BLACK), E(BLACK), G
G discovered (time 14)
Visit G's neighbor: C (already BLACK, skip)
G finished (time 15)
F finished (time 16)
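The trace above can be checked with a small program. The adjacency lists below are read off the trace itself (for example, S's neighbors are A, B, C, D), so treat them as an assumption about the graph rather than its official definition. The class name TraceExample is hypothetical, and the colors are simplified: a vertex with no discovery time is WHITE, and one with a discovery time is GREY or BLACK.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Compact DFS that reproduces the discovery/finish times from the trace.
class TraceExample {
    // LinkedHashMap keeps insertion order: S, A, B, C, D, E, F, G.
    static final Map<String, List<String>> GRAPH = new LinkedHashMap<>();
    static {
        GRAPH.put("S", List.of("A", "B", "C", "D"));
        GRAPH.put("A", List.of("B", "C"));
        GRAPH.put("B", List.of("S"));
        GRAPH.put("C", List.of("B"));
        GRAPH.put("D", List.of("C", "E"));
        GRAPH.put("E", List.of("C"));
        GRAPH.put("F", List.of("D", "E", "G"));
        GRAPH.put("G", List.of("C"));
    }

    static final Map<String, Integer> discovery = new LinkedHashMap<>();
    static final Map<String, Integer> finish = new LinkedHashMap<>();
    static int time;

    static void run() {
        discovery.clear();
        finish.clear();
        time = 0;
        for (String v : GRAPH.keySet()) {
            if (!discovery.containsKey(v)) visit(v);   // v is still WHITE
        }
    }

    static void visit(String u) {
        discovery.put(u, ++time);                      // u turns GREY
        for (String v : GRAPH.get(u)) {
            if (!discovery.containsKey(v)) visit(v);
        }
        finish.put(u, ++time);                         // u turns BLACK
    }

    public static void main(String[] args) {
        run();
        for (String v : GRAPH.keySet()) {
            System.out.println(v + ": " + discovery.get(v) + "/" + finish.get(v));
        }
    }
}
```

Running main prints one discovery/finish pair per vertex, starting with S: 1/12 and ending with G: 14/15, which matches the times in the trace.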
6. Key Concepts to Remember
DFS Tree/Forest:
The predecessor relationships form a tree (or forest if disconnected)
Each dfsVisit() call made from the main loop on a WHITE vertex starts a new tree
Edge Classification:
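Tree edges: Edges to WHITE vertices; these become the edges of the DFS tree
Back edges: Edges to GREY ancestors (indicate cycles)
Forward edges: Edges to BLACK descendants (discovered after the current vertex)
Cross edges: All other edges (to BLACK vertices that are not descendants, discovered before the current vertex)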
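The edge types above can be assigned while the traversal runs, by looking at the state of the neighbor at the moment each edge is examined. The sketch below assumes a directed graph; the class name EdgeClassifier, the "u->v" key format, and the string labels are illustrative choices, not part of the original code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Labels every edge of a directed graph as tree, back, forward, or cross.
class EdgeClassifier {
    final Map<String, List<String>> graph;
    final Map<String, Integer> discovery = new HashMap<>();  // absent = WHITE
    final Set<String> finished = new HashSet<>();            // BLACK vertices
    final Map<String, String> kinds = new LinkedHashMap<>(); // "u->v" -> label
    int time = 0;

    EdgeClassifier(Map<String, List<String>> graph) {
        this.graph = graph;
    }

    Map<String, String> classify(List<String> order) {
        for (String v : order) {
            if (!discovery.containsKey(v)) visit(v);
        }
        return kinds;
    }

    private void visit(String u) {
        discovery.put(u, ++time);                             // u is GREY from here on
        for (String v : graph.getOrDefault(u, List.of())) {
            String kind;
            if (!discovery.containsKey(v)) {
                kind = "tree";                                // v is WHITE
            } else if (!finished.contains(v)) {
                kind = "back";                                // v is GREY: an ancestor of u
            } else if (discovery.get(u) < discovery.get(v)) {
                kind = "forward";                             // v is BLACK, discovered after u
            } else {
                kind = "cross";                               // v is BLACK, discovered before u
            }
            kinds.put(u + "->" + v, kind);
            if (kind.equals("tree")) visit(v);
        }
        finished.add(u);                                      // u is BLACK
    }
}
```

Any edge labeled "back" means the directed graph contains a cycle.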
Applications:
Finding connected components
Cycle detection
Topological sorting
Finding strongly connected components
Maze solving
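As one concrete application, topological sorting falls out of the finish order: listing vertices by decreasing finish time gives a valid ordering for a directed acyclic graph. The sketch below is a minimal illustration under that assumption. The class name TopoSort and the choice to throw an exception on a cycle are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Topological sort of a directed graph via DFS finish order.
class TopoSort {
    private static final int GREY = 1, BLACK = 2;   // absent from the map = WHITE

    static List<String> sort(Map<String, List<String>> graph) {
        Map<String, Integer> color = new HashMap<>();
        Deque<String> result = new ArrayDeque<>();
        for (String v : graph.keySet()) {
            if (!color.containsKey(v)) visit(v, graph, color, result);
        }
        return new ArrayList<>(result);
    }

    private static void visit(String u, Map<String, List<String>> graph,
                              Map<String, Integer> color, Deque<String> result) {
        color.put(u, GREY);
        for (String v : graph.getOrDefault(u, List.of())) {
            Integer c = color.get(v);
            if (c == null) {
                visit(v, graph, color, result);      // WHITE: explore first
            } else if (c == GREY) {
                // A GREY neighbor is a back edge, so no valid ordering exists.
                throw new IllegalStateException("cycle through " + v);
            }
        }
        color.put(u, BLACK);
        result.addFirst(u);   // vertices that finish later end up earlier in the list
    }
}
```

For example, with edges shirt -> tie, tie -> jacket, and belt -> jacket, every returned ordering places shirt before tie and both tie and belt before jacket.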
7. Common Mistakes to Avoid
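Forgetting to increment time at both discovery and finish (each vertex uses two ticks of the counter)
Mixing up the colors, for example marking a vertex BLACK before all of its neighbors have been explored
Confusing discovery and finish times (discovery is recorded on entry to a vertex, finish on exit)
Forgetting to set the predecessor before the recursive call, or to continue with the remaining neighbors after it returns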
8. Practice Tips
1. Trace by hand: Follow the algorithm step-by-step on small graphs
2. Draw the DFS tree: Visualize the predecessor relationships
3. Track times carefully: Make sure you understand when time increments
4. Understand the colors: Know what each color represents at each step
This implementation closely follows the standard DFS pseudocode and provides a solid foundation for
understanding how DFS works in practice.