In this post, we will take a look at what are graphs in the context of algorithms and how we apply them to solve such problems?
Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. I believe she stumbled across this while preparing for data structures and algorithms.
Problem Statement
Typically, the puzzle shared is a flavor of what's given below:
Quote:
Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.
Explanation
Assuming all these 4 letter words are there in the dictionary provided, it takes a minimum of four transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN
The intent here is to know about Graph algorithm
. So, what are graphs in the context of algorithms and how do we apply them to solve such problems?
Graph Data Structure
Graphs are flow structure that represent entities connection with each other. Visually, they are represented with the help of a Node
(Vertex) & an Edge
(Connector).
Quote:
A tree
is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.
The most common way to represent a graph is by using an Adjacency matrix
. In it, Element A[i][j]
is 1
if there is an edge from node i
to node j
or else, it is 0
. For example, adjacency matrix of the above unidirected graph is:
| 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0
Another common way is via Adjacency list
(List format of the data instead of a matrix).
Related Algorithms
Graphs are applied in search algorithms
. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:
Breadth First Search (BFS)
Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.
Quote:
For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.
Depth First Search (DFS)
Given a graph G and a starting node s, search proceeds by exploring edges in the graph to finding all the nodes in G traversed from s through its edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.
Quote:
For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.
Thus, with DFS
approach, we can have multiple deduced trees.
Quote:
Knight’s tour is a classic example that leverages Depth First Search algorithm.
Shortest Path First OR Dijkstra’s Algorithm (SPF)
Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.
Quote:
Many real world example fits in here, e.g. what would be shortest path from home to office.
With BFS
(a simple queue), we visit one node at a time whereas in SPF
(a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are O(log n)
compared to operations on a regular queue which is O(1)
.
Code
I will be using a breadth first graph algorithm here based on the problem need:
import collections
from collections import deque
class Solution(object):
def ladderLength(self, beginWord,
endWord, wordList)
queue = deque()
queue.append((beginWord, [beginWord]))
while queue:
print('Current queue:',queue)
node, path = queue.popleft()
print('Current transformation count:',
len(path))
for next in self.next_nodes(node,
wordList) - set(path):
if next == endWord:
print('found endword at path:',
path)
return len(path)
else:
queue.append((next,
path + [next]))
return 0
def next_nodes(self, word, word_list):
possiblenodes = set()
wl_word_length = len(word)
for wl_word in word_list:
mismatch_count = 0
for i in range(wl_word_length):
if wl_word[i] != word[i]:
mismatch_count += 1
if mismatch_count == 1:
possiblenodes.add(wl_word)
print('possible next nodes:',possiblenodes)
return possiblenodes
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
"MAIN","RUIN"]
print('Transformations needed: ',
Solution().ladderLength(beginWord,
endWord, wordList))
Quote:
Used deque
(doubly ended queue) of Python
deque
helps with quicker append and pop operations from both the ends. It has O(1)
time complexity for append and pop operations. In comparison, list provides it in O(n)
time complexity.
A quick look at the code workflow to validate if all nodes at a particular distance were traversed first and then moved to the next level:
Current queue: deque([('SAIL', ['SAIL'])])
Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']),
('MAIL', ['SAIL', 'MAIL'])])
Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']),
('MAIL', ['SAIL', 'BAIL',
'MAIL'])])
Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL',
'MAIL']),
('BAIL', ['SAIL', 'MAIL',
'BAIL']),
('MAIN', ['SAIL', 'MAIL',
'MAIN'])])
Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL',
'BAIL']),
('MAIN', ['SAIL', 'MAIL',
'MAIN']),
('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN'])])
Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL',
'MAIN']),
('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN'])])
Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN']),
('RAIN', ['SAIL', 'MAIL',
'MAIN', 'RAIN'])])
Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL',
'MAIN', 'RAIN']),
('RAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN', 'RAIN'])])
Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN',
'RAIN']
Transformations needed: 4
Overall path: ['SAIL', 'MAIL', 'MAIN',
'RAIN', 'RUIN']
Complexity
For the above code that I used to find the shortest path for transformation:
Time
In next_nodes
, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did M×N
iterations, where M
is the length of each word and N
is the total number of words in the input word list. Further, to form an intermediate word, it takes O(M)
time. This adds up to O(M2×N).
In ladderLength
, BFS can go to each of the N
words and for each word, we need to examine M
possible intermediate words. This adds up to O(M2×N).
Overall, it adds up to O2(M2×N) which would be called O(M2×N).
Space
In next_nodes
, each word in the word list would have M intermediate combinations. For every word, we need a space of M2 to save all the transformations corresponding to it. Thus, it would need a total space of O(M2×N).
In ladderLength
, BFS queue would need a space of O(M×N)
.
Overall, it adds up to O(M2×N)
+ O(M×N) which would be called O(M2×N)
Wrap Up
It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.
Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.
I had a read of the following reference and it has much more details if needed.
Keep problem solving!.