Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / algorithm

How to Solve Word Ladder Problem?

5.00/5 (4 votes)
22 Nov 2020CPOL5 min read 18.6K  
A look into graph based algorithm
In this post, we will take a look at what are graphs in the context of algorithms and how we apply them to solve such problems?

Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. I believe she stumbled across this while preparing for data structures and algorithms.

graph-header

Problem Statement

Typically, the puzzle shared is a flavor of what's given below:

Quote:

Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.

Explanation

Assuming all these 4 letter words are there in the dictionary provided, it takes a minimum of four transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN

The intent here is to know about Graph algorithm. So, what are graphs in the context of algorithms and how do we apply them to solve such problems?

Graph Data Structure

Graphs are flow structure that represent entities connection with each other. Visually, they are represented with the help of a Node (Vertex) & an Edge (Connector).

graph-general

Quote:

A tree is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.

The most common way to represent a graph is by using an Adjacency matrix. In it, Element A[i][j] is 1 if there is an edge from node i to node j or else, it is 0. For example, adjacency matrix of the above unidirected graph is:

  | 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0

Another common way is via Adjacency list (List format of the data instead of a matrix).

Related Algorithms

Graphs are applied in search algorithms. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:

Breadth First Search (BFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.

Quote:

For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.

Depth First Search (DFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to finding all the nodes in G traversed from s through its edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.

Quote:

For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.

Thus, with DFS approach, we can have multiple deduced trees.

Quote:

Knight’s tour is a classic example that leverages Depth First Search algorithm.

Shortest Path First OR Dijkstra’s Algorithm (SPF)

Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.

Quote:

Many real world example fits in here, e.g. what would be shortest path from home to office.

With BFS (a simple queue), we visit one node at a time whereas in SPF (a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are O(log n) compared to operations on a regular queue which is O(1).

Code

I will be using a breadth first graph algorithm here based on the problem need:

Python
import collections
from collections import deque 

class Solution(object):
    # method that will help find the path
    def ladderLength(self, beginWord, 
                        endWord, wordList):
        """
        :type beginWord: str
        :type endWord: str
        :type wordList: Set[str]
        :returntype: int
        """

        # Queue for BFS
        queue = deque()

        # start by adding begin word
        queue.append((beginWord, [beginWord]))

        while queue:
            # let's keep a watch at active queue
            print('Current queue:',queue)

            # get the current node and 
            # path how it came
            node, path = queue.popleft()

            # let's keep track of path length 
            # traversed so far
            print('Current transformation count:',
                                        len(path))

            # find possible next set of 
            # child nodes, 1 diff
            for next in self.next_nodes(node, 
                            wordList) - set(path):
                # traversing through all child nodes
                # if any of the child matches, 
                # we are good               
                if next == endWord:
                    print('found endword at path:',
                                            path)
                    return len(path)
                else:
                    # keep record of next 
                    # possible paths
                    queue.append((next, 
                                path + [next]))
        return 0

    def next_nodes(self, word, word_list):
        # start with empty collection
        possiblenodes = set()

        # all the words are of fixed length
        wl_word_length = len(word)

        # loop through all the words in 
        # the word list
        for wl_word in word_list:
            mismatch_count = 0

            # find all the words that are 
            # only a letter different from 
            # current word those are the 
            # possible next child nodes
            for i in range(wl_word_length):
                if wl_word[i] != word[i]:
                    mismatch_count += 1
            if mismatch_count == 1:
                # only one alphabet different-yes
                possiblenodes.add(wl_word)
        
        # lets see the set of next possible nodes 
        print('possible next nodes:',possiblenodes)
        return possiblenodes

# Setup
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
                                    "MAIN","RUIN"]

# Call
print('Transformations needed: ',
    Solution().ladderLength(beginWord, 
                            endWord, wordList))

# Transformation expected == 4
# One possible shortes path with 4 transformation:
# SAIL -> MAIL -> MAIN -> RAIN -> RUIN
Quote:

Used deque (doubly ended queue) of Python

deque helps with quicker append and pop operations from both the ends. It has O(1) time complexity for append and pop operations. In comparison, list provides it in O(n) time complexity.

A quick look at the code workflow to validate if all nodes at a particular distance were traversed first and then moved to the next level:

Current queue: deque([('SAIL', ['SAIL'])])

Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']), 
                      ('MAIL', ['SAIL', 'MAIL'])])

Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']), 
                      ('MAIL', ['SAIL', 'BAIL', 
                       'MAIL'])])

Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL', 
                                'MAIL']), 
                      ('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN'])])

Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN']), 
                      ('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN']), 
                      ('RAIN', ['SAIL', 'BAIL', 
                        'MAIL', 'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN', 
                                        'RAIN']

Transformations needed:  4
Overall path: ['SAIL', 'MAIL', 'MAIN', 
                               'RAIN', 'RUIN']

Complexity

For the above code that I used to find the shortest path for transformation:

Time

In next_nodes, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did M×N iterations, where M is the length of each word and N is the total number of words in the input word list. Further, to form an intermediate word, it takes O(M) time. This adds up to O(M2×N).

In ladderLength, BFS can go to each of the N words and for each word, we need to examine M possible intermediate words. This adds up to O(M2×N).

Overall, it adds up to O2(M2×N) which would be called O(M2×N).

Space

In next_nodes, each word in the word list would have M intermediate combinations. For every word, we need a space of M2 to save all the transformations corresponding to it. Thus, it would need a total space of O(M2×N).

In ladderLength, BFS queue would need a space of O(M×N).

Overall, it adds up to O(M2×N) + O(M×N) which would be called O(M2×N)

Wrap Up

It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.

Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.

I had a read of the following reference and it has much more details if needed.

Keep problem solving!.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)