The TraversalStatePattern efficiently marks and resets node processing states during graph traversals, enabling identification of processed entities without additional collections and supporting multithreading capabilities for concurrent iterations.
Introduction
The TraversalStatePattern
can be used to identify processed entities without requiring additional collections and to reset the IsProcessed
flag after entities have been marked as processed during a full or partial traversal, eliminating the need to iterate through all nodes once again to clear the flag. It can be used in graph traversal or for processing other types of entities.
Background
The idea is to include an integer state value within each Node
, along with a global integer state used for comparison with the Node's state to determine whether the node has been processed. The Node
's state value will be set to the global value to mark it as processed. Before each traversal, we increment the global state number, which effectively resets all nodes to an unprocessed state prior to the traversal.
Example Situation
Example of a graph node to which we want to apply a pattern:
public partial class ExampleNode
{
public readonly int Value;
#region Graph hierarchy
private readonly List<ExampleNode> _children = new();
public IReadOnlyList<ExampleNode> Children => _children;
public void AddChild(ExampleNode child)
{
_children.Add(child);
}
#endregion
}
Pattern Code
By using a partial class
, we can relocate all pattern logic to a separate file:
public partial class ExampleNode
{
private static int _globalTraversalState;
private int _nodeTraversalState = -1;
public static void NewTraversal()
{
_globalTraversalState++;
}
public void MarkProcessed()
{
_nodeTraversalState = _globalTraversalState;
}
public bool IsProcessed() => _nodeTraversalState == _globalTraversalState;
}
We can combine the MarkProcessed()
and IsProcessed()
methods into one. It will only return 'True
' the first time you use it:
public bool CheckSetProcessed()
{
if (_nodeTraversalState == _globalTraversalState)
return true;
_nodeTraversalState = _globalTraversalState;
return false;
}
Pattern Usage
Example graph traversal method:
public void GraphTraversal(ExampleNode root, Action<ExampleNode> procedure)
{
ExampleNode.NewTraversal();
var queue = new Queue<ExampleNode>();
queue.Enqueue(root);
root.CheckSetProcessed();
while (queue.TryDequeue(out var node))
{
foreach (var resultChild in node.Children)
{
if (resultChild.CheckSetProcessed())
{
continue;
}
procedure(resultChild);
queue.Enqueue(resultChild);
}
}
}
Multithreading Usage
For multithreading processing, we should modify the code a bit:
public partial class ExampleNode
{
private static int _globalTraversalState;
private int _nodeTraversalState = -1;
public static void NewTraversal()
{
_globalTraversalState++;
}
public bool CheckSetProcessed()
{
var newState = Interlocked.Exchange
(ref _nodeTraversalState, _globalTraversalState);
return newState == _globalTraversalState;
}
}
Here's an example of how to use it in code that runs in multiple threads. I used BackgroundWorkers
, but you can use other methods like Tasks
:
public void GraphTraversal(
ExampleNode root,
Action<ExampleNode> procedure,
int threads,
int nodes)
{
ExampleNode.NewTraversalMultithreaded();
var waitEvent = new CountdownEvent(threads);
var processingBC = new BlockingCollection<ExampleNode> { root };
for (var i = 0; i < threads; i++)
{
var worker = new BackgroundWorker();
worker.DoWork += (_, _) =>
{
foreach (var exampleNode in processingBC.GetConsumingEnumerable())
{
procedure(exampleNode);
if (Interlocked.Decrement(ref nodes) == 0)
processingBC.CompleteAdding();
foreach (var child in exampleNode.Children)
{
if (child.CheckSetProcessedMultithreaded())
{
continue;
}
processingBC.Add(child);
}
}
};
worker.RunWorkerCompleted += (_, _) =>
{
waitEvent.Signal();
worker.Dispose();
};
worker.RunWorkerAsync();
}
waitEvent.Wait();
}
Parallel Multithreaded Usage
(Under Parallel multithreaded means doing multiple concurrent iterations over the same structure, where each iteration runs using multithreading.)
Git repository has a parallel, multithreaded example of how to use it, but the benchmarks show it's only 5-10% faster than using ConcurrentDictionary
. ConcurrentDictionary way.
Parallel multithreaded way without additional collections, with bits manipulation and limitation of 64 parallel operations: ParallelStatePatternGraphTraversal.cs
NodeParallelUtils.cs (provides thread traversal context for tracking flag bits).
History
- 3rd December, 2023: Initial version