Contents
Introduction
As software developers, it is essentially our job to try and familiarize ourselves with the tips/tricks that make our development easier. Patterns are one area that can aid in the development process. This article will discuss a Design Pattern called the "Visitor" pattern. Now, some of you may know about the Visitor already, but for those that don't, this article may be of use.
I also want to mention the reason I am choosing to write about the Visitor pattern right now. The reason to that actually lies in the next article that I will be writing, which will have something to do with extending LINQ. You see, I didn't want to embark on a LINQ Extension mission before I had covered what I would consider to be a fundamental building block in the way certain parts of LINQ work under the surface.
To this end, this article will describe both the standard Visitor pattern and a somewhat better, more flexible solution.
Standard Visitor Pattern
For those of you that don't know what the Visitor pattern is all about, this excerpt from Wikipedia does a pretty good job of explaining the pros/cons of the Visitor pattern:
"In object-oriented programming and software engineering, the Visitor Design Pattern is a way of separating an algorithm from an object structure upon which it operates. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures. Thus, using the Visitor pattern helps conformance with the open/closed principle.
In essence, the visitor allows one to add new virtual functions to a family of classes without modifying the classes themselves; instead, one creates a visitor class that implements all of the appropriate specializations of the virtual function. The visitor takes the instance reference as input, and implements the goal through double dispatch.
While powerful, the Visitor pattern does have limitations as compared with conventional virtual functions. It is not possible to create visitors for objects without adding a small callback method inside each class, and the callback method in each of the classes is not inheritable to the level of the new subclass."
-- http://en.wikipedia.org/wiki/Visitor_pattern
So how does that translate into code? Well, a class diagram for a simple implementation looks like this, where the implementation is a number of different cheeses that need to be examined, but one could imagine a better structure that required visiting, such as elements within an SGML document, or parts of an Expression within an Expression body (possibly for an Expression within one of the System.Linq.Expressions Expression
classes... possibly).
The basic idea is that there is some object that implements the Visitor
interface. And there are n-many other objects all of which implement a Visitable
interface which allows the acceptance of a Visitor
type. When a Visitor
type is accepted, a call is made back to Visitor
where the Visit
method within the Visitor
object is passed the current Visitable
object. This mechanism is usually repeated for all items within some structure, such as a List/Tree of Visitable
objects.
The beauty of this pattern is that you can simply visit, and you can be sure the structure of the initial object(s) that are being visited are intact. Thus this pattern is quite well suited to traversing some sort of structure object, such as SGML for example.
Let's examine some code.
Let's start with looking at some objects that implement a common Visitable
interface. It can be seen that these classes (very simple classes, for the sake of this article) all have a public void Accept(Visitor v) { v.Visit(this); }
, so what happens when the Accept()
method is called is that the correct Visit()
method (the one that matched the called signature) within the Visitor
object is then called.
interface Visitable
{
void Accept(Visitor v);
}
sealed class Wensleydale : Visitable
{
public String CheeseName { get { return "This is Wensleydale"; } }
public void Accept(Visitor v) { v.Visit(this); }
}
sealed class Gouda : Visitable
{
public String CheeseName { get { return "This is Gouda"; } }
public void Accept(Visitor v) { v.Visit(this); }
}
sealed class Brie : Visitable
{
public String CheeseName { get { return "This is Brie"; } }
public void Accept(Visitor v) { v.Visit(this); }
}
And this is an implementation of a Visitor
type. In this small example, I have used a simple class, but more commonly, it may be a Form or some sort of crazy DSL creator that implements the Visitor
interface.
class VisitorImplementation : Visitor
{
public void Visit(Wensleydale w) { Console.WriteLine(w.CheeseName); }
public void Visit(Gouda g) { Console.WriteLine(g.CheeseName); }
public void Visit(Brie b) { Console.WriteLine(b.CheeseName); }
}
And here is a small test, to put all this to the test.
class Program
{
static void Main(string[] args)
{
List<Visitable> cheeses = new List<Visitable>()
{
new Wensleydale(),
new Gouda(),
new Brie(),
};
Visitor v = new VisitorImplementation();
cheeses.ForEach(c => c.Accept(v));
Console.ReadLine();
}
}
And here is the output:
Now this all looks pretty cool, so what are the downfalls of this approach? Well, using the standard Visitor
pattern, you need to create a Visitable
class per item you need to Visit, and a small callback code section within the Visitor
implementing object.
This may not sound like much work, but imagine you are working against a huge SGML document with literally thousands of possible tag types, this approach would not be that fun to implement in that case, would it?
So what can we do about it? That is something that I myself asked some time ago. Actually, it was when I was at Uni and we had to write an HTML parser, and we looked into it and found Reflection had the answer. I have decided to resurrect the concept that we used in order to show you another possible approach which would not require the use of so many Visitable
classes and so many callbacks. As I also said, the reason that I am writing this article is that it fits in with another article I will be writing soon.
Reflective Visitor
You have now seen the standard Visitor pattern in all its glory, how about seeing another go at the Visitor pattern, but this time using Reflection?
For this example, I have chosen to create a small Parser/Reflective Visitor that is used to parse and visit a small SGML document, that has limited support. The Tags the SGML will support are SGML_Tag
/H_Tag
/P_Tag
/Pre_Tag
/Title_Tag
/Text_Tag
.
SGML_Tag
is the base class for all other supported tags. SGML_Tag
looks like this:
public class SGML_Tag
{
#region Ctor
public SGML_Tag(String name)
{
this.Name = name;
Children = new List<SGML_Tag>();
}
#endregion
#region Public Properties
public String Name { get; private set; }
public List<SGML_Tag> Children { get; private set; }
public String CodeType
{
get
{
return String.Empty;
}
}
#endregion
#region Public Methods
public void AddChild(SGML_Tag child)
{
this.Children.Add(child);
}
#endregion
#region Overrides
public override string ToString()
{
return String.Format(Name);
}
#endregion
}
Where a typical subclass may look like this:
public class Title_Tag : SGML_Tag
{
#region Ctor
public Title_Tag(String name) : base(name)
{
}
#endregion
#region Public Properties
public String Text { get; set; }
#endregion
}
It can be seen that SGML_Tag
has a list of child elements. This is inline with an SGML structure where we essentially have a tree. The parser will actually create a root SGML_Tag
that will serve as the source of items that need to be visited.
What the Parser is Not
The parser I have written for this task is deliberately dumb, and is by no means a workable parser. The only thing it does is recognize supported tags as outlined above and create a tree of these as the SGML document is parsed. That's it. The reason for this is that this is all that is required in order to illustrate the use of Reflection and the Visitor pattern.
The basic idea is that the input SGML document is parsed into a tree of SGML_Tag
objects that are then visited using Reflection by some other object (the Visitor
in the standard Visitor pattern), where there is always only one root which holds the child SGML_Tag
objects.
Let us have a look at the parser.
public class SimpleGMLParser
{
private SGML_Tag root;
public SimpleGMLParser()
{
root = null;
}
[MethodImpl(MethodImplOptions.Synchronized)]
public SGML_Tag GetParsedTree()
{
return root;
}
[MethodImpl(MethodImplOptions.Synchronized)]
public Boolean Parse()
{
XElement rawHtml = null;
using (StreamReader fileReader = new StreamReader(Program.CurrentFile))
rawHtml = XElement.Parse(fileReader.ReadToEnd());
root = new SGML_Tag(rawHtml.Name.LocalName);
foreach (XElement xmlItem in rawHtml.Elements())
{
DealWithSingleNode(xmlItem, root);
}
return true;
}
private void GetElements(XElement xmlParent, SGML_Tag tagParent)
{
foreach (XElement xmlItem in xmlParent.Descendants())
{
DealWithSingleNode(xmlItem, tagParent);
}
}
private void DealWithSingleNode(XElement xmlItem, SGML_Tag tagParent)
{
if (xmlItem.HasElements)
{
SGML_Tag child = CreateTag(xmlItem.Name.LocalName);
tagParent.Children.Add(child);
GetElements(xmlItem, child);
}
else
{
SGML_Tag child = CreateTag(xmlItem.Name.LocalName);
tagParent.Children.Add(child);
}
}
private SGML_Tag CreateTag(String tag)
{
String methodName = tag.ToLower(); ;
MethodInfo mi;
methodName = methodName.ReplaceAll(new string[] {"\\D"}, "").ToLower();
methodName = methodName.Substring(0, 1).ToUpper()
+ methodName.Substring(1);
methodName = "Make_" + methodName + "_Tag";
mi = (from MethodInfo m in this.GetType().GetMethods(
BindingFlags.Instance | BindingFlags.NonPublic)
where m.Name.Equals(methodName)
select m).SingleOrDefault();
if (mi == null)
return Make_Text_Tag(tag);
else
{
return (SGML_Tag)mi.Invoke(this, new Object[] { tag });
}
}
private Title_Tag Make_Title_Tag(String tag)
{
Title_Tag newTag = new Title_Tag(tag);
return newTag;
}
private H_Tag Make_H_Tag(String tag)
{
int level = Int32.Parse(
tag.ReplaceAll(new string[] { "<","h", ">" }, ""));
H_Tag newTag = new H_Tag(tag, level);
return newTag;
}
private Pre_Tag Make_Pre_Tag(String tag)
{
Pre_Tag newTag = new Pre_Tag(tag);
return newTag;
}
....
....
....
#endregion
}
As I say, it is a very naive implementation of an SGML parser. But that's OK, as that is not the point of this article. All that you need to know is that at the time of the parse process, the root SGML_Tag
property will end up with a tree of parsed SGML_Tag
(s), which are then ready to be visited. To those of you that are interested, it can be seen that the parser also uses Reflection to work out what SGML_Tag
tag(s) to create.
In order to reflectively visit this root SGML_Tag
property and all its associated SGML_Tag
children, we need something that will actually do something with the root SGML_Tag
.
I am using a standard WinForms app, but there is a ViewController
that deals with most of the form interaction. The ViewController
is pretty simple, let's see its code:
public class ViewController
{
private TreeNode root = null;
public ViewController(IView currentView)
{
this.CurrentView = currentView;
this.TreeOfVisitedNodes = this.CurrentView.GetControlByName(
"tvParsedItems") as TreeView;
}
public IView CurrentView { get; private set; }
public TreeView TreeOfVisitedNodes { get; private set; }
public void Run()
{
SimpleParser.SimpleGMLParser parser =
new SimpleParser.SimpleGMLParser();
if (parser.Parse())
{
SGML_Tag rootTag = parser.GetParsedTree();
if (rootTag != null)
{
root = TreeOfVisitedNodes.Nodes.Add("Root Node", rootTag.Name);
TreeOfVisitedNodes.ExpandAll();
this.TraverseVisitableNodes(rootTag, root);
}
}
}
private void TraverseVisitableNodes(SGML_Tag htmlParent, TreeNode treeParent)
{
foreach (SGML_Tag htmlTag in htmlParent.Children)
{
if (htmlTag.Children.Count > 0)
{
TreeNode childNode = Visit(htmlTag, treeParent);
TraverseVisitableNodes(htmlTag, childNode);
}
else
{
Visit(htmlTag, treeParent);
}
}
}
private TreeNode Visit(SGML_Tag tag, TreeNode parent)
{
String methodName = tag.Name.ToLower();
MethodInfo mi;
methodName = methodName.ReplaceAll(
new string[] { "\\D" }, "").ToLower();
methodName = methodName.Substring(0, 1).ToUpper() + methodName.Substring(1);
methodName = "Visit_" + methodName + "_Tag";
mi = (from MethodInfo m in this.GetType().GetMethods(
BindingFlags.Instance | BindingFlags.NonPublic)
where m.Name.Equals(methodName)
select m).SingleOrDefault();
if (mi == null)
return null;
else
{
return (TreeNode)mi.Invoke(this, new Object[] { tag, parent});
}
}
private TreeNode Visit_H_Tag(H_Tag ht, TreeNode parent)
{
TreeNode newParent = parent.Nodes.Add(ht.Name, ht.Name);
TreeOfVisitedNodes.ExpandAll();
Console.WriteLine(String.Format("Visiting H_Tag {0}", ht.ToString()));
return newParent;
}
private TreeNode Visit_Text_Tag(Text_Tag tt, TreeNode parent)
{
TreeNode newParent = parent.Nodes.Add(tt.Name, tt.Name);
TreeOfVisitedNodes.ExpandAll();
Console.WriteLine(String.Format("Visiting Text_Tag {0}", tt.ToString()));
return newParent;
}
....
....
....
}
As you can see, there was no Visitor/Visitable interfaces; instead, we rely on the types themselves to give us a name of a method that should be used to do the Visiting. Now, this approach may not suit everyone, but it is worth having a think to see if this approach would work for you.
Running the app for the following SGML document:
<html>
<p>
<h1>Heading1</h1>
<h2>Heading2</h2>
</p>
<h2>Heading2</h2>
</html>
We end up with the following:
Other Nice Stuff
There is a small extension method that I wrote that I found quite useful for replacing all occurrences of certain values within a string; this is as shown below:
public static class StringExtensionMethods
{
#region String.ReplaceAll(..)
public static String ReplaceAll(this String inputString,
String[] unwanted, String replacement)
{
for (int i = 0; i < unwanted.Length; i++)
{
if (unwanted[i].Equals("\\D"))
{
inputString =StringExtensionMethods.StripNumbers(inputString);
}
else
{
inputString = inputString.Replace(unwanted[i], replacement);
}
}
return inputString;
}
#endregion
#region Private Helper Methods
private static String StripNumbers(string input)
{
Regex regEx = new Regex("[0-9]+");
StringBuilder sb = new StringBuilder();
foreach (char a in input)
{
if (!regEx.IsMatch(a.ToString()))
{
sb.Append(a);
}
}
return sb.ToString();
}
#endregion
}
Conclusion
As you can see, we can simplify the standard Visitor pattern somewhat by using Reflection. Next time, we are going to look at the role the Visitor pattern has in LINQ. In particular, we will examine how the System.Linq.Expressions
namespace makes use of the Visitor pattern.
Anyway, that is pretty much all I wanted to say about this pattern this time, but I hope that if you liked this article, you will want to read more about it and some LINQness in the next article I write.
So What Do You Think?
I would just like to ask, if you liked this article, or think it would be useful, leave a vote and a comment. Many thanks.