Update July 11, 2019: Latest installer added.
Math Editor (the full version) is Free and Open Source. You can get the full source code from https://github.com/kashifimran/math-editor. In this article, I will use a stripped-down version, which we will call Math Editor Mini, for pedagogical purposes.
Introduction
An equation or formula editor is a piece of software that helps us typeset mathematical content. In this article I will try to provide readers with a real-world application of Object Oriented design and programming techniques as we build our equation editor.
The language used is C# and we are going to use WPF as our GUI framework. However, the techniques provided are NOT dependent on the programming language or the GUI platform and can be applied in any other OO language like Java.
Before I go any further, I would like to express my gratitude to the STIX Fonts Project for creating such a great free font for mathematical & scientific typesetting.
Here is a screenshot of Math Editor:
Background
There are numerous resources available on the Internet and in the form of text books that teach the basic principles like objects, inheritance, polymorphism, encapsulation and so on. My focus will not be on the jargon as I assume that readers of this article are already familiar with the fundamentals of the subject and have some experience programming in an object oriented language like C# or Java.
In this article I primarily intend to address the following two categories of audience:
- Programmers who know the OOP basics but are not able to apply the techniques to real-world problems.
- Experienced programmers who already know all the tricks but want to build an equation editor and want a starting point.
Analysis of Object Oriented Design Process
The best way to learn OOP is to apply the techniques in real programs. Only finishing a few examples given in the tutorials and books is not really sufficient as the examples presented there are usually very superficial. The biggest challenge in OO design and programming is identification of objects and inheritance and delegation of responsibilities to different classes. In a typical example found in text books the objects are almost clearly visible and there is hardly any work needed to define their roles and responsibilities. The following figure is a typical example of such a case:
As we can see, the relationship chain given in the above figure is quite natural and there is hardly any effort required on our part to define it. However, we are not always fortunate enough to have that kind of natural inheritance hierarchy in real-world problems. For example, think of GDI+ or WPF. Do the inheritance chains in those platforms really represent a natural underlying system? Did they just pop up from a real existing hierarchy, or do you think some effort was needed to build them the way they are? I am sure the answer is obvious to you!
Sometimes the boundaries are so blurry that we have a hard time figuring out how to devise a convenient OO model. Even the best of the best can find themselves quite perplexed at times (ask the MFC people!). The only good strategy in such cases is to give yourself a bit of time and keep looking for the invisible objects and the links between them. You may even need to experiment with a few different models before you decide which one to pick!
Designing the OO Model for the Equation Editor
It is time to turn to the specific case we have decided to tackle and see if we can find our objects and their responsibilities and relationships. Let's first have a real deep look at a couple of equations typeset using Math Editor:
Now try to answer the following questions:
- Do you see any objects that we can use in our OO model for the equation editor?
- Are the lines, letters and the brackets we see going to be our objects as they stand or will we need some other higher or lower level objects distinct from yet capable of representing these figures and letters?
- Is there any specific relationship among these objects? If yes, is that relationship suitable to be represented in an OO Model?
- Are there any common features that could be put in well defined related classes?
- Will it be possible to create a common framework to handle input and output for different kinds of equations? Or will we need a different strategy for different components?
Our goal is not to identify objects, relationships and roles for only the currently visible equation entities. We want to be forward-looking. We would like to create a robust model that is natural, flexible, extensible, maintainable, easy to understand and not too complex to implement.
Our task list consists of the following:
- Identification of a set of classes that are capable of representing all the currently supported as well as yet-to-be-supported equations.
- Building of a common model that allows us to typeset, serialize and represent equations in a uniform manner.
- Creation of a unified framework for processing input using mouse and keyboard etc.
The task list looks very short. However, if you give it some thought, you will see that it is both demanding and broad. The first task asks us not only to cater for the few equations we can see in the given examples, but also to create basic support for kinds of equations we are not yet considering. The second task demands that we represent and save our model in such a way that it will be relatively easy to convert the equations to some other representation, e.g. MathML or TeX, when need be. The third task requires us to create an input mechanism that meets the basic needs of every kind of equation we will ever support.
Now that we have identified some of the most important tasks and goals, let's once again have a look at the couple of equations we saw above and try to find the answers to the questions we asked. Do you see a pattern? Can you answer some or any of the questions? If yes, you have done a great job. But the chances are you will not discover much in just a few minutes!
Please remember that when the concept is more abstract than concrete, creating a good OO model becomes relatively difficult and is almost always open to debate. There is never a mathematical proof that a particular OO model perfectly fits the given situation. It is more of an art than a science. You may need to start over again and again, and sometimes even just wait for divine inspiration! My only advice is to keep looking for what fits best and to make your decisions after as much thinking as needed.
I will now try to help you find the answers we are looking for. Here is a figure in which I represent a few equations as I see them at a lower level, ready to be implemented in an OO manner:
From the above figure, we can see a pattern emerging. The following are a few interesting facts we can notice:
- Some of the entities are recursive in nature, i.e. they can host a relatively smaller form of themselves as a container. For example, we can see a division appear inside another division and a bracket inside another bracket, with no strict limit to the level of nesting.
- Equations are arranged both vertically and horizontally in a repetitive manner.
- The Unicode text either appears inside some other container or inside the top level container.
Understanding the Code
The entire core functionality resides inside just 6 classes. You only need a basic understanding of these 6 core classes to be able to understand the whole picture. After understanding these classes, you should be able to modify and extend the code according to your needs and wishes. Before I give a brief description of the core classes, let's have a look at the main class hierarchy (the class names in italics represent abstract classes):
1. EquationBase
This is an abstract class. It contains the basic characteristics possessed by every equation we will ever typeset using our equation editor. This class is the ultimate base class of all equations we are going to create. Every other class in the equation model must inherit from this class or one of its descendants.
2. EquationContainer
This class is also abstract. As every equation must, it derives from EquationBase. This class is the base class of every equation that wants to host other equations inside it (hence the name!). This class is actually the backbone of the equation nesting functionality we want implemented.
3. TextEquation
This class allows us to add Unicode text to the document. It is the responsibility of this class to store and draw all textual content. This class is not a container itself (i.e. you cannot nest another EquationBase inside it), so it directly derives from EquationBase.
4. EquationRow
This class is the primary container of all the equations that need to be horizontally arranged. It supports both text equations and other container equations nested inside it. All the container equations inside EquationRow must come between TextEquation instances. This way we are able to allow the user to enter text and more container equations anywhere they need.
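The alternation rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the Math Editor source: the Children list, the InsertContainer method and the PlaceholderContainer class are hypothetical names, and all layout and drawing state is left out.

```csharp
using System;
using System.Collections.Generic;

public abstract class EquationBase { }

// Leaf equation: holds plain Unicode text.
public class TextEquation : EquationBase
{
    public string Text { get; set; } = "";
}

// Hypothetical stand-in for any container equation (bracket, division, ...).
public class PlaceholderContainer : EquationBase { }

public class EquationRow : EquationBase
{
    // Assumed invariant: children alternate Text, Container, Text, ..., Text.
    // A new row therefore starts with a single empty TextEquation.
    public List<EquationBase> Children { get; } =
        new List<EquationBase> { new TextEquation() };

    // Splits the text equation at childIndex at position caret and
    // places the container between the two halves.
    public void InsertContainer(int childIndex, int caret, EquationBase container)
    {
        if (!(Children[childIndex] is TextEquation text))
            throw new ArgumentException("A container can only be inserted into a text equation.");

        var tail = new TextEquation { Text = text.Text.Substring(caret) };
        text.Text = text.Text.Substring(0, caret);
        Children.Insert(childIndex + 1, container);
        Children.Insert(childIndex + 2, tail);
    }
}
```

Because every insertion splits a text equation in two, the row always begins and ends with a TextEquation, so there is always a place for the caret before, between and after the nested containers.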
5. RowContainer
Just like the lines of a paragraph or the paragraphs of a section in a word processor, it must be possible to typeset equations in horizontal lines. RowContainer is the class which supports that kind of functionality. The only equations it ever contains are instances of the class EquationRow. All other equations are then nested inside those EquationRow objects.
6. RootEquation
As the name suggests, this class is the first object to be created by the higher level GUI container. This class then creates a single RowContainer to get the ball rolling. Every subsequent equation is then created inside that single RowContainer object.
Some Points of Interest
The 6 core classes discussed above form the entire backbone of the model. From here we are ready to start implementing more useful equations like brackets, divisions, integrals and so on. However, building those kinds of equations from this point on is almost trivial. All the other container classes derive from EquationContainer and create one or more instances of RowContainer.
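To make the hierarchy concrete, here is a minimal sketch of how the six core classes and one derived container might fit together. Only the six core class names come from the article. The Parent, Children and Rows members, the choice of base class for EquationRow, RowContainer and RootEquation, and the DivisionEquation class with its Numerator and Denominator properties are all assumptions made for illustration; the real source may be organised differently.

```csharp
using System.Collections.Generic;

// Ultimate base class of every equation.
public abstract class EquationBase
{
    public EquationContainer Parent { get; }
    protected EquationBase(EquationContainer parent) { Parent = parent; }
}

// Base class of every equation that hosts other equations.
public abstract class EquationContainer : EquationBase
{
    public List<EquationBase> Children { get; } = new List<EquationBase>();
    protected EquationContainer(EquationContainer parent) : base(parent) { }
}

// Leaf: stores Unicode text, cannot host other equations.
public class TextEquation : EquationBase
{
    public string Text { get; set; } = "";
    public TextEquation(EquationContainer parent) : base(parent) { }
}

// Horizontal arrangement; starts with a single empty TextEquation.
public class EquationRow : EquationContainer
{
    public EquationRow(EquationContainer parent) : base(parent)
    {
        Children.Add(new TextEquation(this));
    }
}

// Stack of rows; contains only EquationRow instances.
public class RowContainer : EquationContainer
{
    public RowContainer(EquationContainer parent) : base(parent)
    {
        Children.Add(new EquationRow(this));
    }
}

// Created by the GUI layer; owns the single top-level RowContainer.
public class RootEquation : EquationContainer
{
    public RowContainer Rows { get; }
    public RootEquation() : base(null)
    {
        Rows = new RowContainer(this);
        Children.Add(Rows);
    }
}

// Hypothetical derived container: a fraction with two nested RowContainers.
public class DivisionEquation : EquationContainer
{
    public RowContainer Numerator { get; }
    public RowContainer Denominator { get; }
    public DivisionEquation(EquationContainer parent) : base(parent)
    {
        Numerator = new RowContainer(this);
        Denominator = new RowContainer(this);
        Children.Add(Numerator);
        Children.Add(Denominator);
    }
}
```

In this sketch the numerator and denominator are themselves RowContainer objects, so each can hold rows that in turn host further divisions or brackets. That is what gives unlimited nesting without any special-case code in the derived class.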
For ease of development, I have placed all the equation classes, including the core classes, inside a folder named equations in the project directory. Apart from the core classes, all the other equations reside in their respective sub-folders.
Implementing Undo/Redo
This is, in fact, one of the most challenging tasks. We could make it very simple by reusing the serialization routines used to save the files, but that would be too inefficient. The most efficient way seems to be to let the major equation classes manage their own state and revert to a previous state when they are told to. With this approach, however, the process becomes so involved that we need to build a whole sub-system for the purpose.
The Undo/Redo sub-system is managed by the UndoManager class, which is responsible for keeping track of and dispatching undo/redo events. The core equations that will take part in the Undo/Redo stack must implement the following tiny interface:
    public interface ISupportsUndo
    {
        void ProcessUndo(EquationAction action);
    }
EquationAction is the base class of all classes that will be passed around when the ProcessUndo(EquationAction action) method is called by the UndoManager class. Here's how EquationAction is defined:
    public abstract class EquationAction
    {
        // Direction of the operation: tells the executor whether to add or remove.
        public bool UndoFlag { get; set; }
        // The equation that will process this action.
        public ISupportsUndo Executor { get; set; }
        public int FurtherUndoCount { get; set; }

        public EquationAction(ISupportsUndo executor)
        {
            Executor = executor;
            UndoFlag = true;
        }
    }
The UndoFlag of EquationAction allows the callee to figure out whether something should be added to or removed from the equation tree (i.e. it tells the direction of the undo/redo process).
The good news is that, thanks to our robust OO model, despite all the complexity of the undo stack, only a handful of major equations need to manage all undo/redo actions on behalf of every equation, present or future! In fact, only the following three major equation classes implement the ISupportsUndo interface:
- EquationRow
- RowContainer
- TextEquation
Beware though, the actual implementation of the Undo/Redo stack inside these classes is nothing short of a nightmare, as they have to remember the whole equation tree, as it is built and modified, all the time!
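The dispatch mechanism itself can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual UndoManager: the two-stack design, the AddUndoAction, Undo and Redo method names, the TextAction class and the SimpleText class (a stand-in for TextEquation) are all hypothetical. ISupportsUndo and EquationAction are repeated from above only so that the sketch compiles on its own, and FurtherUndoCount is not used here.

```csharp
using System;
using System.Collections.Generic;

public interface ISupportsUndo
{
    void ProcessUndo(EquationAction action);
}

public abstract class EquationAction
{
    public bool UndoFlag { get; set; }
    public ISupportsUndo Executor { get; set; }
    public int FurtherUndoCount { get; set; } // not used in this sketch
    public EquationAction(ISupportsUndo executor)
    {
        Executor = executor;
        UndoFlag = true;
    }
}

// Hypothetical action: records text that was inserted at a given index.
public class TextAction : EquationAction
{
    public int Index { get; }
    public string Text { get; }
    public TextAction(ISupportsUndo executor, int index, string text) : base(executor)
    {
        Index = index;
        Text = text;
    }
}

// Assumed design: two stacks; UndoFlag tells the executor which way to go.
public class UndoManager
{
    private readonly Stack<EquationAction> undoStack = new Stack<EquationAction>();
    private readonly Stack<EquationAction> redoStack = new Stack<EquationAction>();

    public void AddUndoAction(EquationAction action)
    {
        undoStack.Push(action);
        redoStack.Clear(); // a fresh edit invalidates the redo history
    }

    public void Undo()
    {
        if (undoStack.Count == 0) return;
        EquationAction action = undoStack.Pop();
        action.UndoFlag = true;
        action.Executor.ProcessUndo(action);
        redoStack.Push(action);
    }

    public void Redo()
    {
        if (redoStack.Count == 0) return;
        EquationAction action = redoStack.Pop();
        action.UndoFlag = false;
        action.Executor.ProcessUndo(action);
        undoStack.Push(action);
    }
}

// Hypothetical stand-in for TextEquation: it manages its own state.
public class SimpleText : ISupportsUndo
{
    private readonly UndoManager manager;
    public string Text { get; private set; } = "";
    public SimpleText(UndoManager manager) { this.manager = manager; }

    public void Insert(int index, string text)
    {
        Text = Text.Insert(index, text);
        manager.AddUndoAction(new TextAction(this, index, text));
    }

    public void ProcessUndo(EquationAction action)
    {
        var a = (TextAction)action;
        Text = a.UndoFlag
            ? Text.Remove(a.Index, a.Text.Length) // undo: take the text back out
            : Text.Insert(a.Index, a.Text);       // redo: put it back in
    }
}
```

The manager never inspects the action's contents. It only sets the direction and hands the action back to its Executor, which is why, in a design like this, new equation types would not need to touch the undo sub-system.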
Conclusion
Let's summarize what we tried to convey:
- The Object Oriented Design process is not a strictly defined set of principles or practices. It is more of an art than a science.
- The ability to find common characteristics and behavior in the entities forming a model is the key to good Object Oriented design.
- Abstraction is the cornerstone of a flexible and extensible OO model.
- Defining objects and relationships when the actual model to be represented is rather abstract is more challenging.
Please let me know if you find anything missing or unclear. Your feedback will definitely help me make my work better and more useful.