Introduction
This article is about basic analysis of code in a CodeDOM, such as calculating metrics on it
and searching it. Sources are included. This is Part 8 of a series on CodeDOMs. In the previous
parts, I’ve discussed CodeDOMs, provided a C# CodeDOM, a WPF UI and IDE, a C# parser, and
solution/project CodeDOM classes, covered loading type metadata, and handled resolving
symbolic references in a CodeDOM.
The Visitor Pattern
Calculating metrics on a CodeDOM tree (or a sub-tree) and searching it both require traversing
the objects in the tree. This requires specialized knowledge of each type of object, because
different types have different fields representing child objects – a BinaryOperator has Left
and Right children, an If statement has a child Expression, etc.
To avoid writing type-specific code to visit child objects more than once, the Visitor design
pattern has been employed. This was done by creating an IVisitor interface and adding an
abstract void Accept(IVisitor visitor) method to CodeObject, which is overridden by all
subclasses as necessary (all of them that define child objects). To visit the nodes in a tree
for a specific purpose, a class is declared that implements IVisitor, and an instance of it is
passed to the Accept() method of the top-most object of the desired tree or sub-tree.
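The arrangement described above can be sketched in a few dozen lines. This is only an illustration under stated assumptions: the classes below are pared-down stand-ins that reuse the article's names (IVisitor, CodeObject, BinaryOperator, If), while Literal's shape, LiteralCounter, and VisitorDemo are hypothetical and are not the actual Nova classes.

```csharp
using System;

// Hypothetical, pared-down stand-ins for the article's types; the real
// IVisitor has about 55 methods and the real classes carry far more state.
public interface IVisitor
{
    void Visit(BinaryOperator binaryOperator);
    void Visit(If ifStatement);
    void Visit(Literal literal);
}

public abstract class CodeObject
{
    // Each subclass reports itself to the visitor, then forwards to its children.
    public abstract void Accept(IVisitor visitor);
}

public class Literal : CodeObject
{
    public string Text;
    public Literal(string text) { Text = text; }
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class BinaryOperator : CodeObject
{
    public CodeObject Left, Right;
    public BinaryOperator(CodeObject left, CodeObject right) { Left = left; Right = right; }
    public override void Accept(IVisitor visitor)
    {
        visitor.Visit(this);
        if (Left != null) Left.Accept(visitor);
        if (Right != null) Right.Accept(visitor);
    }
}

public class If : CodeObject
{
    public CodeObject Expression;
    public If(CodeObject expression) { Expression = expression; }
    public override void Accept(IVisitor visitor)
    {
        visitor.Visit(this);
        if (Expression != null) Expression.Accept(visitor);
    }
}

// A visitor written for one specific purpose: counting literals.
public class LiteralCounter : IVisitor
{
    public int Count;
    public void Visit(BinaryOperator binaryOperator) { }  // no special logic needed
    public void Visit(If ifStatement) { }                 // no special logic needed
    public void Visit(Literal literal) { Count++; }
}

public static class VisitorDemo
{
    public static int CountLiterals(CodeObject root)
    {
        var counter = new LiteralCounter();
        root.Accept(counter);  // traversal knowledge lives in Accept(), not here
        return counter.Count;
    }

    public static void Main()
    {
        CodeObject tree = new If(new BinaryOperator(new Literal("a"), new Literal("1")));
        Console.WriteLine(CountLiterals(tree));
    }
}
```

The point of the design is that the knowledge of which fields hold children is written once, in each Accept() override, so a new analysis only needs a new IVisitor implementation.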
The IVisitor interface defines about 55 methods for the more important types in the CodeDOM.
Some leaf types are not represented separately (such as concrete operators, references, and
doc comment types) in order to keep the method count closer to 50 than 300. If specific types
must be checked that don’t have their own method, logic that uses is or as to check the
specific type will need to be added to the method for the nearest base type. All methods of
the interface must always be implemented, but types which do not need special visiting logic
can have their methods left blank or just call a default handling routine. Internally
generated (hidden) code objects aren’t visited (such as generated default constructors,
delegate constructors and BeginInvoke/EndInvoke methods, the global extern alias of CodeUnits,
etc.). Hidden symbolic references using the HiddenRef property are optionally visited,
depending upon the value of the bool VisitHiddenRefs property of the interface.
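As a rough sketch of the "check the specific type in the nearest base type's method" idea, assume the visitor interface only has a method for a base Operator type. Every name here (Operator, Add, Subtract, Multiply, AddCounter, TypeCheckDemo) is hypothetical and chosen for illustration; the one-method IVisitor is a stand-in, not the real interface.

```csharp
using System;

// Hypothetical hierarchy: only the base type gets its own visitor method.
public abstract class Operator { }
public class Add : Operator { }
public class Subtract : Operator { }
public class Multiply : Operator { }

// Stand-in for the much larger real interface.
public interface IVisitor
{
    void Visit(Operator op);
}

public class AddCounter : IVisitor
{
    public int Adds;
    public int OtherOperators;

    public void Visit(Operator op)
    {
        // No Visit(Add) exists, so the concrete type is tested here with 'is'.
        if (op is Add)
            Adds++;
        else
            OtherOperators++;
    }
}

public static class TypeCheckDemo
{
    public static AddCounter Count(params Operator[] operators)
    {
        var counter = new AddCounter();
        foreach (Operator op in operators)
            counter.Visit(op);
        return counter;
    }

    public static void Main()
    {
        AddCounter counter = Count(new Add(), new Multiply(), new Add(), new Subtract());
        Console.WriteLine(counter.Adds + " adds, " + counter.OtherOperators + " others");
    }
}
```

The trade-off is a smaller interface at the cost of a type test inside the handler whenever a leaf type matters.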
A new example, VisitTree(), has been added to the Nova.Examples project which demonstrates
how to implement your own visitor using the class MyVisitor.
Calculating Metrics
The most basic metrics on a codebase are just counts of the different types of objects in the
code: for example, projects, files, lines, SLOC (source lines of code – lines with actual code
on them), statements, loops, local variables, literals, etc. More advanced metrics require
some calculations, such as average SLOC per method/class, average conditions per method/class,
average methods per class, average code objects per method/class, etc.
To implement metrics capability, a Metrics class has been added which contains many fields
for counting different types of objects in a CodeDOM, and also some fields for calculated
metrics. A MetricsVisitor class has been created which implements the IVisitor interface and
contains a Metrics instance, which is used to count the various types of objects as a tree is
visited. It is activated by calling CalculateMetrics(CodeObject), passing in the starting
node. This method calls Accept() on the object, which updates all object counts, and then
calls Calculate() on the Metrics instance to update the calculated metrics.
To make this process even easier, a Metrics CalculateMetrics() method has been added to
CodeObject which creates an instance of MetricsVisitor, calls CalculateMetrics() on it, and
returns the resulting Metrics object – thus, metrics can be calculated on any object in a
CodeDOM simply by calling CalculateMetrics() on it.
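The count-then-calculate flow might look roughly like the sketch below. It reuses the names Metrics, MetricsVisitor, CalculateMetrics(), Calculate(), and Accept() from the text, but everything else is an assumption made for illustration: the two-method IVisitor, the ClassDecl type, the SourceLines field, and the single calculated metric AverageSLOCPerMethod are hypothetical simplifications.

```csharp
using System;
using System.Collections.Generic;

public class Metrics
{
    // Raw counts, filled in while the tree is visited.
    public int Types;
    public int Methods;
    public int SLOC;

    // Calculated metric, derived from the counts afterwards (hypothetical name).
    public double AverageSLOCPerMethod;

    public void Calculate()
    {
        AverageSLOCPerMethod = Methods == 0 ? 0.0 : (double)SLOC / Methods;
    }
}

// Stand-in for the much larger real interface.
public interface IVisitor
{
    void Visit(ClassDecl classDecl);
    void Visit(MethodDecl methodDecl);
}

public abstract class CodeObject
{
    public abstract void Accept(IVisitor visitor);

    // Convenience wrapper so metrics can be requested on any node.
    public Metrics CalculateMetrics()
    {
        return new MetricsVisitor().CalculateMetrics(this);
    }
}

public class MethodDecl : CodeObject
{
    public int SourceLines;  // hypothetical: SLOC in the method body
    public MethodDecl(int sourceLines) { SourceLines = sourceLines; }
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class ClassDecl : CodeObject
{
    public readonly List<MethodDecl> Methods = new List<MethodDecl>();
    public override void Accept(IVisitor visitor)
    {
        visitor.Visit(this);
        foreach (MethodDecl method in Methods)
            method.Accept(visitor);
    }
}

public class MetricsVisitor : IVisitor
{
    private readonly Metrics _metrics = new Metrics();

    public Metrics CalculateMetrics(CodeObject start)
    {
        start.Accept(this);    // pass 1: update the raw counts
        _metrics.Calculate();  // pass 2: derive the calculated metrics
        return _metrics;
    }

    public void Visit(ClassDecl classDecl) { _metrics.Types++; }
    public void Visit(MethodDecl methodDecl)
    {
        _metrics.Methods++;
        _metrics.SLOC += methodDecl.SourceLines;
    }
}

public static class MetricsDemo
{
    public static void Main()
    {
        var classDecl = new ClassDecl();
        classDecl.Methods.Add(new MethodDecl(3));
        classDecl.Methods.Add(new MethodDecl(5));
        Metrics metrics = classDecl.CalculateMetrics();
        Console.WriteLine(metrics.Methods + " methods, " + metrics.SLOC + " SLOC");
    }
}
```

Keeping the derived values in a separate Calculate() step means averages are computed once, after all counts are final, rather than being updated on every visit.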
When a Solution, Project, or CodeUnit is loaded, if the LoadOptions.LogMetrics option is
specified, the Load() method will automatically call CalculateMetrics() on the object and log
the total projects (for solutions), files (for solutions/projects), lines, SLOC, types, and
code objects. The fields of the Metrics class also have Description attributes that describe
each metric in detail, and these can be used by a UI to display descriptions to the user (and
Nova Studio now does this). A new example, CalculateMetrics(), has been added to the
Nova.Examples project which demonstrates calculating metrics. The usage is very simple, for
example:
Metrics metrics = solution.CalculateMetrics();
Log.WriteLine(string.Format("Solution '{0}': {1} projects; {2} files; {3} lines; "
    + "{4} SLOC; {5} types; {6} code objects; {7:N2} code objects per SLOC",
    solution.Name, metrics.Projects, metrics.Files, metrics.Lines, metrics.SLOC,
    metrics.Types, metrics.CodeObjects, metrics.CodeObjectsPerSLOC));
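For the Description attributes mentioned above, a UI could read them through reflection. The sketch below assumes the attributes are the standard System.ComponentModel.DescriptionAttribute applied to public instance fields; the two-field Metrics class, its description strings, and the MetricsDescriptions helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

// Hypothetical two-field stand-in; the description strings are invented.
public class Metrics
{
    [Description("Total number of lines, including blank lines and comments")]
    public int Lines;

    [Description("Source lines of code: lines with actual code on them")]
    public int SLOC;
}

public static class MetricsDescriptions
{
    // Maps each public instance field name to the text of its Description attribute.
    public static Dictionary<string, string> Describe(Type type)
    {
        var result = new Dictionary<string, string>();
        foreach (FieldInfo field in type.GetFields(BindingFlags.Public | BindingFlags.Instance))
        {
            var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
                field, typeof(DescriptionAttribute));
            if (attribute != null)
                result[field.Name] = attribute.Description;
        }
        return result;
    }

    public static void Main()
    {
        foreach (KeyValuePair<string, string> pair in Describe(typeof(Metrics)))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

A tooltip provider would then only need the field name to look up the text, so descriptions stay next to the fields they document.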
Searching the CodeDOM
Searching for particular references, types, or text in the code can now be accomplished with
the new classes FindReferences, FindByType, and FindByText, all of which implement the
IVisitor interface. A new class, Result, is also provided that represents a found CodeObject
and its associated CodeUnit. All three of the find classes work on a given scope (a starting
CodeObject), and produce a collection of Result objects. The FindReferences class has various
options, such as including references to members when searching for references to a type,
including derived types, including overrides of virtual members, or including overloads when
searching for a method. The FindByText class has options to be case-sensitive, match whole
words only, use regular expressions, or to match only declarations, references, literals,
comments, or messages.
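One plausible way to combine the case-sensitive, whole-word, and regular-expression options is to compile them all down to a single Regex. This is a sketch, not how FindByText is actually implemented: the TextMatcher helper and its parameter names and order are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextMatcher
{
    // Hypothetical helper: turns the three search options into one Regex.
    public static Regex Build(string text, bool matchCase, bool wholeWord, bool useRegex)
    {
        // Plain-text searches are escaped so characters like '.' or '|' match literally.
        string pattern = useRegex ? text : Regex.Escape(text);

        // Whole-word matching wraps the pattern in word boundaries; the
        // non-capturing group keeps alternations such as a|b intact.
        if (wholeWord)
            pattern = @"\b(?:" + pattern + @")\b";

        RegexOptions options = matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;
        return new Regex(pattern, options);
    }

    public static void Main()
    {
        Regex regex = Build("find1|find2", true, true, true);
        Console.WriteLine(regex.IsMatch("call find2 now"));
        Console.WriteLine(regex.IsMatch("refind2x"));
    }
}
```

Treating the plain-text case as an escaped regular expression keeps a single matching code path for all option combinations.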
New examples (with the same names as the classes) have been added to the Nova.Examples
project which demonstrate how to use these three new classes for searching a CodeDOM. Here
are some example usages:
// Find all references to a method declaration within the solution
FindReferences findMethodReferences = new FindReferences(methodDecl, solution);
findMethodReferences.Find();
Log.WriteLine("Found " + findMethodReferences.Results.Count
    + " references to method declaration in the solution");

// Find all 'if' statements within a single code unit
FindByType findIfs = new FindByType(typeof(If), codeUnit);
findIfs.Find();
Log.WriteLine("Found " + findIfs.Results.Count + " 'if' statements in "
    + codeUnit.Name);

// Find all objects in the solution whose text matches 'find1' or 'find2'
FindByText findText = new FindByText("find1|find2", solution, true, true, true);
findText.Find();
Log.WriteLine("Found " + findText.Results.Count + " objects containing the "
    + "text 'find1' OR 'find2' in the solution");
The FindByType class has been used to implement a helper method on CodeObject named
GetAllChildren<T>(), which returns an IEnumerable<T> and can be used to easily iterate over
all child objects of a particular type. For example:
var methodDecls = from methodDecl in solution.GetAllChildren<MethodDecl>()
                  where methodDecl.GetAllChildren<If>().Count() > 2
                  select methodDecl;
Log.WriteLine(methodDecls.Count() + " MethodDecls have more than 2 'if' statements");
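To show what a query like the one above relies on, here is a self-contained sketch of a GetAllChildren<T>() that yields every descendant of a given type. Note the swap: the article builds its helper on FindByType, whereas this version uses plain recursion over a hypothetical Children list, and the Add() helper and demo tree are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CodeObject
{
    // Hypothetical: a single generic child list instead of type-specific fields.
    public readonly List<CodeObject> Children = new List<CodeObject>();

    public CodeObject Add(params CodeObject[] children)
    {
        Children.AddRange(children);
        return this;
    }

    // Lazily yields every descendant (not just direct children) of type T.
    public IEnumerable<T> GetAllChildren<T>() where T : CodeObject
    {
        foreach (CodeObject child in Children)
        {
            T match = child as T;
            if (match != null)
                yield return match;
            foreach (T nested in child.GetAllChildren<T>())
                yield return nested;
        }
    }
}

public class MethodDecl : CodeObject { }
public class If : CodeObject { }

public static class GetAllChildrenDemo
{
    public static CodeObject BuildTree()
    {
        return new CodeObject().Add(
            new MethodDecl().Add(new If().Add(new If()), new If()),  // 3 'if's, one nested
            new MethodDecl().Add(new If()));                         // 1 'if'
    }

    public static void Main()
    {
        CodeObject root = BuildTree();
        var busyMethods = from methodDecl in root.GetAllChildren<MethodDecl>()
                          where methodDecl.GetAllChildren<If>().Count() > 2
                          select methodDecl;
        Console.WriteLine(busyMethods.Count() + " MethodDecls have more than 2 'if' statements");
    }
}
```

Because the method returns a lazy IEnumerable<T>, it composes directly with LINQ operators such as Count() and where clauses.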
Nova Studio Improvements
Nova Studio now has a Metrics option on the context menu of
the Solution Tree that calculates metrics for the selected solution, project, folder,
or individual file, and displays them in a new Metrics dialog. Tooltips provide detailed descriptions of each
metric. There is also a tool bar icon that calculates
metrics for any file being viewed. The dialog
looks like this:
The context menu of the code window now has Find options
with new dialogs for Find References and Find By Type, and a Find icon on the
tool bar brings up a new Find Text dialog.
These new dialogs allow the scope of the search and various options to
be specified, and the results show in a new Results window at the bottom of
Nova Studio. The dialogs and Results
window look like this:
Using the Attached Source Code
A new Analysis folder has been added with the new classes Metrics, MetricsVisitor,
FindByText, FindByType, FindReferences, and Result. Various analysis-related methods – such
as Accept() – have been added to many of the existing CodeDOM classes, segregated into
regions with a comment of “ANALYSIS”. New examples have been added to the Nova.Examples
project that make use of the new features: FindReferences(), FindByType(), FindByText(),
CalculateMetrics(), and VisitTree(), which makes use of a new MyVisitor class. Also, LINQ
queries have been added that make use of CodeObject.GetAllChildren<T>() to query all objects
of a particular type. Nova Studio has been improved as mentioned in the previous section. As
usual, a separate ZIP file containing binaries is provided so that you can run them without
having to build them first.
Summary
It’s now possible to easily calculate metrics for a CodeDOM,
create custom code to process a CodeDOM using the Visitor pattern, and search
for text/types/references in a CodeDOM.
Nova Studio has added support for doing such things from the UI. Quite a large feature set has been built up
over this series of articles, resulting in a useful C# CodeDOM. Something that would be nice to add would be
static code analysis to detect possible issues with code and suggest
improvements, but I’m not going to tackle that one just yet. In my next article, I’ll take a look at the
Roslyn project, and see how it compares with the usability, functionality, and
performance of my CodeDOM.