Introduction
Doxygen is a system for generating documentation (API specifications, class diagrams, caller and callee graphs, etc.) from source code that contains special comments. For input, many languages are natively supported (C/C++, C#, D, Fortran, IDL, Java, Objective-C, PHP, Python, TCL, VHDL), with others available by extension (Perl, JavaScript, Object Pascal, Visual Basic, MatLab, Pro*C, Assembly, Lua, GLSL Shader, Qt QML, GOB-doc, Prolog, CAPL). The generated output is available in a number of formats; HTML and XML are the two of specific interest to this article.
When developing a piece of software, circular dependencies between classes and other components make the code fragile to modification and, when they occur between binaries, cause build issues. Trawling through the source code to find and fix these issues can be time consuming, so this article analyses some project code with the open source tool DeepEnds (available as a Visual Studio extension and a NuGet package). Among the options available in DeepEnds are reading Doxygen XML and writing a source file with Doxygen comments; both are illustrated in this article.
The specific example used for the rest of the article happens to be C++ code.
Setting Up the Problem
The documentation is generated from a batch file that runs Doxygen to produce XML output, which is then fed into DeepEnds to produce a source file containing comments for a second Doxygen run to process.
rmdir /s /q doxygen\xml
doxygen.exe Doxyxml
DeepEnds.Console.exe doxygen=Dot\arch.cpp doxygen\xml\dummy.xml
rmdir /s /q doxygen\html
doxygen.exe Doxyfile
The two Doxygen runs use different configuration files. The first, Doxyxml, sets:
OUTPUT_DIRECTORY = doxygen
REFERENCES_RELATION = YES
GENERATE_XML = YES
XML_OUTPUT = xml
The DeepEnds run creates the source file Dot\arch.cpp from the XML files in the directory doxygen\xml; the file dummy.xml named on the command line does not actually exist. The run uses the default values of the parameters associated with parsing Doxygen XML. These are fine for C++ but may need to be altered for other languages.
The second Doxygen run then creates HTML output by re-parsing the source code and including the output from DeepEnds that was written to Dot\arch.cpp.
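To give a feel for what the first Doxygen run leaves in doxygen\xml, the following minimal sketch lists the compounds recorded in an index.xml file. It is not how DeepEnds reads the XML, which this article does not describe. The sample document is hand-written in the shape of Doxygen's index file, and the function name list_compounds is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of Doxygen's index.xml. The entries are
# illustrative and were not copied from the project analysed in this article.
SAMPLE_INDEX = """<doxygenindex>
  <compound refid="namespaceFEA_1_1FileIO" kind="namespace">
    <name>FEA::FileIO</name>
  </compound>
  <compound refid="classFEA_1_1FileIO_1_1Common_1_1FileReader" kind="class">
    <name>FEA::FileIO::Common::FileReader</name>
  </compound>
  <compound refid="classFEA_1_1FileIO_1_1Vtk_1_1ReadVtk" kind="class">
    <name>FEA::FileIO::Vtk::ReadVtk</name>
  </compound>
</doxygenindex>
"""


def list_compounds(index_xml, kind):
    """Return the dotted names of all compounds of the given kind."""
    root = ET.fromstring(index_xml)
    names = []
    for compound in root.findall("compound"):
        if compound.get("kind") == kind:
            # Doxygen writes C++ scopes with '::'; the report tables use '.'.
            names.append(compound.findtext("name").replace("::", "."))
    return names


classes = list_compounds(SAMPLE_INDEX, "class")
namespaces = list_compounds(SAMPLE_INDEX, "namespace")
print(classes)
print(namespaces)
```

For a real run, the same function could be given the text of doxygen\xml\index.xml instead of the sample string.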
A Page of the Doxygen HTML Report
The page starts with a graph, generated using Dot, showing the dependencies between the namespaces and classes (though there are no classes displayed here).
Then comes a table with the main statistics calculated from the graph and its subgraphs. The first column contains the name of the namespace or class, and the second whether there is a cycle. The next nine columns have formulae based on the number of edges (E), parts (P) and nodes (N); these are discussed in Why Favour the Cyclomatic Number? Specifically, for each of the three formulae (E+P-N)/N, E+P-N and N, they give the value at that level in the tree followed by the maximum and the sum over the tree. The next two columns are the count of externals, which corresponds to the dependencies that form the edges, and its maximum value further down the tree. Then the maximum and the sum of the source lines of code as the tree is traversed are given, followed by the result of fitting a log-normal distribution as detailed in Counting Lines of Code, followed by the maximum in the tree.
Section | Cycle | (E + P - N) / N Val | (E + P - N) / N Max | (E + P - N) / N Sum | E + P - N Val | E + P - N Max | E + P - N Sum | N Val | N Max | N Sum | Externals Count | Externals Max | SLOC Max | SLOC Sum | Probability of SLOC Lower | Probability of SLOC Exp | Probability of SLOC Upper | Probability of SLOC Max |
FEA.FileIO | | 0.00 | 0.56 | 0.56 | 0 | 5 | 5 | 4 | 9 | 24 | 16 | 12 | 60 | 540 | 10 | 24 | 56 | 28 |
FEA.FileIO.Abaqus | | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2 | 2 | 2 | 10 | 6 | 28 | 44 | | 21 | | 21 |
FEA.FileIO.Common | | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 7 | 7 | 7 | 12 | 6 | 40 | 193 | | 27 | | 27 |
FEA.FileIO.Ideas | | 0.56 | 0.56 | 0.56 | 5 | 5 | 5 | 9 | 9 | 9 | 12 | 12 | 54 | 230 | | 22 | | 22 |
FEA.FileIO.Vtk | | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2 | 2 | 2 | 11 | 9 | 60 | 73 | | 28 | | 28 |
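The Val entries in the FEA.FileIO row can be checked by hand. The sketch below computes E, P and N for the four child namespaces and the three namespace-level edges reported later on the page. It assumes that P counts the connected parts of the graph when edge direction is ignored; the function name graph_statistics is hypothetical and this is not the DeepEnds implementation.

```python
def graph_statistics(nodes, edges):
    """Return (E, P, N, E + P - N) for a directed graph.

    Assumption: P is the number of connected parts when the direction
    of the edges is ignored.
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the representative of the node's part,
        # shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, target in edges:
        # Merge the parts containing the two ends of each edge.
        parent[find(source)] = find(target)

    n = len(nodes)
    e = len(edges)
    p = len({find(node) for node in nodes})
    return e, p, n, e + p - n


nodes = ["Common", "Abaqus", "Ideas", "Vtk"]
edges = [("Abaqus", "Common"), ("Ideas", "Common"), ("Vtk", "Common")]
e, p, n, cyclomatic = graph_statistics(nodes, edges)
print(e, p, n, cyclomatic, cyclomatic / n)
```

With E = 3, P = 1 and N = 4 this gives E+P-N = 0 and (E+P-N)/N = 0, which agrees with the Val entries of 0, 0.00 and 4 in the FEA.FileIO row.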
If there had been any leaf nodes (in this case classes) at this level, then the next table would have been a list of those classes versus the number of lines of code that they contain as counted by Doxygen. Unfortunately, for C++, this only appears to be the size of the class declaration.
The next table lists the 16 classes outside the FEA::FileIO namespace that are used by it (as mentioned in the externals count of the previous table).
External dependencies |
FEA.ElementFactory |
FEA.Elements.ElementDefinition |
FEA.ElementSet.ElementVisitor |
FEA.ElementSet.Mesh |
FEA.ElementSet.SetOfElements |
FEA.Field.Base |
FEA.Field.Elemental |
FEA.Field.ElementalFieldVisitor |
FEA.Field.FieldVisitor |
FEA.Field.Nodal |
FEA.Field.NodesElements |
FEA.Field.Types |
FEA.Field.Varying |
FEA.Set.System |
FEA.Surface |
FEA.Topology.ElementHandler |
This is followed by a table of the classes within the FEA::FileIO namespace that are used by it and thus form the destinations of the directed edges in the graph.
Internal Dependencies |
FEA.FileIO.Common.FileReader |
FEA.FileIO.Common.FileWriter |
Then a sequence of tables expanding on the previous table to show the underlying class dependencies which form the edges of the graph.
FEA.FileIO.Abaqus | → | FEA.FileIO.Common |
FEA.FileIO.Abaqus.ReadAbaqusInp | → | FEA.FileIO.Common.FileReader |
FEA.FileIO.Abaqus.WriteAbaqusInp | → | FEA.FileIO.Common.FileWriter |
FEA.FileIO.Ideas | → | FEA.FileIO.Common |
FEA.FileIO.Ideas.ReadIdeas | → | FEA.FileIO.Common.FileReader |
FEA.FileIO.Ideas.WriteIdeas | → | FEA.FileIO.Common.FileWriter |
FEA.FileIO.Vtk | → | FEA.FileIO.Common |
FEA.FileIO.Vtk.ReadVtk | → | FEA.FileIO.Common.FileReader |
FEA.FileIO.Vtk.WriteVtk | → | FEA.FileIO.Common.FileWriter |
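The relationship between the class-level rows and the namespace-level rows in these tables can be sketched as follows. The class edges are copied from the tables above; the function name roll_up and the prefix-based approach are assumptions made for illustration, not a description of how DeepEnds works internally.

```python
def roll_up(class_edges, parent):
    """Collapse class-level edges onto the direct children of `parent`."""
    prefix = parent + "."
    depth = len(parent.split(".")) + 1

    def child_of(name):
        # Keep one more name component than the parent has.
        return ".".join(name.split(".")[:depth])

    edges = set()
    for source, target in class_edges:
        # Only edges with both ends inside the parent are internal to it.
        if source.startswith(prefix) and target.startswith(prefix):
            a, b = child_of(source), child_of(target)
            if a != b:
                edges.add((a, b))
    return sorted(edges)


class_edges = [
    ("FEA.FileIO.Abaqus.ReadAbaqusInp", "FEA.FileIO.Common.FileReader"),
    ("FEA.FileIO.Abaqus.WriteAbaqusInp", "FEA.FileIO.Common.FileWriter"),
    ("FEA.FileIO.Ideas.ReadIdeas", "FEA.FileIO.Common.FileReader"),
    ("FEA.FileIO.Ideas.WriteIdeas", "FEA.FileIO.Common.FileWriter"),
    ("FEA.FileIO.Vtk.ReadVtk", "FEA.FileIO.Common.FileReader"),
    ("FEA.FileIO.Vtk.WriteVtk", "FEA.FileIO.Common.FileWriter"),
]

namespace_edges = roll_up(class_edges, "FEA.FileIO")
internal_dependencies = sorted({target for _, target in class_edges})
for source, target in namespace_edges:
    print(source, "->", target)
print(internal_dependencies)
```

The six class edges collapse to three namespace edges, each ending at FEA.FileIO.Common, and the distinct destinations are the two classes listed in the Internal Dependencies table.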
Finally (and redundantly), the graph is reported as a structure matrix.
Common | \ | | | |
Abaqus | 1 | \ | | |
Ideas | 1 | | \ | |
Vtk | 1 | | | \ |
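A structure matrix of this kind can be built from an ordered list of nodes and the set of directed edges. In the sketch below a row carries a 1 in the column of each node it depends on and a backslash on the diagonal; the function name structure_matrix and the choice of node order are assumptions for illustration.

```python
def structure_matrix(order, edges):
    """Return one list of cells per node, in the given order.

    A cell is "\\" on the diagonal, "1" where the row node depends on
    the column node, and empty otherwise.
    """
    edge_set = set(edges)
    matrix = []
    for row_name in order:
        cells = []
        for col_name in order:
            if row_name == col_name:
                cells.append("\\")
            elif (row_name, col_name) in edge_set:
                cells.append("1")
            else:
                cells.append("")
        matrix.append(cells)
    return matrix


order = ["Common", "Abaqus", "Ideas", "Vtk"]
edges = [("Abaqus", "Common"), ("Ideas", "Common"), ("Vtk", "Common")]
matrix = structure_matrix(order, edges)
for name, cells in zip(order, matrix):
    print(name, "|", " | ".join(cells), "|")
```

Placing Common first keeps every 1 below the diagonal. With this convention, an entry above the diagonal would mean that a node depends on one placed after it in the ordering, and a cycle cannot be ordered so that all entries fall below the diagonal.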
Discussion
Although the source analysed for this article is C++, the technique is not limited to object-oriented code. There is, however, the issue of constructing a hierarchy, which in the example was formed from the namespaces. An alternative hierarchy may be formed from the folder structure, although this is not currently supported by DeepEnds.
It is possible that the long list of languages supported by Doxygen doesn't include the one of interest; perhaps the problem is not even language based. In such a case, it is possible for the user to generate XML using the Doxygen schema (or, more simply, create the XML elements of interest) and then for DeepEnds to generate the report. Writing a bespoke parser also has the advantage of overcoming any limitations of the parsers within Doxygen, such as the count of the lines of C++ code that was mentioned above.
It was noted in the introduction that Doxygen natively supports C# and that there is an extension for Visual Basic. DeepEnds itself has Roslyn-based parsers for C# and Visual Basic and will decompile .NET assemblies using Mono.Cecil, so it is recommended to use those rather than Doxygen for parsing such source code.
Given that the output from DeepEnds can be sufficiently complicated to cause a Doxygen run to hang, it can be better to produce a standalone HTML report by replacing the argument doxygen=Dot\arch.cpp with an alternative such as report=report.html. This HTML report doesn't have the pictures of the graphs, but the graphs can be viewed in Visual Studio by producing some alternative output using another argument such as graph=graph.dgml.
History
- 2016/11/12: First release