
Class Diagram Generator

19 Nov 2014
A project-driven .NET class diagram generator that generates diagrams from a dll or exe.

Introduction

This article targets class diagram generation and the inability of Visual Studio to export individual or all class diagrams as single files. We will look at Assemblies and how to get information like Fields, Properties, Methods and Events back from them. 

Download EXE

Download Source

Download UnmanagedLibraryHelper.zip

Background

While all developers hate documentation and repetitive tasks, they are still part of our work, sometimes daily.

Yes, there are tools out there that ease the pain, like GhostDoc or Sandcastle Help File Builder, but none of them actually generate class diagrams as well. The Code Project also houses a couple of class diagram generators, so what makes this one different? Well... I'd say the ability to create a project with multiple entries pointing to the dlls or exes that you wish to dump to class diagrams. This article describes how I went about analyzing a dll or exe and how the class diagrams are then generated. Another feature of this code is that you can point it to the code directory and have it add remarks to the class headers, which assists with generating a CHM that contains a visual representation of each class.

Using the code

Analyzing a Managed .NET dll or exe.

[The following code is found in CDGenerator.cs]

        /// <summary>
        /// Analyzes the DLL.
        /// </summary>
        /// <param name="dllLocation">The DLL location.</param>
        /// <param name="excludeServices">if set to <c>true</c> [exclude services].</param>
        /// <param name="excludeCLR">if set to <c>true</c> [exclude built-in CLR members].</param>
        public static void AnalyzeDLL(string dllLocation, bool excludeServices = false, bool excludeCLR = true)
        {
            string assemblyNameSpace = "";
            Type currentType = null;
            Type[] typesFound = null;
            ClassList = new List<ClassInfo>();
            CDAssembly = null;

            try
            {
                if (dllLocation != string.Empty && File.Exists(dllLocation))
                {
                    CDAssembly = Assembly.LoadFrom(dllLocation);
                    if (CDAssembly.FullName != "")
                    {
                        assemblyNameSpace = CDAssembly.FullName.Substring(0, CDAssembly.FullName.IndexOf(","));

                        typesFound = CDAssembly.GetTypes();

                        if (typesFound != null)
                        {
                            foreach (var type in typesFound)
                            {
                                if (type.Namespace != null)
                                {
                                    if (type.IsNotPublic)
                                    {
                                        continue;
                                    }
                                    else
                                    {
                                        //Excludes Meta data classes and only generate for the current namespace
                                        if (!type.FullName.Contains("Metadata") && type.Namespace.Contains(assemblyNameSpace))
                                        {
                                            if (excludeServices && type.FullName.Contains("Service."))
                                            {
                                                continue;
                                            }

                                            //Exclude weird naming conventions.. usually generated classes not coded by a developer
                                            if (!type.FullName.Contains("<") && !type.FullName.Contains(">"))
                                            {
                                                currentType = type;
                                                ClassList.Add(new ClassInfo(Path.GetFileName(dllLocation), type, excludeCLR));
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
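            catch (Exception ex)
            {
                // currentType is still null if the failure happened before the first type was accepted.
                Console.WriteLine(string.Format("{0}: {1}", currentType != null ? currentType.Name : dllLocation, ex.Message));
                throw; // rethrow without resetting the stack trace
            }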
        }

Have a look at the AnalyzeDLL method above. You will see a few helper classes used inside.
ClassInfo: Stores information like Class Type, Name and a List of BasicInfo found inside the dll or exe.
BasicInfo: Stores the info type (e.g. Fields, Properties, Constructor, Methods or Events).

The code starts by setting a couple of variables and doing a couple of checks to ensure we actually passed it a valid location to the dll or exe to analyze.

The actual work of loading the dll or exe into memory is done by the following command:

CDAssembly = Assembly.LoadFrom(dllLocation);

I used Assembly.LoadFrom() so that referenced assemblies lying in the same directory as the loaded file can be resolved without errors. Unresolved references are generally the problem when loading a dll or exe with Assembly.Load().

To ensure that the file is actually loaded, a check is done on the assembly's full name, after which the assembly's simple name (the part of the full name before the first comma) is recorded in the assemblyNameSpace variable and treated as the root namespace.

An array of types is extracted from the assembly by using the GetTypes() method on our CDAssembly.

Further checks ensure that types were actually found and that they are publicly accessible.
Types whose full name contains "Metadata" are excluded, and a check is done so that we only look at types whose namespace contains the assembly name.
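
One thing the checks above do not cover: GetTypes() can throw a ReflectionTypeLoadException when some types in the assembly cannot be loaded, for example because a dependency is missing. The following is only a sketch of how that case might be tolerated, not code from the download. The names TypeScanner and GetLoadablePublicTypes are hypothetical, and the filters simply mirror the ones described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class TypeScanner
{
    // Hypothetical helper (not part of CDGenerator.cs): returns the types that
    // could be loaded and that pass the same basic filters as AnalyzeDLL.
    public static List<Type> GetLoadablePublicTypes(Assembly assembly)
    {
        Type[] types;
        try
        {
            types = assembly.GetTypes();
        }
        catch (ReflectionTypeLoadException ex)
        {
            // ex.Types holds null for every type that failed to load.
            types = ex.Types;
        }

        return types
            .Where(t => t != null && t.Namespace != null)
            .Where(t => !t.IsNotPublic)
            // Skip compiler-generated types such as closures and iterators.
            .Where(t => !t.FullName.Contains("<") && !t.FullName.Contains(">"))
            .ToList();
    }
}
```

With a helper like this, a single unloadable type no longer aborts the whole analysis. The types that did load are still returned.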

If all these checks pass the current type is added to the ClassList where a new ClassInfo item is created.

ClassList.Add(new ClassInfo(Path.GetFileName(dllLocation), currentType, excludeCLR));

By looking closer at the ClassInfo constructor you will find that a check is done on the type of class (e.g. Class, Interface, AbstractClass, Enum or Struct).
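
The ClassInfo constructor itself is not reproduced in this article. As a rough sketch of how such a check could be written with standard reflection properties, consider the following. The ClassKind enum and the ClassKindDetector class are hypothetical names chosen for illustration, so the real constructor may differ.

```csharp
using System;

// Hypothetical enum mirroring the class types listed above.
public enum ClassKind { Class, Interface, AbstractClass, Enum, Struct }

public static class ClassKindDetector
{
    public static ClassKind Detect(Type type)
    {
        if (type.IsInterface) return ClassKind.Interface;
        // Enums are value types too, so test IsEnum before IsValueType.
        if (type.IsEnum) return ClassKind.Enum;
        if (type.IsValueType) return ClassKind.Struct;
        // Static classes are compiled as abstract AND sealed; treat them as plain classes.
        if (type.IsAbstract && !type.IsSealed) return ClassKind.AbstractClass;
        return ClassKind.Class;
    }
}
```

The order of the checks matters: interfaces also report IsAbstract, and enums also report IsValueType, so the more specific tests come first.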

Once we know what type the class is the GetMoreInfo() method is called. This method finds all the Properties, Fields, Constructors, Methods and Events. These are all found in the same fashion.

Let's have a look at how the Methods are found:

MethodInfo[] methodInfos = referenceType.GetMethods(BindingFlags.Public | BindingFlags.Static | BindingFlags.Instance | BindingFlags.CreateInstance);

The array is built by calling GetMethods() on the type being inspected. The same holds for fields (GetFields()), properties (GetProperties()) and so on. The binding flags Public, Static, Instance and CreateInstance are combined.

if (methodInfos != null && methodInfos.Length > 0)
{
    foreach (MethodInfo methodInfo in methodInfos)
    {
        //Only add custom methods. Don't show built in DotNet methods
        if (excludeCLR && methodInfo.Module.ScopeName != ReferenceFileName)
        {
            continue;
        }
        else
        {
            ClassInformation.Add(new BasicInfo(BasicInfo.BasicInfoType.Methods, methodInfo.Name, methodInfo.ReturnType));
        }
    }
}

Once the array has been built, a check is done for null and to confirm that the array actually contains records.
When the check passes we loop through all the methods found. If built-in .NET methods must be excluded (excludeCLR), any method whose module is not the file that was passed in is skipped. Examples of built-in .NET methods are ToString() and GetType().
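
As a possible alternative to comparing Module.ScopeName, BindingFlags.DeclaredOnly asks reflection for only the members declared on the type itself, which leaves out inherited methods such as ToString() and GetType() unless the type overrides them. The sketch below is an assumption about how that could look, not the approach used in the download. MethodLister and Sample are hypothetical names, and the IsSpecialName filter is an extra that hides property accessors such as get_Value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class MethodLister
{
    // Hypothetical helper: names of the public methods declared directly on the type.
    public static List<string> GetDeclaredMethodNames(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Static |
                                   BindingFlags.Instance | BindingFlags.DeclaredOnly;

        return type.GetMethods(flags)
                   // Property and event accessors are flagged as special names.
                   .Where(m => !m.IsSpecialName)
                   .Select(m => m.Name)
                   .Distinct()
                   .ToList();
    }
}

// Small sample type used only to demonstrate the helper.
public class Sample
{
    public int Value { get; set; }
    public void Run() { }
    public static int Add(int a, int b) { return a + b; }
}
```

Calling MethodLister.GetDeclaredMethodNames(typeof(Sample)) yields only Run and Add. The order in which reflection returns them is not guaranteed.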

Generating a Class Diagram


public static void GeneratesClassDiagram(ClassInfo classInfo, string outputPath, bool includeProperties = true, bool includeMethods = true, bool includeEvents = true, bool includeFields = true, bool includeConstructor = true)
{ ... }

Now that we have a class (ClassInfo) with lists of all its constructors, fields, properties, methods and events, we can actually generate a class diagram image.
This is all done by the GeneratesClassDiagram() method contained in the CDGenerator class.

A helper method and a helper class are used inside GeneratesClassDiagram.
CalculateSize: Calculates the image size for the new class diagram based on how many constructors, fields, properties, methods and events are present.

RoundedRectangle: This class creates a graphics path for GDI+ to draw. It aids with calculating the rounded corners based on the given width, height and diameter.
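
The RoundedRectangle class is part of the download and is not listed here. To illustrate the general technique, a rounded rectangle path can be assembled from four quarter-circle arcs, as in the sketch below. This is an approximation under assumptions: it requires System.Drawing (GDI+) to be available, and the class name RoundedRectangleSketch and its Create method are hypothetical.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

public static class RoundedRectangleSketch
{
    // Hypothetical helper: builds a closed path with rounded corners.
    public static GraphicsPath Create(float x, float y, float width, float height, float diameter)
    {
        var path = new GraphicsPath();

        if (diameter <= 0)
        {
            // No rounding requested: fall back to a plain rectangle.
            path.AddRectangle(new RectangleF(x, y, width, height));
            return path;
        }

        // The corner diameter cannot exceed the shorter side.
        float d = Math.Min(diameter, Math.Min(width, height));

        // Angles are measured clockwise from the positive x-axis.
        path.AddArc(x, y, d, d, 180, 90);                         // top-left
        path.AddArc(x + width - d, y, d, d, 270, 90);             // top-right
        path.AddArc(x + width - d, y + height - d, d, d, 0, 90);  // bottom-right
        path.AddArc(x, y + height - d, d, d, 90, 90);             // bottom-left
        path.CloseFigure();                                       // joins the last arc to the first

        return path;
    }
}
```

GDI+ connects consecutive arcs in the same figure with straight lines, so the four arcs plus CloseFigure() are enough to outline the whole box. The resulting path can be passed to Graphics.FillPath or Graphics.DrawPath.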

Graphics are drawn in 'Layers' to represent the class box and its shadow along with a gradient title containing the class name and type. Based on configuration settings each class type can have a different color.

AntiAliasing is applied to enhance text readability.

//Enable AntiAliasing
g.TextRenderingHint = TextRenderingHint.AntiAlias;

Finally, when all the 'Layers' are drawn, a size comparison is done to avoid saving empty class diagrams.

Remarking the Code

Remarking the code is done by calling the RemarkAllImages() method.
It takes the code path and the image path as parameters and then proceeds to "remark" every class in the code that has a resulting class diagram image.

/// <summary>
        /// Remarks all images.
        /// </summary>
        /// <param name="codePath">The code path.</param>
        /// <param name="imagesPath">The images path.</param>
        public static void RemarkAllImages(string codePath, string imagesPath)
        {
            string currentClassName = "";
            string remarkTemplate = Properties.Settings.Default.RemarkTemplate;
            string searchString = "";
            string currentClassRemark = "";

            int startIndex = 0;
            StreamReader sr;
            StreamWriter sw;
            FileInfo currentCodeFile;
            string currentCodeFileText = "";
            string outCodeFileText = "";
            if (Directory.Exists(imagesPath))
            {
                DirectoryInfo codeDirInfo = new DirectoryInfo(codePath);

                FileUtils.GetAllFilesInDir(codeDirInfo, "*.cs");

                foreach (string fileName in Directory.GetFiles(imagesPath, "*.png"))
                {
                    startIndex = 0;
                    try
                    {
                        currentClassName = Path.GetFileName(fileName).Replace(".png", "");
                        currentCodeFile = FileUtils.files.Where(f => f.Name == string.Format("{0}.cs", currentClassName)).FirstOrDefault();

                        if (currentCodeFile != null)
                        {
                            using (sr = new StreamReader(currentCodeFile.FullName))
                            {
                                currentCodeFileText = sr.ReadToEnd();
                                sr.Close();

                                //Finding the class description logic goes here. In essence it results in something like this:
                                //searchString = string.Format("public {0}", currentClassName);
                                

                                startIndex = currentCodeFileText.IndexOf(searchString);

                                //Add the remark
                                currentClassRemark = string.Format("{0}{1}\t", string.Format(remarkTemplate, currentClassName), Environment.NewLine);

                                if (!currentCodeFileText.Contains(currentClassRemark))
                                {
                                    outCodeFileText = currentCodeFileText.Insert(startIndex, currentClassRemark);

                                    using (sw = new StreamWriter(currentCodeFile.FullName, false))
                                    {
                                        sw.WriteLine(outCodeFileText);
                                        sw.Close();
                                    }
                                }

                                if (RemarkDone != null)
                                {
                                    RemarkDone(currentClassName);
                                }
                            }
                        }
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex.Message);
                        ErrorOccurred(ex, null);
                    }
                }
            }
        }

Have a look at the RemarkAllImages method above. 

It makes use of the FileUtils class to find all C# code files:

FileUtils.GetAllFilesInDir(codeDirInfo, "*.cs");

The method then loops through all .png files in the image path and, with the help of LINQ, looks up the code file that matches each image. Code files without a resulting image are never touched.

FileUtils.files.Where(f => f.Name == string.Format("{0}.cs", currentClassName)).FirstOrDefault();

A bit of logic is applied to build up a search string based on the class type.
The search string is then used to find the class header, and the remark built from the remark template is inserted above it.

Once the remark has been inserted, the text is written back to the code file.
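
The search-and-insert step can be sketched in isolation as a pure string operation. The sketch below is illustrative only: RemarkInserter and InsertRemark are hypothetical names, the search string assumes a plain "public class" header, and the remark text is supplied by the caller rather than read from the RemarkTemplate setting. Unlike the listing above, it returns the source unchanged when the header is not found, because IndexOf returns -1 in that case and Insert would throw.

```csharp
using System;

public static class RemarkInserter
{
    // Hypothetical helper: inserts the remark on its own line above the class header.
    public static string InsertRemark(string source, string className, string remark)
    {
        // Assumption: the header looks like "public class <Name>".
        string searchString = string.Format("public class {0}", className);
        int startIndex = source.IndexOf(searchString, StringComparison.Ordinal);

        // Leave the source alone if the header is missing or the remark is already there.
        if (startIndex < 0 || source.Contains(remark))
        {
            return source;
        }

        // Back up to the start of the header line.
        int lineStart = source.LastIndexOf('\n', startIndex) + 1;
        string prefix = source.Substring(lineStart, startIndex - lineStart);

        // Reuse the header's indentation only when the prefix is pure whitespace.
        string indent = prefix.Trim().Length == 0 ? prefix : "";

        return source.Insert(lineStart, indent + remark + Environment.NewLine);
    }
}
```

Running the helper twice on the same text makes no further changes, which matches the Contains check in RemarkAllImages.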

 

Unmanaged DLLs

So this article is all good when it comes to .NET or managed dlls, but what about unmanaged dlls?
Unmanaged dlls can't be reflected like .NET dlls. Well, that's where Windows has this nifty dbghelp.dll that we can leverage for this information.

Have a look at the UnmanagedLibraryHelper. It is not yet implemented in the solution, but it's worth having a look at.

public static List<string> GetUnmanagedDllFunctions(string filePath)
        {
            methodNames = new List<string>();

            hCurrentProcess = Process.GetCurrentProcess().Handle;

            ulong baseOfDll;
            bool status;

            // Initialize sym.
            status = SymInitialize(hCurrentProcess, null, false);

            if (status == false)
            {
                Console.Out.WriteLine("Failed to initialize sym.");
                return methodNames;
            }

            // Load dll.
            baseOfDll = SymLoadModuleEx(hCurrentProcess, IntPtr.Zero, filePath, null, 0, 0, IntPtr.Zero, 0);

            if (baseOfDll == 0)
            {
                Console.Out.WriteLine("Failed to load module.");
                SymCleanup(hCurrentProcess);
                return methodNames;
            }

            // Enumerate symbols. For every symbol the callback method EnumSyms is called.
            // The call returns false on failure, so the result must be negated.
            if (!SymEnumerateSymbols64(hCurrentProcess, baseOfDll, EnumSyms, IntPtr.Zero))
            {
                Console.Out.WriteLine("Failed to enum symbols.");
            }

            // Cleanup.
            SymCleanup(hCurrentProcess);

            return methodNames;
        }

First we initialize the symbol handler by calling SymInitialize and giving it a handle to the current process.

[DllImport("dbghelp.dll", SetLastError = true, CharSet = CharSet.Unicode)]
        [return: MarshalAs(UnmanagedType.Bool)]
        private static extern bool SymInitialize(IntPtr hProcess, string UserSearchPath, [MarshalAs(UnmanagedType.Bool)]bool fInvadeProcess);

Then we load the unmanaged dll by calling SymLoadModuleEx, passing it our handle and the dll's file path.

[DllImport("dbghelp.dll", SetLastError = true, CharSet = CharSet.Unicode)]
        private static extern ulong SymLoadModuleEx(IntPtr hProcess, IntPtr hFile, string ImageName, string ModuleName, long BaseOfDll, int DllSize, IntPtr Data, int Flags);

Once the dll has been loaded we have to enumerate all its functions. This is achieved by calling SymEnumerateSymbols64. The EnumSyms callback is invoked for each symbol and lets us read the function name.

[DllImport("dbghelp.dll", SetLastError = true, CharSet = CharSet.Unicode)]
        [return: MarshalAs(UnmanagedType.Bool)]
        private static extern bool SymEnumerateSymbols64(IntPtr hProcess, ulong BaseOfDll, SymEnumerateSymbolsProc64 EnumSymbolsCallback, IntPtr UserContext);

private static bool EnumSyms(string name, ulong address, uint size, IntPtr context)
        {
            if (!name.Contains("Ordinal"))
                methodNames.Add(string.Format("{0}", name));

            return true;
        }

A bit of cleanup is required after loading the dll and getting its functions. This is achieved by calling SymCleanup and passing it our handle.

// Cleanup.
SymCleanup(hCurrentProcess);

There you have it. It's not as complete a picture of what is going on inside the dll as with managed dlls, but it at least gives us a starting point. From here on one can start identifying methods and properties, and paired with the ClassInfo class this can be used to draw class diagrams.

Points of Interest

We already use this tool quite extensively at our place of work, and it is in the process of being expanded to include EF helper class generation, SQL database commenting from comments in code, and database dictionary generation. Hopefully one day I can upload it here as a tool that can assist an even greater audience.

History

Version 1: Class diagram generator with remark generation.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
