EGen – a scalable code generation and maintenance framework

24 Aug 2008 · CPOL · 12 min read
An article on EGen - a scalable code generation and maintenance framework for C/C++/C#/Java, implemented in Ruby.

Introduction

This article discusses EGen - a code generation framework implemented in Ruby that attempts to go beyond the classic substitution of variables in templates. It allows you to create stand-alone templates or embed marked template blocks into C/C++/C#/Java code. These templates contain Ruby code for simple generation, but are also able to import and process designated portions from other files in the project that otherwise would need to be synchronized and maintained manually. It applies the Don't Repeat Yourself (DRY) principle by keeping its metadata in one place (a lightweight text database) and letting you use it to generate various portions of the project's code. Because the metadata database and the templates are plain text, they can be added to the repository with the rest of the project files, and restored or branched along with them.

The main focus of the framework is on active code generation. The templates become part of the host project, and allow you to automatically refresh the code as the metadata changes and maintain dependencies among code sections. Upon such changes, traditional wizards can only re-create the skeleton code, and you need to manually carry over all the custom modifications from the old code into the new one. For completeness, some passive code generation capability is included, although it is less important now that most modern IDEs include plenty of wizards.

The demo included with the article shows its use in a C++ project using Visual Studio 2005. The framework, however, is dependent only on Ruby and two Ruby gems; therefore, it can be used with other development environments and/or languages that use slash-star (/* */) comments. The included add-in is specific to Visual Studio 2005.

Background

Following various proposed code generation solutions as well as drawing from my experience, I decided to create something that would:

  • be easy to use, ideally having no learning curve
  • have minimal dependencies
  • have the possibility to embed template blocks into ordinary code files, and also have stand-alone templates for entire file generation
  • separate metadata and logic (generation scripts)
  • provide a safe and simple way to manage metadata
  • provide a way to version metadata
  • be interpreted so that changes would take effect immediately

To avoid reinventing the wheel, I wanted to use as many off-the-shelf components as possible, and after careful deliberations, I settled on Ruby as the language for my templates and framework implementation, erb as the template processor, and KirbyBase to store and manipulate the metadata. Using a full-fledged interpreted language is advantageous because it comes with plenty of libraries and facilities that give you access to databases, to the network, to arbitrary files, and allows you to perform arbitrary calculations and then generate code based on them. A Domain Specific Language would be hard pressed to match all that.

How it works

The framework is implemented as a series of Ruby modules and scripts. Some are ready out of the box, and some are automatically created when a project is initialized for code generation. A Visual Studio 2005 add-in is provided with the framework to facilitate the use of scripts directly from the IDE. You can add stand-alone templates (*.template) to your project and have them processed into code files, or you can embed template blocks into your source code. Embedded template blocks appear as slash-star comments to your compiler, and are ignored by it. Only the generated part is compiled and used. Also, you can declare blocks (such as interface definitions) in your source files, delimited by markers, and then import them into other files by means of embedded templates. Destination files that turn out identical to the generated code are not overwritten, to avoid confusing the code versioning system.
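The "do not overwrite identical files" rule can be sketched in a few lines of Ruby. This is only an illustration of the idea, not EGen's actual code; the helper name write_if_changed is hypothetical.

```ruby
require "fileutils"

# Hypothetical helper (not part of EGen's published scripts): write `content`
# to `path` only when it differs from what is already on disk.
# Returns true if the file was written, false if it was left untouched.
def write_if_changed(path, content)
  return false if File.exist?(path) && File.read(path) == content

  FileUtils.mkdir_p(File.dirname(path)) # create the target folder if missing
  File.write(path, content)
  true
end
```

Leaving an unchanged file alone keeps its timestamp intact, so neither the version control system nor an incremental build sees a spurious modification.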

Installation

First, if you don’t have Ruby on your workstation, grab the Ruby One-Click Installer and install it. After that, download the egen_install.zip archive from the top of this article, unzip it in a folder of your choice, and double-click on egen_install.bat. This will open a command window and proceed to install the two prerequisite Ruby gems (Facets and KirbyBase); then, it will launch the MSI installer for the EGen add-in. The add-in installer puts the framework scripts in c:\egen, and adds the following four commands to the Tools menu in Visual Studio 2005:

  • EGen - Process current file
  • EGen - Process all templates
  • EGen - Scan for templates
  • EGen - Prepare project for code generation

It also sets two environment variables: RUBYLIB=c:\egen and RUBYPATH=c:\egen.

Usage

Each project that will use this code generation framework needs to be initialized. You can do this from the IDE by selecting EGen - Prepare project for code generation from the Tools menu. It will use the current project's path and name for initialization. Ensure that the desired project is selected so that the IDE will be able to pass the right info to the script.

The preparation step adds the following to your project folder:

  • folder %project_root%gen to host the result of processing stand-alone templates.
  • folder %project_root%meta to host stand-alone templates (*.template files).
  • folder %project_root%meta/data to hold the project’s database (KirbyBase).
  • folder %project_root%rscript where all the project specific custom scripts go.
  • file %project_root%rscript/%proj_name%_gen.rbw containing a custom generator class created on the fly for the project (here, you can add specific methods that will not be visible from other projects).
  • file %project_root%rscript/egen_all.rbw containing a script for re-generating every suitable file in the project, both stand-alone templates and embedded ones.
  • file %project_root%rscript/egen_xone.rbw containing a script for manual re-generation of single files from the IDE.

Once you have the project prepared, you can add the stand-alone templates (see DEMO_Enums.template in the demo project) in the meta folder and process them by running EGen - Process current file (i.e., the one open in the editor) or EGen - Process all templates from the Tools menu. New template additions to the project can be identified and added to the processing queue by running EGen - Scan for templates. To be identified by the scanner script, the stand-alone templates must contain a destination directive of this form:

ERB_DESTINATION = "relative_destination_file_path_here"

This path has to be relative to the rscript folder.
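As a rough illustration of how a scanner might pick up this directive, the sketch below searches the template text with a regular expression and resolves the result against the rscript folder. The regular expression, the helper name destination_for, and the sample paths are all assumptions made for this example; the real scanner script may work differently.

```ruby
# Assumed pattern for the directive; the real scanner may be stricter or looser.
DESTINATION_RE = /ERB_DESTINATION\s*=\s*"([^"]+)"/

# Returns the destination path resolved against `rscript_dir`,
# or nil when the template text carries no directive.
def destination_for(template_text, rscript_dir)
  match = DESTINATION_RE.match(template_text)
  return nil unless match

  File.expand_path(match[1], rscript_dir)
end

# Hypothetical template content and project location.
template = <<~'TEMPLATE'
  ERB_DESTINATION = "../gen/DemoEnums.h"
  // the rest of the template goes here
TEMPLATE

puts destination_for(template, "/work/gen_demo/rscript")
```

A template without the directive yields nil, which matches the behavior described above: such a file is simply not picked up by the scan.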

Adding template blocks into your code is even simpler. Here is a sample populating a vector with a predefined sequence of integers:

/*BGN::ERB_DEFINITION
  std::vector< int > my_vector;
% [1,3,4,67,43,88,95,65,84,68].each do |val|
  my_vector.push_back(<%= val %>);
% end
END::ERB_DEFINITION*/
  std::vector< int > my_vector;
  my_vector.push_back(1);
  my_vector.push_back(3);
  my_vector.push_back(4);
  my_vector.push_back(67);
  my_vector.push_back(43);
  my_vector.push_back(88);
  my_vector.push_back(95);
  my_vector.push_back(65);
  my_vector.push_back(84);
  my_vector.push_back(68);
/*END::ERB_EXPANSION*/

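A cos() lookup table (shown only partially here for brevity). Judging by the expansion, the _CB_ and _CE_ placeholders are turned into the comment opener /* and closer */, which cannot be written literally here because the definition itself sits inside a slash-star comment: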

/*BGN::ERB_DEFINITION
 std::vector< double > m_COSLookup;
% (0..360).each do |a|
  <%= "_CB_ #{a}_CE_ m_COSLookup.push_back(#{("%0.10f" 
                   % Math::cos(a.to_f/180*Math::PI))})" %>;
% end
END::ERB_DEFINITION*/
 std::vector< double > m_COSLookup;
  /* 0*/ m_COSLookup.push_back(1.0000000000);
  /* 1*/ m_COSLookup.push_back(0.9998476952);
  /* 2*/ m_COSLookup.push_back(0.9993908270);
  ...
  /* 359*/ m_COSLookup.push_back(0.9998476952);
  /* 360*/ m_COSLookup.push_back(1.0000000000);
/*END::ERB_EXPANSION*/

And the filling of a map with file types:

/*BGN::ERB_DEFINITION
% %W{cs vb vj h cpp}.each do |type|
%  aft_txt = "boost::shared_ptr< CAppFileType >(\n" \
%            "      new CAppFileType(\"*.#{type}\", icon#{type.upcase}, #{type}_ID))"
   aft_ptr = <%= aft_txt %>;
   m_AppFileTypes[<%= "#{type}_ID" %>] = aft_ptr;
% end
END::ERB_DEFINITION*/
   aft_ptr = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.cs", iconCS, cs_ID));
   m_AppFileTypes[cs_ID] = aft_ptr;
   aft_ptr = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.vb", iconVB, vb_ID));
   m_AppFileTypes[vb_ID] = aft_ptr;
   aft_ptr = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.vj", iconVJ, vj_ID));
   m_AppFileTypes[vj_ID] = aft_ptr;
   aft_ptr = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.h", iconH, h_ID));
   m_AppFileTypes[h_ID] = aft_ptr;
   aft_ptr = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.cpp", iconCPP, cpp_ID));
   m_AppFileTypes[cpp_ID] = aft_ptr;
/*END::ERB_EXPANSION*/

It may look like a little too much effort for inserting five pairs into the map initially, but when you later decide to add the following: cc cxx c hpp hh hxx js cd resx res css htm html xml xsl xslt xsd, it will be a lot simpler and faster to add them to the definition than to copy, paste, and edit 34 lines of code. Even better, in a real application, the list of extensions can go in a metadata table, and several templates will take care of defining type IDs and icon IDs as well as populating a map similar to the above with the right instances. Once the templates are in place, only the table will need to be modified, and the templates will take care of the rest. Don't forget to re-generate the files after every change to the template definitions. This task can be automated by adding the egen_scan.rbw and egen_all.rbw scripts to the Pre-Build Event phase of your project. Please check the erb and KirbyBase documentation for more information on writing templates and manipulating database tables.
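The "one table, several templates" idea can be sketched with plain erb. In the sketch below, a frozen Ruby array stands in for the KirbyBase table so that the example stays self-contained. The enum name FileTypeId and the helper render are hypothetical, and the code assumes Ruby 2.6 or later for the trim_mode keyword.

```ruby
require "erb"

# Stand-in for the metadata table; in EGen this list would live in KirbyBase.
FILE_TYPES = %w[cs vb vj h cpp].freeze

# Template 1: declares one ID per file type (FileTypeId is a made-up name).
ID_TEMPLATE = <<~'TEMPLATE'
  enum FileTypeId {
  % types.each_with_index do |type, i|
    <%= type %>_ID<%= "," if i < types.size - 1 %>
  % end
  };
TEMPLATE

# Template 2: fills the map from the same list.
MAP_TEMPLATE = <<~'TEMPLATE'
  % types.each do |type|
  m_AppFileTypes[<%= type %>_ID] = boost::shared_ptr< CAppFileType >(
      new CAppFileType("*.<%= type %>", icon<%= type.upcase %>, <%= type %>_ID));
  % end
TEMPLATE

# Lines starting with "%" are treated as Ruby code (erb trim mode "%").
def render(template, types)
  ERB.new(template, trim_mode: "%").result_with_hash(types: types)
end

puts render(ID_TEMPLATE, FILE_TYPES)
puts render(MAP_TEMPLATE, FILE_TYPES)
```

Adding an extension to FILE_TYPES updates both generated fragments on the next run, which is the point of keeping the list in a single place.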

Trying it out

The framework comes with a few examples to illustrate how it works. They are all contained in the source code of the attached demo application, and make use of the included scripts.

A. Generating smart enums for a C++ app

This simple application of the framework takes a table containing your metadata and generates nicely aligned header and implementation files that declare your enums, along with some auxiliary functions that may be helpful during application development. On top of that, you get a comprehensive summary at the beginning of the header file to spare other programmers the effort of digging up what’s available. Every time you need to add a new enum category or get rid of an existing one, you only need to edit the enum table, and the related declarations and functions will be created or removed by processing the templates.

For details, see the following demo files: DemoEnums.template, DemoEnums.h, DemoEnums.cpp and the table enums.tbl under gen_demo\meta\data.

B. In-file expansion of templates

This facility uses the ability of the framework to expand portions of a regular file containing code for a variety of purposes such as defining the initial state of a class, filling collections, initializing look-up tables, writing tests, or inserting SQL in compiled code with minimal effort.

The following markers must be used to delimit the definition section and the expansion section:

/*BGN::ERB_DEFINITION
   the definition to be expanded comes here …
END::ERB_DEFINITION*/
   the result of processing the above definition
   will be written in this section,
   the previous filling of this section 
   is unconditionally overwritten,
   therefore any manual edits will be lost …
/*END::ERB_EXPANSION*/

To process such embedded templates, use the EGen - Process current file or EGen - Process all templates commands from the Tools menu in Visual Studio.
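To make the mechanics concrete, here is a minimal sketch of how such an in-file expansion could be done with Ruby's erb library. It is not EGen's actual implementation: the regular expression, the helper name expand_embedded, and the handling of the _CB_/_CE_ placeholders are assumptions inferred from the examples in this article, and the code assumes Ruby 2.6 or later for the trim_mode keyword.

```ruby
require "erb"

# Assumed marker layout, inferred from the examples in this article.
BLOCK_RE = %r{(/\*\s*BGN::ERB_DEFINITION[ \t]*\n)(.*?)(END::ERB_DEFINITION\*/[ \t]*\n).*?(/\*END::ERB_EXPANSION\*/)}m

# Re-generates the expansion section of every embedded template in `source`.
# The definition is kept untouched; the old expansion is discarded.
def expand_embedded(source)
  source.gsub(BLOCK_RE) do
    open_marker, definition, close_marker, end_marker = $1, $2, $3, $4
    # Lines starting with "%" are Ruby code (erb trim mode "%").
    generated = ERB.new(definition, trim_mode: "%").result
    # Assumption: _CB_/_CE_ stand for comment delimiters in generated code.
    generated = generated.gsub("_CB_", "/*").gsub("_CE_", "*/")
    "#{open_marker}#{definition}#{close_marker}#{generated}#{end_marker}"
  end
end

sample = <<~'CPP'
  /*BGN::ERB_DEFINITION
    std::vector< int > v;
  % [1, 3].each do |val|
    v.push_back(<%= val %>);
  % end
  END::ERB_DEFINITION*/
    stale line from a previous run
  /*END::ERB_EXPANSION*/
CPP

puts expand_embedded(sample)
```

Because the old expansion is thrown away on every run, processing the same file twice gives the same result, and any manual edit made between the markers is lost, as warned above.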

For sample templates, see the following demo files: DEMO_Enums.cpp, Fruit.h, and Vegetable.h.

C. Synchronizing a class declaration with the interfaces it implements

This is just a particular case of expansion of an embedded template where the expanded text is taken from another header file and processed before being used to fill the expansion section.

The interface declaration should be surrounded by markers of this general form:

/*BGN::MARKER_WORD*/
   here goes the interface code…
/*END::MARKER_WORD*/

EGen provides a get_interface function that returns the desired interface block from a given file. The function is declared as follows:

get_interface(file_path, block_marker="INTERFACE", remove_zero=true)

As usual, the file path has to be relative to the rscript directory, and the block_marker has to identify the exact block wanted from the file. The last parameter, remove_zero, triggers the automatic removal of the “ = 0” from the tail of C++ pure virtual function declarations.
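The behavior described above can be approximated as follows. This is a hypothetical re-implementation for illustration only: it takes the header text instead of a file path to stay self-contained, the name extract_interface is made up, and the IEdible methods in the sample are invented. EGen's own get_interface may differ in details.

```ruby
# Hypothetical stand-in for get_interface that works on text rather than a file.
def extract_interface(text, block_marker = "INTERFACE", remove_zero = true)
  marker = Regexp.escape(block_marker)
  pattern = %r{/\*BGN::#{marker}\*/\n(.*?)/\*END::#{marker}\*/}m
  match = pattern.match(text)
  raise ArgumentError, "block #{block_marker} not found" unless match

  body = match[1]
  # Turn pure virtual declarations into plain ones: drop "= 0" before ";".
  body = body.gsub(/\s*=\s*0\s*;/, ";") if remove_zero
  body
end

# Sample header; the method names are invented for this example.
header = <<~CPP
  struct IEdible
  {
  /*BGN::INTERFACE*/
    virtual void Eat() = 0;
    virtual bool IsRipe() const = 0;
  /*END::INTERFACE*/
  };
CPP

puts extract_interface(header)
```

The simple substitution is adequate for interface blocks that contain only method declarations; a block with data members initialized to 0 would need a more careful pattern.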

For details, see the following demo files: IEdible.h, Fruit.h, and Vegetable.h.

D. Generating a C++ class (header and implementation files) and an interface (header only)

This is the simplest use of the framework; it helps speed up the process of creating new entities in the application code, and makes compliance with internal guidelines and standards a snap. Unlike the rest of the framework, which is focused on active code generation, these are passive generators meant to reduce typing. Run the script egen_mk_class.rbw with a namespace and a class name in the command line, and it will generate a header and implementation file for that class according to the templates given. Interface declarations (they are simply structs with only virtual methods and no attributes in C++) can be similarly generated by running the egen_mk_interface.rbw script. To run the scripts, type one of the following commands at the prompt:

ruby -S egen_mk_class.rbw -f target_folder -n 
     namespace_label class_name_1 .. [class_name_n]

ruby -S egen_mk_interface.rbw -f target_folder -n 
     namespace_label interface_name_1 .. [interface_name_n]

where:

  • target_folder – points to where you want the entity created; it can be relative or absolute.
  • namespace_label – the namespace to be used in the class or interface declarations.
  • class_name / interface_name – add as many class or interface names as you need to be created in the target folder and within the specified namespace; if the class name starts with ‘C’, such as CSomething, then the ‘C’ will be removed from the file names, resulting in the creation of the files: Something.h and Something.cpp; the class itself will be named as intended.
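The file naming rule from the last item can be sketched as below. The helper name file_names_for is hypothetical, and requiring an uppercase letter after the leading ‘C’ (so that a name like Circle is left alone) is an assumption of this sketch, not something the article specifies.

```ruby
# Hypothetical helper mirroring the naming rule described above.
# Assumption: the leading "C" is dropped only when an uppercase letter follows.
def file_names_for(class_name)
  base = class_name.sub(/\AC(?=[A-Z])/, "")
  ["#{base}.h", "#{base}.cpp"]
end

p file_names_for("CSomething")
p file_names_for("Widget")
```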

Feel free to copy the templates from the demo application, and tweak them to fit your internal standards.

At the command line, make sure you are in the rscript folder under your project root so that the script will be able to find the templates. For details, see the following demo files: ipp.template, hpp.template, cpp.template, and the scripts egen_mk_interface.rbw, egen_mk_class.rbw, and egen_class_maker.rbw.

E. Duplicating copyright notices and other repetitive text

This is another case of expanding embedded templates that extract text from one source file and insert it in their expansion area. The EGen function used here is get_block declared as:

get_block(file_path, block_name, block_prefix = 'BGN::', block_postfix = 'END::')

The file_path and block_name conform to the same requirement as for get_interface, while the block_prefix and block_postfix are best left alone for consistency. Unlike get_interface, this function retrieves the content between the source markers without changing or processing it in any way.

For details, see the top of the following demo files: IEdible.h, Fruit.h, Fruit.cpp, and Vegetable.h.

Conclusion

The sample uses provided for the framework in this article only scratch the surface of what is possible. Due to its open nature, the framework can be extended and improved to fit the needs of any developer or team, and scaled for projects of any size.

Please note that some things in my implementation are specific to its use in generating C/C++ code. Such is the case with the comment bracketing, which may not work once you move to generating for non C-descendant languages. It should be trivial to change the comment markers to fit your preferred language.

Also, although the framework code and all its prerequisites are platform independent, the implementation I posted with this article has been developed and tested in Windows only. There is, however, no reason to prevent it from working on any other system that supports Ruby. Some minor adjustment accounting for differences between Operating Systems may be necessary.

So far, this framework has done an outstanding job, assisting me throughout many programming tasks, and I hope many of you will find a place for it in your toolkit. I am looking forward to hearing your opinions and suggestions.

Future plans include:

  • Adding a C# demo;
  • Adding a SQL Server demo;
  • Creating a GUI for visual manipulation of the metadata tables (KirbyBase);
  • Creating a GUI for the class and interface skeleton generator scripts;
  • Implementing a flexible way to support other languages that don’t accept C-like comment bracketing, and testing/adjusting it to work on other platforms and IDEs.

Acknowledgements

Many thanks to the authors of the following tools that enabled me to build EGen:

  • The Ruby language (Yukihiro Matsumoto and the Ruby team).
  • The Ruby Facets library (Thomas Sawyer and all Facets contributors).
  • KirbyBase (Jamey Cribbs).
  • The Pragmatic Programmers (Andrew Hunt and David Thomas) for their books on Ruby and programming in general.

History

  • 11th Aug 2008 – Initial release.
  • 14th Aug 2008
    • Improved the script associated with EGen - Process current file to process both *.template files and embedded templates from the IDE;
    • The preparation step can now be run from the Tools menu just like the others by selecting EGen - Prepare;
    • Trimmed the article.
  • 24th Aug 2008
    • Introduced the first release of the EGen - Add-in for Visual Studio 2005 to simplify day to day use and installation.
    • Added a simple installer to eliminate the previously tedious manual steps.
    • Edited the article to reflect the new changes.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).