Introduction
This drop-in function provides a simple template-based text generation engine. It allows code to be encapsulated in special tags that can manipulate or insert text. If you have worked with classic ASP, JSP, PHP or T4 templates, you're probably familiar with text template transformations. In classic ASP, code is wrapped in <%...%>, for example:
<body>Current time:<br /><%Response.Write Now()%></body>
In T4, it is wrapped like:
Current time:<#= DateTime.Now #>
And finally in this project, we do it like:
Current time: [[=DateTime.Now ]]
or:
Current time: /*=DateTime.Now :*/
Text transformations, also called dynamic text, allow the use of programming methods to modify text. Common uses might be repeating sections of text, filling in fields on an ASP page, showing a username or account code in an email, or writing 1 to 1000 on a webpage.
This project is similar to Microsoft's T4, but simpler. It is similar because it uses encapsulated C# code to inject text. Microsoft Visual Studio's T4 is more powerful and this project is not meant to be a replacement... at least not inside of Visual Studio! There are a couple of issues when using T4 templates in 3rd party applications. The foremost is the licensing: the T4 DLL is not redistributable. There are funky ways around this, such as installing some MS packages that contain the DLL or installing Visual Studio Express, but that is messy. This project, by contrast, is not even a DLL; it is a simple drop-in function. Another issue is that T4 does not mesh well with many syntax highlighting and code completion projects. To work around this, I created /*: code-here :*/ style commands that live inside comments. This allows the templating system to be used directly in C#/C++ files without causing havoc with design-time error checking.
Here are some transformation examples. The intermediate step just shows what gets executed to create the final output. These examples use [[, ]], [[= and [[~ to encapsulate code.
| Original | Intermediate Step | Final Output |
| 1[[for(int i=0; i<9; i++){]]0[[}]] | Write("1"); for(int i=0; i<9; i++){ Write("0");} | 1000000000 |
| 1[[~for(int i=0; i<9; i++)]]0 | Write("1"); for(int i=0; i<9; i++) Write("0"); | 1000000000 |
| Printed [[=DateTime.Now]] | Write("Printed "); Write(DateTime.Now); | Printed 1/4/15 2:36PM |
| [[=i++]]. A [[=i++]]. B | Write(i++); Write(". A "); Write(i++); Write(". B"); | 1. A 2. B |
Example of running the included source files (the top part is the input to the function and the bottom is the output of the function; it is as simple as that):
Background
This function was built because I needed a simple text template transformation engine for an AMD GCN assembly language project I am working on. In assembly, a pre-compile, macro-like feature is very useful, almost required. Very often, you run into a situation such as having to unroll a loop. Since pure assembly languages do not support unrolling of "for" or "while" loops the way higher-level languages do, the unrolling is often left to the programmer. Working with and maintaining a few ugly lines of template code is much better than writing ten assembly statements fifty times.
For example:
[[ for(int i = 0; i < 4; i++){ ]]
Add R[[=i+20]], R4, [[=i]]; [[}]]
Would be transformed into...
Add R20, R4, 0;
Add R21, R4, 1;
Add R22, R4, 2;
Add R23, R4, 3;
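To make the transformation concrete, the template above behaves roughly like the hand-written generator below. This is only a sketch: the names UnrollSketch and Generate are made up for illustration, and whitespace handling is simplified compared to what the engine would actually emit.

```csharp
using System;
using System.Text;

static class UnrollSketch
{
    // Hypothetical helper: emits one "Add" instruction per loop iteration,
    // the way the [[ for ... ]] template unrolls it.
    public static string Generate(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append("Add R" + (i + 20) + ", R4, " + i + ";\n");
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Prints the four unrolled "Add" lines shown above.
        Console.Write(Generate(4));
    }
}
```

The template version keeps the assembly text readable and leaves only the loop bookkeeping in C#, which is the point of the engine.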
Originally, I was planning on using Microsoft's T4 but after some investigation, I found it required a DLL that was not redistributable. It seemed pretty easy and fun to create a text transformation template engine, so I set forth. The goal was to keep it as simple as possible because I might want to adapt it for different uses in the future, and if there was lots of junk, adjusting it would be difficult.
Using the Code
Just drop in the function or static class and then make a call to the function.
- First, copy the function into your application. Make the function public, private or internal as needed.
- Select the formatting you wish to use by uncommenting the style in the header. There are two formats:
  - [[CODE]], [[=EXPRESSION]], [[~FULL_LINE_OF_CODE, and [[!SKIP_ME]] - easier to read (recommended)
  - /*:CODE:*/, /*=EXPRESSION:*/, //:FULL_LINE_OF_CODE, and //!SKIP_ME - works better with C-like code completion and syntax highlighting
  - Or, create your own
- Build some text (as a string) that needs to be converted. Use the following table for reference:
| Type | "[[..]]" Style | "/*:..:*/" Style | Comments |
| Code Block | [[ code_here ]] | /*: code_here :*/ | normal usage |
| Code Line | [[~code_here | //: code_here | terminates with line break |
| Expression | [[=variable]] | /*= variable:*/ | wraps var in write(...) |
| Comment Block | [[! comments ]] | /*! comments :*/ | excluded in final |
| Comment Line | (none) | //! comments | ends with line break |
| IDE Code Only | (none) | /**/ IDE code /**/ | dummy/filler IDE only code |
- Call Expand(...) in your application. It takes two string parameters. The first string parameter should contain the input text with the encapsulated C# commands. The second string parameter is an out parameter that receives the results. Expand() returns true if successful or false if there are any compiler errors.
Usage: bool success = Expand(myInput, out myOutput);
- Debugging: Compile-time errors are returned in the output parameter (instead of the results). The function lists each error with line and column information. Directly after the errors, the intermediate code is displayed for reference. If you would like to go further and correct runtime errors, grab the contents of the program variable and paste them into a new Visual Studio console project. There is an included Main() so the contents can just be dropped into a file and run.
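Because Expand() reports failure through its return value and puts the error listing in the output parameter, callers may want a small wrapper so that a failed expansion is not mistaken for real output. The sketch below is one possible shape, not part of the original function: the delegate type Expander and the helper ExpandOrThrow are hypothetical names, and a stand-in expander is used so the example runs on its own.

```csharp
using System;

static class ExpandUsageSketch
{
    // Matches the signature of Expand(string, out string) from the article.
    public delegate bool Expander(string input, out string output);

    // Hypothetical convenience wrapper: returns the expanded text, or throws
    // with the error listing that Expand() places in the output parameter.
    public static string ExpandOrThrow(Expander expand, string template)
    {
        string result;
        if (!expand(template, out result))
        {
            throw new InvalidOperationException(result);
        }
        return result;
    }

    static void Main()
    {
        // Stand-in for the real Expand(), so the sketch runs on its own.
        Expander fake = delegate(string input, out string output)
        {
            output = input.Replace("[[=1+1]]", "2");
            return true;
        };
        Console.WriteLine(ExpandOrThrow(fake, "1 + 1 = [[=1+1]]"));
    }
}
```

With the real function in scope, the call would simply be ExpandOrThrow(Expand, myInput).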
How It Works
In a nutshell, the Expand() function takes a string, converts that string into a program (Step 1), compiles the program (Step 2), and finally runs that program to collect its output (Step 3).
Here is the entire code:
public static bool Expand(string input, out string output)
{
    // One match per TEXT-CODE chunk: plain text, opening tag, code, closing tag.
    const string REG = @"(?<txt>.*?)" +
                       @"(?<type>\[\[[!~=]?)" +
                       @"(?<code>.*?)" +
                       @"(\]\]|(?<=\[\[~[^\r\n]*?)\r\n)";
    const string NORM = @"[[", FULL = @"[[~", EXPR = @"[[=", TAIL = @"]]";

    // Step 1: build the generator program (header, body, footer).
    System.Text.StringBuilder prog = new System.Text.StringBuilder();
    prog.AppendLine(
@"using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
class T44Class {
static StringBuilder sb = new StringBuilder();
public static string Execute() {");
    foreach (System.Text.RegularExpressions.Match m in
        System.Text.RegularExpressions.Regex.Matches(input + NORM + TAIL, REG,
        System.Text.RegularExpressions.RegexOptions.Singleline))
    {
        // Plain text becomes a verbatim string literal; embedded quotes are doubled.
        prog.Append(" Write(@\"" + m.Groups["txt"].Value.Replace("\"", "\"\"") + "\");");
        string txt = m.Groups["code"].Value;
        switch (m.Groups["type"].Value)
        {
            case NORM: prog.Append(txt); break;
            case FULL: prog.AppendLine(txt); break;
            case EXPR: prog.Append(" sb.Append(" + txt + ");"); break;
            // "[[!" comments fall through and emit nothing.
        }
    }
    prog.AppendLine(
@" return sb.ToString();}
static void Write<T>(T val) { sb.Append(val);}
static void Format(string format, params object[] args) { sb.AppendFormat(format,args);}
static void WriteLine(string val) { sb.AppendLine(val);}
static void WriteLine() { sb.AppendLine();}
static void Main(string[] args) { Execute(); Console.Write(sb.ToString()); } }");
    string program = prog.ToString();

    // Step 2: compile the generator program.
    var res = (new Microsoft.CSharp.CSharpCodeProvider()).CompileAssemblyFromSource(
        new System.CodeDom.Compiler.CompilerParameters()
        {
            GenerateInMemory = true,
            ReferencedAssemblies = { "System.dll", "System.Core.dll" }
        }
        , program);
    res.TempFiles.KeepFiles = false;
    if (res.Errors.HasErrors)
    {
        // Return the error list followed by the numbered source for debugging.
        int cnt = 1;
        output = "There is one or more errors in the template code:\r\n";
        foreach (System.CodeDom.Compiler.CompilerError err in res.Errors)
            output += "[Line " + err.Line + " Col " + err.Column + "] " +
                err.ErrorText + "\r\n";
        output += "\r\n================== Source (for debugging) =====================\r\n";
        output += " 0 10 20 30 40 50 60\r\n";
        output += " 1| " + System.Text.RegularExpressions.Regex.Replace(program, "\r\n",
            m => { cnt++; return "\r\n" + cnt.ToString().PadLeft(4) + "| "; });
        return false;
    }

    // Step 3: run the compiled program and collect its output.
    var type = res.CompiledAssembly.GetType("T44Class");
    var obj = System.Activator.CreateInstance(type);
    output = (string)type.GetMethod("Execute").Invoke(obj, new object[] { });
    return true;
}
Step 1) Build the Generator Program - The input text, with embedded C# commands, is fed through a regular expression to parse out the different sections. The input text by nature is going to be in the format TEXT-CODE-TEXT-CODE… so we process one TEXT-CODE chunk at a time. Here is the RegEx used for deciphering each TEXT-CODE chunk:
(?<txt>.*?) <- This captures any normal text that will be directly output with Write("text here").
(?<type>\[\[[!~=]?) <- This gets the opening bracket and its type. It can be [[, [[~, [[= or [[!.
(?<code>.*?) <- This captures the code piece.
(\]\]|(?<=\[\[~[^\r\n]*?)\r\n) <- This captures the closing bracket: either ]] or, for a [[~ line, the line break.
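To see what the named groups capture, the RegEx can be run on a small input and the groups printed. The following is a minimal sketch rather than part of the engine; the class TokenizeSketch and the method Describe are illustrative names, and the pattern is copied from Expand().

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class TokenizeSketch
{
    // Same pattern as in Expand(): text, opening tag, code, closing tag.
    const string REG = @"(?<txt>.*?)" +
                       @"(?<type>\[\[[!~=]?)" +
                       @"(?<code>.*?)" +
                       @"(\]\]|(?<=\[\[~[^\r\n]*?)\r\n)";

    // Hypothetical helper: lists the three named groups of every chunk.
    public static string Describe(string input)
    {
        StringBuilder sb = new StringBuilder();
        // "[[]]" is the empty trailing code block that Expand() also appends.
        foreach (Match m in Regex.Matches(input + "[[]]", REG, RegexOptions.Singleline))
        {
            sb.Append("txt=<" + m.Groups["txt"].Value + "> ");
            sb.Append("type=<" + m.Groups["type"].Value + "> ");
            sb.Append("code=<" + m.Groups["code"].Value + ">\n");
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.Write(Describe("Printed [[=DateTime.Now]]!"));
        // txt=<Printed > type=<[[=> code=<DateTime.Now>
        // txt=<!> type=<[[> code=<>
    }
}
```

Note that the closing alternative for [[~ lines only looks for "\r\n", so in this sketch (as in the pattern itself) a code line is expected to end with a Windows-style line break.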
The goal is to convert the source text into a program so we can execute it. For each <txt>, we append a Write(@"txt") call, which adds the text to sb, a StringBuilder. Each <code> is copied over directly; it is not wrapped in anything, except for [[= expressions, which are wrapped in sb.Append(...). The beginning and ending brackets and anything that starts with "[[!" are stripped out and not copied over.
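The conversion described here can be tried in isolation, without compiling anything. The sketch below repeats the loop from Expand() and returns only the generated statements; BuildBody and GeneratorSketch are hypothetical names used for illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class GeneratorSketch
{
    const string REG = @"(?<txt>.*?)" +
                       @"(?<type>\[\[[!~=]?)" +
                       @"(?<code>.*?)" +
                       @"(\]\]|(?<=\[\[~[^\r\n]*?)\r\n)";

    // Hypothetical helper: turns template text into the statements that would
    // sit inside Execute() in the generated program.
    public static string BuildBody(string input)
    {
        StringBuilder prog = new StringBuilder();
        foreach (Match m in Regex.Matches(input + "[[]]", REG, RegexOptions.Singleline))
        {
            // Plain text becomes a verbatim string literal; quotes are doubled.
            prog.Append(" Write(@\"" + m.Groups["txt"].Value.Replace("\"", "\"\"") + "\");");
            string code = m.Groups["code"].Value;
            switch (m.Groups["type"].Value)
            {
                case "[[": prog.Append(code); break;                            // code block
                case "[[~": prog.AppendLine(code); break;                       // code line
                case "[[=": prog.Append(" sb.Append(" + code + ");"); break;    // expression
                // "[[!" is a comment: nothing is emitted.
            }
        }
        return prog.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildBody("1[[for(int i=0; i<9; i++){]]0[[}]]"));
        // prints:  Write(@"1");for(int i=0; i<9; i++){ Write(@"0");} Write(@"");
    }
}
```

The trailing Write(@"") comes from the empty code block appended to the input before matching.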
In this first step, the program header and footer are also added. In the header, we add some using statements, a class header, and a function header. In the footer, we add some useful functions like "Write(...)" and "WriteLine(...)" and finally complete the class with a "}".
One other item to note is that before we run the RegEx, a "[[]]" is appended at the end (input + NORM + TAIL). This is because the RegEx is looking for TEXT-CODE chunks, which means the input must end with a CODE piece. In this case, it's just an empty code block "[[]]".
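The effect of the appended "[[]]" is easy to check with the same RegEx: without it, any text after the last closing bracket is never matched. The sketch below is illustrative only; SentinelSketch and CapturedText are assumed names, not part of the original code.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class SentinelSketch
{
    const string REG = @"(?<txt>.*?)" +
                       @"(?<type>\[\[[!~=]?)" +
                       @"(?<code>.*?)" +
                       @"(\]\]|(?<=\[\[~[^\r\n]*?)\r\n)";

    // Hypothetical helper: concatenates the plain-text parts the RegEx captures,
    // with or without the empty trailing code block.
    public static string CapturedText(string input, bool appendSentinel)
    {
        string source = appendSentinel ? input + "[[]]" : input;
        StringBuilder sb = new StringBuilder();
        foreach (Match m in Regex.Matches(source, REG, RegexOptions.Singleline))
        {
            sb.Append(m.Groups["txt"].Value);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine("[" + CapturedText("Hello [[=name]]!", false) + "]"); // [Hello ]
        Console.WriteLine("[" + CapturedText("Hello [[=name]]!", true) + "]");  // [Hello !]
    }
}
```

In the first call the trailing "!" is lost because no opening bracket follows it; the appended empty block gives the RegEx something to match against.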
Step 2) Compile the generator program - The program we built in Step 1 is then compiled using the .NET CSharpCodeProvider. Despite its name, GenerateInMemory does not keep everything in RAM; files are still written to a temporary folder. TempFiles.KeepFiles = false is set to ensure these files are cleaned up. Also in this step, we print out any errors.
Step 3) Run the program to collect the output – In the last step, we invoke the mini-program we generated and return its output.
Sample Input/Output
Sample Input
This first example will write Hello World three times:
[[~ for(int i=0; i<3; i++){
Hello World [[ Write(i.ToString()+"! "); }]]
[[! This comment will not be added to the output. ]]
Write() will print any bool, string, char, decimal, double, float, int...
A Quadrillion is 1[[ for(int i=0; i<15; i++) Write("0");]]
This will also write bool, string, char, decimal, double, float, int...
Hello at [[=DateTime.Now]]!
[[ for(int i=1; i<4; i++){]]
[[="Hello " + i + " World"+ (i>1?"s!":"!") ]]
How are you? "[[=i]]" [[="\r\n"]]
[[ }]]
The Intermediate Generated Code
The following is the behind-the-scenes temporary generated code that was created from the sample input. This will be executed in the next step to create the final output. The block below is generated code and the formatting is not clean.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
class T44Class {
static StringBuilder sb = new StringBuilder();
public static string Execute() {
Write(@"
"); for(int i=0; i<3; i++){
Write(@"Hello World "); Write(i.ToString()+"! "); } Write(@"
"); Write(@"
Write() will print any bool, string, char, decimal, double, float, int...
A Quadrillion is 1"); for(int i=0; i<15; i++) Write("0"); Write(@"
This will also write bool, string, char, decimal, double, float, int...
Hello at "); sb.Append(DateTime.Now); Write(@"!
"); for(int i=1; i<4; i++){ Write(@"
"); sb.Append("Hello " + i + " World"+ (i>1?"s!":"!") ); Write(@"
How are you? """); sb.Append(i); Write(@""" "); sb.Append("\r\n"); Write(@"
"); } Write(@"
"); return sb.ToString();}
static void Write<T>(T val) { sb.Append(val);}
static void Format(string format, params object[] args) { sb.AppendFormat(format,args);}
static void WriteLine(string val) { sb.AppendLine(val);}
static void WriteLine() { sb.AppendLine();}
static void Main(string[] args) { Execute(); Console.Write(sb.ToString()); } }
Sample Output
This first example will write Hello World three times:
Write() will print any bool, string, char, decimal, double, float, int...
A Quadrillion is 1000000000000000
This will also write bool, string, char, decimal, double, float, int...
Hello at 1/18/2015 8:48:13 AM!
Hello 1 World!
How are you? "1"
Hello 2 Worlds!
How are you? "2"
Hello 3 Worlds!
How are you? "3"
When Not to Use this Code
- Security – Since the function compiles and runs commands (like a script), it has the potential to be abused. Be cautious of what or who might call this function and what permission levels the program is running in.
- Not a replacement for T4 in Visual Studio. T4 is built into Visual Studio so use that. It is also more feature rich, more commonly known, and easier to debug in newer versions of Visual Studio.
- Avoid using templates if possible. Do not jump in and use text templates without a reason. They can be confusing for others and make code complicated. Make sure you need them first. If the structure of the text template is always the same, then just write the code. For example, don't use template transformation to do Current time: [[=DateTime.Now ]] when "Current time:" + DateTime.Now.ToString() would suffice. Also, the performance is not that great (see Performance).
Points of Interest
The most enjoyable part of the project was creating the language. The main goal was for it to be simple and easy to read. My first version used "#" for inline code but it was not as clean as I wanted. After some experimentation, the [[...]] style won out. But after toying with [[...]] for a while, I noticed that it wreaked havoc with code completion and syntax highlighting engines. After additional experimentation, I had an idea to use the built-in opening/closing comments but with a twist to separate them from normal comments. Eventually, a style like /*: ... :*/ and //: ... prevailed. Since comments are ignored by code completion and syntax highlighting engines, the inline code is also ignored. This worked well except when the editor needs some kind of filler code. Here is an example:
int myVar = /*: for(int i=1; i<4; i++) Write(i); :*/;
shows as an error in the designer because the code sense sees "int myVar = ;"
but modifying it like this fixes the issue:
int myVar = /*: for(int i=1; i<4; i++) Write(i); :*/ /**/1/**/;
works because the editor will see "int myVar = 1;"
Both of the above work once the template function has been run on them: they expand to "int myVar = 123;". The first one just shows an error in the IDE.
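The article does not show how the /**/ filler is removed, so the following is only a guess at one way to do it: a pre-processing pass that drops everything between a pair of empty comments before the template is expanded. StripIdeFiller and FillerSketch are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class FillerSketch
{
    // Hypothetical pre-processing step: drops everything between a pair of
    // empty /**/ comments, including the two markers themselves.
    public static string StripIdeFiller(string input)
    {
        return Regex.Replace(input, @"/\*\*/.*?/\*\*/", "", RegexOptions.Singleline);
    }

    static void Main()
    {
        string line = "int myVar = /*: for(int i=1; i<4; i++) Write(i); :*/ /**/1/**/;";
        Console.WriteLine(StripIdeFiller(line));
        // int myVar = /*: for(int i=1; i<4; i++) Write(i); :*/ ;
    }
}
```

After this pass only the /*: ... :*/ block remains, and the normal expansion can fill in the value.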
Compatibility
- no DLLs required
- no using statements needed
- works in both x64 and x86
- directly runnable in .NET 3.5, 4.0, 4.0 Client Profile, 4.5, and 4.5.1
- Also works in .NET 2.0, 3.0, 3.5 (Client Profile) if "System.Core.dll" and "using System.Linq;" are removed.
- Tested okay under Visual Studio 2010/2012/2013/2015
Performance
The included sample takes 81 ms (2nd-gen i7, 3.2 GHz, SSD). Release, debug, and release-without-debugger builds all performed similarly.
Breakdown:
- 1ms to generate code
- 76ms to compile
- 3ms to execute the program
History
- December 2014: Started
- 3rd January, 2015: Initial version
- 17th January, 2015: Removed linq code (works better with pre .NET 3.5)
- 19th January, 2015: Added missing "const"
- 21st January, 2015: A few changes: