Introduction
I wanted to write a framework and GUI that would let me quickly write and use various sample data generators. I'm constantly writing small console applications to create random but realistic sample data, either to pre-populate database applications during testing or to write unit tests against.
This project sets out to achieve the following goals:
- Define an interface specification so all of my sample data generators can be used in a similar way
- Allow the new Data Generators to be called programmatically, for example within my unit tests
- Allow the Data Generators to be plugged into a GUI application where I can set properties and format the output
IDataGenerator<T> Interface
All sample data generators that fit into this framework implement the generic interface
IDataGenerator<T>
public interface IDataGenerator<T>
{
void Setup();
void TearDown();
T Next();
List<T> Next(int count);
string ToString(T value);
string[] ToStringArray(List<T> values);
}
Each method is described here:
Setup
This method should set up the data generator so it is ready to be called. For example, in the Names generator, Setup loads all the first and last names from resources; doing this on every call to Next would be a massive and unnecessary performance hit. The GUI framework calls Setup before any calls to Next. If you are using a generator from code, call Setup yourself before calling Next or Next(count).
TearDown
This method should clean up any memory and resources the data generator initialized in Setup. The GUI framework calls it after all calls to Next. If you are using a generator from code, call TearDown when you no longer need the data generator.
Next
This is the main method; it generates a single instance of sample data. The return type is T, the type argument the data generator class supplies when it implements IDataGenerator<T>.
Next(count)
This method calls Next() in a for-loop count times and adds each result to a strongly-typed generic List<T>. It is almost always as simple as:
public List<T> Next(int count)
{
    List<T> result = new List<T>(count);
    for (int i = 0; i < count; i++)
        result.Add(Next());
    return result;
}
ToString(value)
This method controls how each value a data generator produces is formatted as a string. For example, in the Integer data generator it calls .ToString() on the numeric type. In almost all cases this method will simply call ToString(), but if you need further formatting based on a property exposed by the data generator, this is where you would write that code.
ToStringArray(List<T>)
Returns a string[] from a List<T>. This is a helper method that iterates the List<T> and returns a string array whose elements are formatted by calling ToString(value). It is always as simple as the following, shown here for a generator of int values:
public string[] ToStringArray(List<int> values)
{
    List<string> result = new List<string>(values.Count);
    foreach (int i in values)
        result.Add(ToString(i));
    return result.ToArray();
}
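To see how the six methods fit together in one class, here is a minimal sketch of a complete generator. DiceRollGenerator and its Sides property are hypothetical names invented for this illustration and are not part of the download; the interface is repeated only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// The interface exactly as listed in the article.
public interface IDataGenerator<T>
{
    void Setup();
    void TearDown();
    T Next();
    List<T> Next(int count);
    string ToString(T value);
    string[] ToStringArray(List<T> values);
}

// Hypothetical generator used only for illustration.
public class DiceRollGenerator : IDataGenerator<int>
{
    private Random _random;
    private int _sides = 6;

    // Illustrative property; validated in the setter.
    public int Sides
    {
        get { return _sides; }
        set
        {
            if (value < 2)
                throw new ArgumentOutOfRangeException("Sides", "Sides must be at least 2.");
            _sides = value;
        }
    }

    // Create the expensive state once, not on every call to Next().
    public void Setup() { _random = new Random(); }

    // Release whatever Setup() created.
    public void TearDown() { _random = null; }

    public int Next()
    {
        if (_random == null)
            throw new InvalidOperationException("Call Setup() before Next().");
        return _random.Next(1, _sides + 1); // the upper bound is exclusive
    }

    public List<int> Next(int count)
    {
        List<int> result = new List<int>(count);
        for (int i = 0; i < count; i++)
            result.Add(Next());
        return result;
    }

    public string ToString(int value) { return value.ToString(); }

    public string[] ToStringArray(List<int> values)
    {
        List<string> result = new List<string>(values.Count);
        foreach (int i in values)
            result.Add(ToString(i));
        return result.ToArray();
    }
}
```

Throwing from Next() when Setup() has not been called is a choice made for this sketch; the interface itself does not require it.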
By adhering to this interface, using a data generator always follows the same pattern: call Setup, set properties, call Next(count), and when you are all done, call TearDown. Here is an example that uses the IntegerGenerator to create 100 ints between 500 and 1000:
public List<int> Get100RandomInts()
{
    IntegerGenerator generator = new IntegerGenerator();
    generator.Setup();
    generator.MinValue = 500;
    generator.MaxValue = 1000;
    List<int> result = generator.Next(100);
    generator.TearDown(); // release anything Setup() initialized
    return result;
}
This might be useful for unit testing, but I also wanted to be able to easily run them from a graphical user interface application that allowed generators to be easily added.
I have written a step-by-step document on how to write new plug-in Data Generators, which takes you through the process of building the IntegerGenerator from scratch. You can get it from my website here.
The Sample Data Generator GUI Application
My goal for the GUI application was to allow new generators to be discovered if I dropped their assembly into the same folder that the GUI application was running from. I didn't want to have to edit config files. I also needed the GUI to dynamically handle different properties that customize the generation process. For example, when generating BirthDates, you need to specify the minimum and maximum age you want; when generating people's names, you need to say whether they should be male or female; and so on.
I took the following approach:
- When the application starts, it looks in the currently executing assembly's directory for all .dll files
- Each .dll file is loaded and each Type in it is examined using Reflection, looking for a custom attribute we defined as DataGeneratorAttribute
- A list of the classes decorated with that attribute is displayed in a listbox
- As the user selects one of the generators from the list, we construct an instance of that type using Reflection, and this object is attached to .NET's highly useful PropertyGrid control
- When the user clicks Generate, we call Setup, then Next(count), then TearDown, and write the results to a multiline TextBox in the chosen format
There is actually very little code to all of this, mainly thanks to the fully featured PropertyGrid, which is familiar from the Visual Studio IDE but can be created and used in our own applications as well. It also supports attributes that show documentation in a panel below the properties, which really adds to the usability of the whole interface.
To expose a class to the GUI harness, we simply add a DataGeneratorAttribute to our class. This is best illustrated with an example from the simple IntegerGenerator:
[DataGeneratorAttribute(Name = "Integers", ReturnType =
typeof(int), Description = "Generates random integer values.")]
public class IntegerGenerator : IDataGenerator<int>
{
The properties are mostly self-explanatory: Name is the name displayed in the listbox, ReturnType is the type that Next() returns, and Description is the tooltip displayed in the listbox when this generator is chosen.
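The attribute class itself is not listed in this article. As a rough sketch, assuming it is an ordinary custom attribute with the three named properties used above, it could look like the following. The AttributeUsage settings, the ExampleGenerator class, and the AttributeDemo helper are assumptions made for illustration, and the real attribute lives in the Aspiring.DataGenerator namespace, which is left out here for brevity.

```csharp
using System;

// Assumed shape of the attribute: three settable named properties.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false, Inherited = true)]
public sealed class DataGeneratorAttribute : Attribute
{
    public string Name { get; set; }
    public Type ReturnType { get; set; }
    public string Description { get; set; }
}

// Hypothetical class decorated the same way as IntegerGenerator.
[DataGeneratorAttribute(Name = "Example", ReturnType = typeof(int),
    Description = "Illustrates the attribute only.")]
public class ExampleGenerator
{
}

public static class AttributeDemo
{
    // Returns the attribute on a type, or null if the type is not decorated.
    public static DataGeneratorAttribute Read(Type t)
    {
        object[] found = t.GetCustomAttributes(typeof(DataGeneratorAttribute), true);
        return found.Length == 1 ? (DataGeneratorAttribute)found[0] : null;
    }
}
```

Calling AttributeDemo.Read(typeof(ExampleGenerator)) gives back an object whose Name, ReturnType and Description hold the values written in the attribute declaration, which is all the GUI needs to label an entry in the listbox.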
Here is the code that finds all of the IDataGenerators in the directory our GUI application is running from and adds them to a list, which we simply bind to a listbox control:
// Requires: using System; using System.IO; using System.Reflection;
private void findAllDataGenerators(string path)
{
    string[] files = Directory.GetFiles(path, "*.dll");
    foreach (string file in files)
    {
        // Load each assembly and inspect every type it contains.
        Assembly a = Assembly.LoadFile(file);
        Type[] types = a.GetTypes();
        foreach (Type t in types)
        {
            object[] attributes = t.GetCustomAttributes(
                typeof(Aspiring.DataGenerator.DataGeneratorAttribute), true);
            if (attributes.Length == 1)
            {
                // Remember where the generator came from and how to describe it.
                DataGeneratorInfo g = new DataGeneratorInfo();
                g.Assembly = a;
                g.DataGeneratorType = t;
                g.Attribute = (DataGeneratorAttribute)attributes[0];
                generators.Add(g);
            }
        }
    }
}
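One thing to keep in mind when scanning a folder this way: Assembly.LoadFile throws a BadImageFormatException for a .dll that is not a valid .NET assembly, and GetTypes can throw a ReflectionTypeLoadException when some types cannot be loaded. Below is a sketch of a more defensive scan. PluginScanner and FindTypesWithAttribute are hypothetical names, not part of the download, and the sketch returns plain Type objects rather than DataGeneratorInfo instances.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical helper; a defensive variant of the folder scan.
public static class PluginScanner
{
    public static List<Type> FindTypesWithAttribute(string path, Type attributeType)
    {
        List<Type> found = new List<Type>();
        foreach (string file in Directory.GetFiles(path, "*.dll"))
        {
            Type[] types;
            try
            {
                // LoadFile needs an absolute path.
                types = Assembly.LoadFile(Path.GetFullPath(file)).GetTypes();
            }
            catch (BadImageFormatException)
            {
                continue; // not a .NET assembly, skip it
            }
            catch (FileLoadException)
            {
                continue; // found but could not be loaded, skip it
            }
            catch (ReflectionTypeLoadException ex)
            {
                types = ex.Types; // partially loaded; entries may be null
            }

            foreach (Type t in types)
            {
                if (t == null)
                    continue;
                if (t.GetCustomAttributes(attributeType, true).Length == 1)
                    found.Add(t);
            }
        }
        return found;
    }
}
```

Skipping unreadable files silently is a design choice for this sketch; a real tool might prefer to log them so a broken plug-in does not go unnoticed.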
The code that creates an instance of our Data Generator once the user has chosen one, and assigns it to a PropertyGrid control, is as simple as:
currentDataGeneratorInfo =
(DataGeneratorInfo)listBoxGenerators.Items[listBoxGenerators.SelectedIndex];
Type t = currentDataGeneratorInfo.DataGeneratorType;
currentGenerator = Activator.CreateInstance(t);
propertyGrid.SelectedObject = currentGenerator;
The PropertyGrid control automatically displays every public property of the selected object. We use this feature by exposing properties on our Data Generators to allow specific control. An example is MinAge and MaxAge on our birthdate generator. The property grid will display these values and allow the user to change them. Validation is handled by throwing an ArgumentOutOfRangeException, which is trapped by the PropertyGrid control and forces the user to enter a value within range. We can also add documentation by decorating the property with a Description attribute. An example of a properly decorated and validated property looks like:
[Description("The minimum age allowed for the birthdate. Must be less than or equal to MaxAge.")]
public int MinAge
{
    get { return _minAge; }
    set
    {
        // Reject values above MaxAge; the PropertyGrid traps this exception.
        if (value > MaxAge)
            throw new ArgumentOutOfRangeException(
                "MinAge", "MinAge must be less than or equal to MaxAge.");
        _minAge = value;
    }
}
The final step is to actually get the generators to do their job: create sample data and format it on the screen. Formatting is pretty crude at the moment. I've hard-coded a few common output styles, but I think some kind of pluggable formatters might be useful here; off the top of my head, I'd like it to generate database scripts and XML files, just to name a few. VB support might also be useful, but for now my needs are met. I just need List<T>s! The actual code to generate the data uses reflection to call methods on the currently instantiated generator. It isn't pretty, reflection never is, but it looks like this:
IEnumerable q;
currentGenerator.GetType().InvokeMember("Setup",
    BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.Public,
    null, currentGenerator, new object[] {});
try
{
    q = (IEnumerable)currentGenerator.GetType().InvokeMember("Next",
        BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.Public,
        null, currentGenerator, new object[] {count});
}
finally
{
    currentGenerator.GetType().InvokeMember("TearDown",
        BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.Public,
        null, currentGenerator, new object[] {});
}
One point to note is that I wrapped the Next(count) call in a try/finally, so we can be assured that our clean-up code in TearDown is always called, even if something fails.
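The three InvokeMember calls differ only in the method name and the arguments, so the same Setup, Next(count), TearDown sequence can be folded into a small helper. This is a sketch of one way to do that, not the code used in the GUI; GeneratorRunner and CountingGenerator are hypothetical names, and CountingGenerator is only a stand-in so the example can run without a real generator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper wrapping the reflection calls.
public static class GeneratorRunner
{
    private const BindingFlags Flags =
        BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.Public;

    // Invokes a public instance method by name on the target object.
    private static object Call(object target, string method, params object[] args)
    {
        return target.GetType().InvokeMember(method, Flags, null, target, args);
    }

    // Setup, then Next(count), with TearDown guaranteed by the finally block.
    public static IEnumerable Run(object generator, int count)
    {
        Call(generator, "Setup");
        try
        {
            return (IEnumerable)Call(generator, "Next", count);
        }
        finally
        {
            Call(generator, "TearDown");
        }
    }
}

// Stand-in generator that counts upward from 1; for illustration only.
public class CountingGenerator
{
    private int _current;
    public bool TornDown;

    public void Setup() { _current = 0; TornDown = false; }
    public void TearDown() { TornDown = true; }
    public int Next() { return ++_current; }

    public List<int> Next(int count)
    {
        List<int> result = new List<int>(count);
        for (int i = 0; i < count; i++)
            result.Add(Next());
        return result;
    }
}
```

Note that when a method called through reflection throws, the caller sees a TargetInvocationException with the original exception in its InnerException property, so any error handling around Run needs to unwrap it.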
The GUI application makes heavy use of reflection, but the whole GUI main form is less than 200 lines. The built-in support for databinding to list-boxes and property-grids made my job really easy.
Current Data Generators
I now need to move all of my sample data generators into this new interface, but I have managed to do the first couple of big ones. I'm certain there will be more to come and I'd be interested in yours should you use this interface and framework. Keep an eye on my blog and I'll also post comments here.
- Birthdates (or any date, really): Generates dates for ages between a MinAge and MaxAge. You can specify a Format string (based on the .NET DateTime class's format string style).
- Lorem Ipsum: You know, that random-looking text used as filler when laying out a page. You can specify how many paragraphs to build.
- Integers: Whole numbers, with no decimal places, between a given Min and Max value.
- People's names: The Name generator uses census data to build real-world names and salutations, and by using a property called FormatString you can specify exactly how you want the resulting name to look. This makes it easy for me to build names quickly and efficiently in any style or capitalization.
The following placeholders will be replaced by name data:
{0} = First name Title case
{1} = First name UPPER case
{2} = First name lower case
{3} = Middle name Title case
{4} = Middle name UPPER case
{5} = Middle name lower case
{6} = Middle initial UPPER case
{7} = Middle initial lower case
{8} = Last name Title case
{9} = Last name UPPER case
{10} = Last name lower case
{11} = Salutation Title case
{12} = Salutation UPPER case
{13} = Salutation lower case
{14} = Sex Title case
{15} = Sex UPPER case
{16} = Sex lower case
For example, the format string {11} {9}, {0} {6} for Mr Troy Thomas Magennis would give:
Mr MAGENNIS, Troy T
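The placeholders follow the numbered style of string.Format, so one plausible way to implement this is to pass all seventeen variants as arguments in placeholder order. The sketch below assumes that approach; it is not the Name generator's actual code, and NameFormatSketch and its Title helper are hypothetical names.

```csharp
using System;

// Hypothetical illustration of the FormatString placeholders.
public static class NameFormatSketch
{
    public static string Format(string formatString, string first, string middle,
        string last, string salutation, string sex)
    {
        string initial = middle.Length > 0 ? middle.Substring(0, 1) : "";

        // Arguments are supplied in the same order as placeholders {0} to {16}.
        return string.Format(formatString,
            Title(first), first.ToUpperInvariant(), first.ToLowerInvariant(),
            Title(middle), middle.ToUpperInvariant(), middle.ToLowerInvariant(),
            initial.ToUpperInvariant(), initial.ToLowerInvariant(),
            Title(last), last.ToUpperInvariant(), last.ToLowerInvariant(),
            Title(salutation), salutation.ToUpperInvariant(), salutation.ToLowerInvariant(),
            Title(sex), sex.ToUpperInvariant(), sex.ToLowerInvariant());
    }

    // Upper-cases the first letter and lower-cases the rest.
    private static string Title(string s)
    {
        if (s.Length == 0)
            return s;
        return char.ToUpperInvariant(s[0]) + s.Substring(1).ToLowerInvariant();
    }
}
```

With this sketch, NameFormatSketch.Format("{11} {9}, {0} {6}", "Troy", "Thomas", "Magennis", "Mr", "Male") returns "Mr MAGENNIS, Troy T", matching the example above.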
This generator is BIG because of the census data that gives it real names to choose from. There is a size limit on Code Project, so you will need to download it from my blog here.
History
- 7 Nov 2006 - Initial Post