Attributes and Reflection
Simply put, an attribute is a mechanism for attaching metadata to various program elements, and Reflection is the means by which this metadata can be read back. The metadata is stored in the assembly.
Using these features, we shall strive to develop a generic base class structure that allows us to import any type of CSV file. For this article, we shall be using a sample CSV file, Employees.csv, which has the following information in each line.
<Name>, <Role>, <Date of Birth>
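For instance, a few records in Employees.csv might look like this (the names and roles here are made-up sample data):

```
Rahul Kumar, Developer, 12-04-1980
Anita Rao, Tester, 03-11-1985
```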
Our class design gives us enough flexibility that even if this structure changes drastically, we can incorporate the change with minimal effort. It should be simple to define and create, and easy to maintain. In addition, basic validation support should be provided for the imported fields.
One class to define it all
And how do we go about this? In a nutshell, we develop custom attributes and apply them to the properties of the Employee class we model; then, using Reflection, we transform the file data into a list of Employee objects. First, let us have a peek at the final Employee class we will be using.
[ImportFile(FileType = ImportFileType.CSV)]
public class Employee
{
#region Properties and fields
public const int NAME_INDEX = 0;
public const int ROLE_INDEX = 1;
public const int DOB_INDEX = 2;
private string _name;
[ImportField(
NAME_INDEX,
EnableTrimming=true,
EnableValidation=true,
ValidationPattern=@"^([ 'a-zA-Z-])+$")]
public string Name
{
get { return _name; }
set { _name = value; }
}
private string _role;
[ImportField(ROLE_INDEX)]
public string Role
{
get { return _role; }
set { _role = value; }
}
private DateTime _dob;
[ImportField(
DOB_INDEX,
EnableTrimming=true,
DataType=DataType.DateTime)]
public DateTime Dob
{
get { return _dob; }
set { _dob = value; }
}
#endregion
#region Ctor
public Employee(string name)
{
_name = name;
}
public Employee()
{
_name = "";
_role = "";
_dob = new DateTime();
}
#endregion
#region Methods
public override string ToString()
{
return String.Format(
"Name:'{0}'\nRole:'{1}'\nBorn:{2:dd-MM-yyyy}\n",
_name, _role, _dob);
}
#endregion
}
If you look at this class, you will notice that it has enough information within it (metadata) to completely describe the CSV file record it is trying to model. Let us examine this class in some detail. First, take a look at the Name property.
private string _name;
[ImportField(
NAME_INDEX,
EnableTrimming=true,
EnableValidation=true,
ValidationPattern=@"^([ 'a-zA-Z-])+$")]
public string Name
{
get { return _name; }
set { _name = value; }
}
It has an ImportField attribute with several parameters. The first, NAME_INDEX, corresponds to the position of this field in the import file. The next, EnableTrimming, is set to true, indicating that the value in this field will be trimmed. The third, EnableValidation, is set to true, and a Regular Expression validation pattern is provided to validate the Name field. Thus, the ImportField attribute associated with the Name property adds enough information for the field to be fetched correctly from a CSV file. Using Reflection, this attribute will drive the actual operation of loading the fields of a record into an Employee object.
Looking at this class, we can see how easy it is to handle a new field being added to the CSV file: a new property is defined in the Employee class, with the proper ImportFieldAttribute applied to it. If fields get swapped around, only the position parameter needs to change, which is really simple to do.
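For example, if the CSV file gained a fourth column holding the employee's email address, the change would be limited to one new constant and one new property on the Employee class. This is only a hypothetical sketch: EMAIL_INDEX and the simple email pattern are made up for illustration.

```csharp
// Hypothetical new column at position 3 in the CSV file.
public const int EMAIL_INDEX = 3;

private string _email;
[ImportField(
    EMAIL_INDEX,
    EnableTrimming = true,
    EnableValidation = true,
    ValidationPattern = @"^\S+@\S+\.\S+$")] // deliberately simple pattern
public string Email
{
    get { return _email; }
    set { _email = value; }
}
```

No other class in the import pipeline needs to change; the reflection-driven loader picks the new property up automatically.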
Getting to the root of it
Now, ImportField is a custom attribute that specifies that the corresponding property has an associated field in the CSV file, along with semantics about whether the data is to be trimmed on loading, whether it needs to be validated, and what a valid format is. ImportField is essentially implemented as a subclass of System.Attribute. (Note that the name of the class is ImportFieldAttribute and not ImportField; the compiler automatically appends Attribute if it does not find a definition for ImportField.) This attribute class defines the various properties that describe a field in the CSV file.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false)]
public class ImportFieldAttribute : Attribute
{
#region Properties and fields
private int _position;
public int Position
{
get { return _position; }
set { _position = value; }
}
private string _validationPattern;
public string ValidationPattern
{
get { return _validationPattern; }
set { _validationPattern = value; }
}
private bool _enableValidation;
public bool EnableValidation
{
get { return _enableValidation; }
set { _enableValidation = value; }
}
private bool _enableTrimming;
public bool EnableTrimming
{
get { return _enableTrimming; }
set { _enableTrimming = value; }
}
private DataType _dataType;
public DataType DataType
{
get { return _dataType; }
set { _dataType = value; }
}
#endregion
#region Ctor
public ImportFieldAttribute(int position)
{
this._position = position;
this.DataType = DataType.String;
}
#endregion
}
You will notice that the various parameters we set earlier are actually properties of the attribute class we have defined. The position parameter, however, is passed in explicitly because it is required by the attribute's constructor. All the other parameters are optional and, when specified, must be written using the property name.
[ImportField(
NAME_INDEX, // required, and notice no property name
EnableTrimming=true, // optional, so provide the name also
EnableValidation=true,
ValidationPattern=@"^([ 'a-zA-Z-])+$")]
Once we compile our Employee and ImportFieldAttribute classes, the compiler automatically attaches the attributes to the properties of the Employee class. Later on, using Reflection, we will be able to retrieve the extended information provided by these attributes for each property of the Employee class.
Another key line to note is the very first line in the attribute class definition.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false)]
This line is itself an attribute, stating that our custom attribute (ImportFieldAttribute) can target only properties, and that multiple applications to the same property are not allowed. The element with which an attribute is associated is called its target; attributes can have many kinds of targets, including classes, methods, parameters, fields, return values from methods, and even the entire assembly as a whole.
Another attribute used on the Employee class, as you may have noticed, is the class-level attribute ImportFile.
[ImportFile(FileType = ImportFileType.CSV)]
This specifies that the class contains data to be filled in from a CSV file. It is specified in this manner so that later on, when a new type of file needs to be imported, we can reuse much of our existing code and implement only the very specific classes and/or methods needed to do the actual import. For the moment, the only supported import file type is CSV.
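The ImportFileAttribute class itself is not listed in full in this article. Based on how it is used (the FileType named parameter above, and the FieldDelimiter referenced later by CsvFileImporter), a minimal sketch might look like this; the comma default for FieldDelimiter is an assumption:

```csharp
using System;

public enum ImportFileType
{
    CSV
}

[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public class ImportFileAttribute : Attribute
{
    private ImportFileType _fileType;
    public ImportFileType FileType
    {
        get { return _fileType; }
        set { _fileType = value; }
    }

    // Assumption: delimiter defaults to a comma for CSV files.
    private string _fieldDelimiter = ",";
    public string FieldDelimiter
    {
        get { return _fieldDelimiter; }
        set { _fieldDelimiter = value; }
    }
}
```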
That said, we now have a class that represents an Employee and that specifies how the employee data present in a CSV file relates to the properties of the Employee class. All we need now is a mechanism to do the actual import. Before we get into the core details of the implementation, note that we have a main class, ImportFileManager, that handles all the gritty details of the importing process. A sample snippet shows how easy processing CSV files finally becomes:
ImportFileManager<Employee> fileImporter =
new ImportFileManager<Employee>("Employees.csv");
List<Employee> list = fileImporter.Import();
foreach (Employee employee in list)
{
Console.WriteLine(employee);
}
You create a new ImportFileManager object, passing along the name of the file to import; since it is a generic class, the Employee type has to be specified as well. ImportFileManager uses Reflection to determine what type of file we are processing, based on the generic type's attributes (in this case, our type is Employee, and we applied the ImportFile attribute to mark its data source as a CSV file; the ImportFileManager class picks up this information to perform the appropriate import).
The details
Now, let us proceed to get into more details about each of these classes.
public class ImportFileManager<EntityClass>
where EntityClass : class, new()
{
#region Properties and fields
private EntityClass entity;
private ImportFileAttribute importFileSettings;
private FileImporter<EntityClass> _importer;
public FileImporter<EntityClass> Importer
{
get { return _importer; }
set { _importer = value; }
}
private string _fileName;
public string FileName
{
get { return _fileName; }
set { _fileName = value; }
}
#endregion
#region Ctor
public ImportFileManager(string fileName)
{
_fileName = fileName;
entity = new EntityClass();
importFileSettings =
ReflectionHelper.GetImportFileAttribute(entity);
_importer =
FileImporterFactory<EntityClass>.
CreateFileImporter(
_fileName,
importFileSettings.FileType);
}
#endregion
#region Methods
public List<EntityClass> Import()
{
return _importer.Import();
}
#endregion
}
The ImportFileManager class listing is shown above. Of interest is the constructor, which does two main things:
- Gets the ImportFile attribute settings we specified for the generic type (the Employee class, in our case). For this, it calls a static method in the ReflectionHelper class.
public static ImportFileAttribute
GetImportFileAttribute(object entity)
{
object[] attributes =
entity.GetType().GetCustomAttributes(false);
foreach (object attribute in attributes)
{
if (attribute is ImportFileAttribute)
{
return (ImportFileAttribute)attribute;
}
}
return null;
}
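As an aside, the same lookup can be written more directly with the typed overload of GetCustomAttributes, which filters at the source instead of in a loop. This is an equivalent alternative, not the article's implementation:

```csharp
public static ImportFileAttribute
    GetImportFileAttribute(object entity)
{
    // Ask only for ImportFileAttribute instances; the runtime returns
    // an empty array (never null) when the attribute is absent.
    object[] attributes = entity.GetType().GetCustomAttributes(
        typeof(ImportFileAttribute), false);
    return attributes.Length > 0
        ? (ImportFileAttribute)attributes[0]
        : null;
}
```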
Next, it determines which FileImporter sub-class to use to perform the actual import.
_importer =
FileImporterFactory<EntityClass>.
CreateFileImporter(
_fileName,
importFileSettings.FileType);
The listing for FileImporterFactory is shown below:
class FileImporterFactory<EntityClass>
where EntityClass : class, new()
{
#region Methods
public static FileImporter<EntityClass>
CreateFileImporter(
string fileName,
ImportFileType fileType)
{
switch (fileType)
{
case ImportFileType.CSV:
return
new CsvFileImporter<EntityClass>(fileName);
default:
throw
new ArgumentException(
"The import file type is not supported."
);
}
}
#endregion
}
The purpose of the FileImporterFactory class is to determine which import mechanism to use; this decision is based on the file type enumeration specified in the ImportFile attribute of the Employee class.
You may argue that all this abstraction adds complexity and overhead, and could lead to performance degradation. I agree: compared to a more direct approach of opening the file and processing it in place, the approach presented here will be slower. However, the gains in flexibility, ease of use, and maintainability are significant. In the end, it comes down to the actual performance requirements of the implementation, and a trade-off has to be made.
The FileImporter class is listed below.
public abstract class FileImporter<EntityClass>
where EntityClass : class, new()
{
#region Properties and fields
private string _fileName;
public string FileName
{
get { return _fileName; }
set { _fileName = value; }
}
private List<string> _errorRecords =
new List<string>();
public List<string> ErrorRecords
{
get { return _errorRecords; }
set { _errorRecords = value; }
}
public bool ImportSuccess
{
get
{
return (_errorRecords.Count == 0);
}
}
#endregion
#region Ctor
public FileImporter(string fileName)
{
this._fileName = fileName;
}
#endregion
#region Methods
public abstract List<EntityClass> Import();
#endregion
}
A concrete implementation of FileImporter, in the form of CsvFileImporter, is listed below:
class CsvFileImporter<EntityClass> : FileImporter<EntityClass>
where EntityClass : class, new()
{
#region Properties and fields
private ImportFileAttribute importFileSettings;
#endregion
#region Ctor
public CsvFileImporter(string fileName)
: base(fileName)
{
EntityClass fileRecord = new EntityClass();
importFileSettings =
ReflectionHelper.GetImportFileAttribute(fileRecord);
}
#endregion
#region Methods
public override List<EntityClass> Import()
{
string recordData;
string[] dataElements;
EntityClass fileRecord;
List<EntityClass> theList = new List<EntityClass>();
StreamReader streamReader = File.OpenText(base.FileName);
while (!streamReader.EndOfStream)
{
recordData = streamReader.ReadLine();
try
{
dataElements =
recordData.Split(
importFileSettings.FieldDelimiter.ToCharArray());
fileRecord = new EntityClass();
for (int i = 0; i < dataElements.Length; i++)
{
ReflectionHelper.SetPropertyValue(
fileRecord,
dataElements[i],
i);
}
theList.Add(fileRecord);
}
catch (FieldValidationException)
{
ErrorRecords.Add(recordData);
}
}
streamReader.Close();
return theList;
}
#endregion
}
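The FieldValidationException caught above is a plain custom exception whose listing does not appear in this article; assuming it carries nothing beyond a message, a minimal version would be:

```csharp
using System;

// Thrown by ReflectionHelper.SetPropertyValue when a field value
// fails its ValidationPattern check.
public class FieldValidationException : Exception
{
    public FieldValidationException(string message)
        : base(message)
    {
    }
}
```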
And finally, the ReflectionHelper class, which provides all the Reflection-based functionality:
public class ReflectionHelper
{
public static ImportFileAttribute GetImportFileAttribute(
object entity)
{
object[] attributes =
entity.GetType().GetCustomAttributes(false);
foreach (object attribute in attributes)
{
if (attribute is ImportFileAttribute)
{
return (ImportFileAttribute)attribute;
}
}
return null;
}
public static void SetPropertyValue(
object entity,
object value,
int fieldIndex)
{
object[] attributes;
foreach (PropertyInfo property in
entity.GetType().GetProperties())
{
attributes = property.GetCustomAttributes(
typeof(ImportFieldAttribute), false);
foreach (object attribute in attributes)
{
ImportFieldAttribute field =
(ImportFieldAttribute)attribute;
if (field.Position == fieldIndex)
{
if (IsFieldValueValid(field, value))
{
value = PrepareFieldValue(
field,
property,
value);
property.SetValue(entity, value, null);
}
else
{
throw new FieldValidationException(
string.Format(
"Validation of field '{0}' failed, value "+
"'{1}' should match pattern '{2}'",
property.Name,
value,
field.ValidationPattern)
);
}
}
}
}
}
public static bool IsFieldValueValid(
ImportFieldAttribute field,
object value)
{
if (field.EnableValidation &&
!String.IsNullOrEmpty(field.ValidationPattern))
{
return Regex.IsMatch((string)value, field.ValidationPattern);
}
return true;
}
public static object PrepareFieldValue(
ImportFieldAttribute field,
PropertyInfo property,
object value)
{
if (field.EnableTrimming)
{
value = ((string)value).Trim();
}
if (value is IConvertible && field.DataType != DataType.String)
{
value = Convert.ChangeType(value, property.PropertyType);
}
return value;
}
}
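To see the core reflection call in isolation, the following self-contained snippet sets a property by name at runtime, which is essentially what SetPropertyValue does once it has matched the right ImportFieldAttribute. The Person class and its values are made up for this demonstration:

```csharp
using System;
using System.Reflection;

public class Person
{
    private string _name;
    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
}

public static class ReflectionDemo
{
    public static void Main()
    {
        Person p = new Person();

        // Look up the property metadata by name at runtime...
        PropertyInfo prop = typeof(Person).GetProperty("Name");

        // ...and assign through it, exactly as ReflectionHelper does
        // with property.SetValue(entity, value, null).
        prop.SetValue(p, "Alice", null);

        Console.WriteLine(p.Name); // prints "Alice"
    }
}
```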
We have almost come to the end of this article. Along the way, we looked at a generic design for importing CSV files, and saw how features like Reflection, attribute-based programming, generic types, and abstract classes combine into a flexible solution. The idea presented here is by no means perfect, and there are elements of the design that could certainly stand a second look. That said, I hope I have conveyed the bigger picture: designing classes that allow flexible usage and adapt to various scenarios. There may well be better ways to implement the same thing, so do leave a comment on what you think and how you would improve upon it. Download the sample files, try them out, and come up with your own wacky ideas and design paradigms.
The actual import logic for CSV files is implemented in the CsvFileImporter class, which derives from the abstract FileImporter class. FileImporter simply defines an abstract method called Import, which every derived class must implement (as CsvFileImporter does). With this approach, the caller need not worry about which class is used or what logic performs the import; all of that is abstracted away, and the only knowledge required is that calling the Import method turns file data into model objects automatically. Thus, by invoking the Import method through the ImportFileManager class, we end up with a list of the Employee objects present in the CSV file.
I work as a Technology Lead for an IT services company based in India.
Passions include programming methodologies, compiler theory, cartooning and calligraphy.