Introduction
I'm writing data import that supports several file formats. I'm testing some libraries. I did not find one that would work properly on all the required formats. Do you have such experiences? So I have good news for you! This tool will solve your problems! -> Fileo (http://fileo.even-simpler.com)
Background
In many projects, one of the features is to import data from files. This process usually looks like this: we load the data from some file, validate and correct data we process.
Developers often have different opinions as to which library is better for loading data from a file. But is that the most important thing? What if the import needed to support different types of files? For example: .csv, .xls, .xlsx.
In my opinion, it does not matter how we load the data but what we will do with them next. That's why I came up with the idea to develop a solution that will allow us to forget about how to load data from file into memory and focus on data processing.
Using the Code
The best way to see how to use this library is to analyze the Fileo.Tests
project included in solution Fileo. There are some examples here.
How to use it?
- Create a
class
with information about imported file structure. Specify the types of the individual columns and set information about validations. For example:
internal class Test1ImportFileStructure : ImportFileStructureBase
{
private readonly string[] _dateTimeFormats = { "dd/MM/yyyy"};
private readonly CultureInfo _cultureInfo = new CultureInfo("en-GB");
private const string CustomRegularExpression = "^(Y|N)+$";
public override IList Columns => new List
{
Column.CreateIdentityStringColumn(Test1Columns.Col1String, 0, 5),
Column.CreateStringColumn(Test1Columns.Col2StringNull, 1, false, 5),
Column.CreateIntColumn(Test1Columns.Col3Int, 2, true),
Column.CreateIntColumn(Test1Columns.Col4IntNull, 3, false, -90, 90),
Column.CreateDateTimeColumn(Test1Columns.Col5DateTime, 4, true, _dateTimeFormats),
Column.CreateDateTimeColumn(Test1Columns.Col6DateTimeNull, 5, false, _dateTimeFormats),
Column.CreateDecimalColumn(Test1Columns.Col7Decimal, 6, true, _cultureInfo, 0, 10),
Column.CreateDecimalColumn(Test1Columns.Col8DecimalNull, 7, false, _cultureInfo, 0, 100),
Column.CreateBoolColumn(Test1Columns.Col9Bool, 8, true),
Column.CreateBoolColumn(Test1Columns.Col10BoolNull, 9, false),
Column.CreateStringColumn(Test1Columns.Col11Email, 10, true, 100, RegexValidation.Email),
Column.CreateStringColumnWithCustomRegex(Test1Columns.Col12CustomRegex, 11,
true, 1, CustomRegularExpression),
};
- Create a
class
which will represent one row from file. For example:
public class Test1ImportResult
{
public string Col1String { get; set; }
public string Col2StringNull { get; set; }
public int Col3Int { get; set; }
public int? Col4IntNull { get; set; }
public DateTime Col5DateTime { get; set; }
public DateTime? Col6DateTimeNull { get; set; }
public decimal Col7Decimal { get; set; }
public decimal? Col8DecimalNull { get; set; }
public bool Col9Bool { get; set; }
public bool? Col10BoolNull { get; set; }
public string Col11Email { get; set; }
public string Col12CustomRegex { get; set; }
}
- Create a generic
class
which will represent all imported data. For example:
public class ImportResult< T>
{
public IList< string> Errors { get; }
public bool HasErrors { get { return !Errors.IsNullOrEmpty() && Errors.Any(); } }
public IList< T> Objects { get; private set; }
public ImportResult(IList< T> objects, IList< string> errors)
{
Objects = objects;
Errors = errors;
}
}
- Create a
class
which will parse data from file to your class
(from point 2). For example:
internal class Test1Import
{
private readonly Table _fileDataInternal;
public Test1Import(Table fileData)
{
_fileDataInternal = fileData;
Validate();
}
public ImportResult< Test1ImportResult> Import()
{
var objects = _fileDataInternal.CorrectRows.Select(row => new Test1ImportResult
{
Col1String = row.GetStringValue(Test1Columns.Col1String, true),
Col2StringNull = row.GetStringValue(Test1Columns.Col2StringNull, false),
Col3Int = row.GetIntValue(Test1Columns.Col3Int),
Col4IntNull = row.GetNullableIntValue(Test1Columns.Col4IntNull),
Col5DateTime = row.GetDateValue(Test1Columns.Col5DateTime),
Col6DateTimeNull = row.GetNullableDateValue(Test1Columns.Col6DateTimeNull),
Col7Decimal = row.GetDecimalValue(Test1Columns.Col7Decimal),
Col8DecimalNull = row.GetNullableDecimalValue(Test1Columns.Col8DecimalNull),
Col9Bool = row.GetBoolValue(Test1Columns.Col9Bool),
Col10BoolNull = row.GetNullableBoolValue(Test1Columns.Col10BoolNull),
Col11Email = row.GetStringValue(Test1Columns.Col11Email, true),
Col12CustomRegex = row.GetStringValue(Test1Columns.Col12CustomRegex, true)
}).ToList();
var result = new ImportResult< Test1ImportResult>(objects, _fileDataInternal.GetErrors());
return result;
}
private void Validate()
{
if (_fileDataInternal == null)
{
throw new ArgumentNullException("fileData");
}
}
}
- Create a
class
which will have methods which will use Fileo to parse file (.csv, .xls, .xlsx) to your class.
public static class Imports
{
public static ImportResult< Test1ImportResult>
ImportTest1(HttpPostedFileBase httpPostedFileBase)
{
var fileData = ImporterHelper.ImportFromFile
(httpPostedFileBase, new Test1ImportFileStructure());
var venuesImport = new Test1Import(fileData);
return venuesImport.Import();
}
}
- Call your methods (from point 5) and you get your data.
Summary
Fileo will greatly simplify the process of data validation because supports many common types of validation. This allows you to focus only on the custom cases.