Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Cinchoo ETL - Converting Complex Nested JSON to CSV

4.67/5 (7 votes)
17 Oct 2021CPOL3 min read 13.6K  
Tips to convert complex nested JSON file to CSV format file
In this tip, you will learn how to generate CSV file from JSON format using Cinchoo ETL framework. It is very simple to use, with few lines of code, the conversion can be done. You can convert large files as the conversion process is stream based, quite fast and with low memory footprint.

1. Introduction

ChoETL is an open source ETL (extract, transform and load) framework for .NET. It is a code based library for extracting data from multiple sources, transforming, and loading into your very own data warehouse in .NET environment. You can have data in your data warehouse in no time.

This tip talks about generating CSV file from JSON format using Cinchoo ETL framework. It is very simple to use, with few lines of code, the conversion can be done. You can convert large files as the conversion process is stream based, quite fast and with low memory footprint.

2. Requirement

This framework library is written in C# using .NET 4.5 / .NET Core Frameworks.

3. How to Use

3.1 Sample Data

Let's begin by looking into a simple example of converting the below JSON input file.

Listing 3.1.1. Sample JSON Data Input File (Orders.json)
JavaScript
{
    "system": {
        "created": "2021-08-01T13:33:37.123Z",
        "by": "web"
    },
    "location": {
        "id": 100,
        "country": "DE"
    },
    "order": [
        {
            "OrderID": 22,
            "OrderName": "Soda",
            "OrderArticles": [
                {
                    "Size": 33,
                    "ProductName": "Coke",
                    "ProductId": "999"
                },
                {
                    "Size": 66,
                    "ProductName": "Fanta",
                    "ProductId": "888"
                },
                {
                    "Size": 50,
                    "ProductName": "Pepsi",
                    "ProductId": "444"
                }
            ],
            "ProcessedId": 1001,
            "Date": "2021-08-02"
        },
        {
            "OrderID": 23,
            "OrderName": "Beverage",
            "OrderArticles": [
                {
                    "Size": 44,
                    "ProductName": "Coke",
                    "ProductId": "999"
                }
            ],
            "ProcessedId": 1002,
            "Date": "2021-08-03"
        }
    ]
}

As JSON message is hierarchical and structural format, you will have to flatten out in order to produce CSV file.

Let's say you wanted to produce the below CSV formatted output.

Listing 3.1.2. CSV Data Output File (Orders.csv)
csv
created;by;id;country;OrderID;OrderName;Size;ProductName;ProductId
2021-08-01T13:33:37.123Z;web;100;DE;22;Soda;33;Coke;999
2021-08-01T13:33:37.123Z;web;100;DE;22;Soda;66;Fanta;888
2021-08-01T13:33:37.123Z;web;100;DE;22;Soda;50;Pepsi;444
2021-08-01T13:33:37.123Z;web;100;DE;23;Beverage;44;Coke;999

The first thing to do is to install ChoETL.JSON / ChoETL.JSON.NETStandard nuget package. To do this, run the following command in the Package Manager Console.

.NET Framework

Install-Package ChoETL.JSON

.NET Core

Install-Package ChoETL.JSON.NETStandard

Now add ChoETL namespace to the program.

using ChoETL;

3.2 Quick Conversion

This approach shows how to convert JSON file to CSV format with little piece of code. No setup / POCO class are needed.

Listing 3.2.1. Quick JSON to CSV file conversion
JavaScript
private static void QuickConversion()
{
    StringBuilder csv = new StringBuilder();
    using (var r = new ChoJSONReader("orders.json")
        .Configure(c => c.FlattenNode = true)
        .JsonSerializationSettings(s => s.DateParseHandling = DateParseHandling.None)
        )
    {
        using (var w = new ChoCSVWriter(csv)
            .WithDelimiter(";")
            .WithFirstLineHeader()
            .Configure(c => c.IgnoreDictionaryFieldPrefix = true)
            )
        {
            w.Write(r);
        }
    }
    Console.WriteLine(csv.ToString());
}

Create an instance of ChoCSVWriter for producing CSV (orders.csv) file. Then create an instance of ChoJSONReader object for reading orders.json file. Here turning off datetime parsing and treating as text by using JsonSerializationSettings(s => s.DateParseHandling = DateParseHandling.None). Then on the CSV output end, using IgnoreDictionaryFieldPrefix to true to tell the CSV parse not to add parent node prefix to the output.

Sample fiddle: https://dotnetfiddle.net/VCezp8

3.3 Using POCO Object

This approach shows you how to define POCO entity class and use them for the conversion process. This approach is more type safe and fine control over the conversion process. Once the objects are populated with json values, using LINQ to compose the data for the desired CSV format.

Listing 3.3.1. Mapping Class
JavaScript
public class System
{
    [JsonProperty("created")]
    public string Created { get; set; }

    [JsonProperty("by")]
    public string By { get; set; }
}

public class Location
{
    [JsonProperty("id")]
    public int Id { get; set; }

    [JsonProperty("country")]
    public string Country { get; set; }
}

public class OrderArticle
{
    [JsonProperty("Size")]
    public int Size { get; set; }

    [JsonProperty("ProductName")]
    public string ProductName { get; set; }

    [JsonProperty("ProductId")]
    public string ProductId { get; set; }
}

public class Order
{
    [JsonProperty("OrderID")]
    public int OrderID { get; set; }

    [JsonProperty("OrderName")]
    public string OrderName { get; set; }

    [JsonProperty("OrderArticles")]
    public List<OrderArticle> OrderArticles { get; set; }

    [JsonProperty("ProcessedId")]
    public int ProcessedId { get; set; }

    [JsonProperty("Date")]
    public string Date { get; set; }
}

public class OrderRoot
{
    [JsonProperty("system")]
    public System System { get; set; }

    [JsonProperty("location")]
    public Location Location { get; set; }

    [JsonProperty("order")]
    public List<Order> Orders { get; set; }
}

Then use this class as below to do the conversion of the file.

Listing 3.3.2. Using POCO object to convert JSON to CSV file
JavaScript
private static void UsingPOCO()
{
    StringBuilder csv = new StringBuilder();
    using (var r = new ChoJSONReader<OrderRoot>("orders.json")
        .UseJsonSerialization()
        )
    {
        using (var w = new ChoCSVWriter(csv)
            .WithDelimiter(";")
            .WithFirstLineHeader())
        {
            w.Write(r.SelectMany(root =>
                root.Orders
                .SelectMany(order => order.OrderArticles
                .Select(orderarticle => new
                {
                    created = root.System.Created,
                    by = root.System.By,
                    id = root.Location.Id,
                    order.OrderID,
                    order.OrderName,
                    orderarticle.Size,
                    orderarticle.ProductName,
                    orderarticle.ProductId,
                })
                  )
                )
            );
        }
    }
    Console.WriteLine(csv.ToString());
}

Sample fiddle: https://dotnetfiddle.net/nPxUbA

3.4 Using Projection

This approach shows how to use the LINQ projection method to convert the JSON file to CSV file. As you may know, JSON is a hierarchical, relational, and structured data, and CSV is not. If you have the objects produced from JSON reader in hierarchical format, you must flatten out using LINQ projection and feed them to CSV writer to create the CSV file. The sample below shows how to do it.

Listing 3.4.1. Using Projection to convert JSON to CSV file
JavaScript
private static void UsingProjection()
{
    StringBuilder csv = new StringBuilder();
    using (var r = new ChoJSONReader("orders.json")
        .JsonSerializationSettings(s => s.DateParseHandling = DateParseHandling.None)
        )
    {
        using (var w = new ChoCSVWriter(csv)
            .WithDelimiter(";")
            .WithFirstLineHeader())
        {
            w.Write(r.SelectMany(root =>
                ((Array)root.order).Cast<dynamic>()
                .SelectMany(order => ((Array)order.OrderArticles).Cast<dynamic>()
                .Select(orderarticle => new
                {
                    root.system.created,
                    root.system.by,
                    root.location.id,
                    order.OrderID,
                    order.OrderName,
                    orderarticle.Size,
                    orderarticle.ProductName,
                    orderarticle.ProductId,
                })
                    )
                )
            );
        }
    }
    Console.WriteLine(csv.ToString());
}

Sample fiddle: https://dotnetfiddle.net/VOKRQi

3.5 Using Selection

This approach shows how to use the selection method in combination with flattening node to convert the JSON file to CSV file. As you may know, JSON is a hierarchical, relational, and structured data, and CSV is not. If you have the objects produced from JSON reader in hierarchical format, you must flatten out using FlattenNode to true and then select the desired node to produce the output. The sample below shows how to do it.

Listing 3.5.1. Using Selection to convert JSON to CSV file
JavaScript
private static void UsingSelection()
{
    StringBuilder csv = new StringBuilder();
    using (var r = new ChoJSONReader("sample55.json")
        .WithField("created", jsonPath: "$..system.created", isArray: false)
        .WithField("by", jsonPath: "$..system.by", isArray: false)
        .WithField("id", jsonPath: "$..location.id", isArray: false)
        .WithField("country", jsonPath: "$..location.country", isArray: false)
        .WithField("OrderID")
        .WithField("OrderName")
        .WithField("Size")
        .WithField("ProductName")
        .WithField("ProductId")
        .Configure(c => c.FlattenNode = true)
        .JsonSerializationSettings(s => s.DateParseHandling = DateParseHandling.None)
        )
    {
        using (var w = new ChoCSVWriter(csv)
            .WithDelimiter(";")
            .WithFirstLineHeader())
        {
            w.Write(r);
        }
    }
    Console.WriteLine(csv.ToString());
}

Sample fiddle: https://dotnetfiddle.net/1ZSvsB

For more information about Cinchoo ETL, please visit the below CodeProject article:

History

  • 6th September, 2021: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)