In this tutorial, I will provide a simple demo that reads and writes documents to Elasticsearch and adds a full-text search feature to a C# application. Elasticsearch is a distributed, open-source search engine that can manage all kinds of data. Learn how simple it is to integrate into your ASP.NET application!
Why Elasticsearch?
But why is Elasticsearch the best solution for full-text search?
I started playing with full-text search in 2009, when I had to implement a search algorithm for a data recommendation system as part of my thesis. The experience was very educational, but starting from scratch was a bloodbath. At the first opportunity to use a library, I tried out Apache Lucene to implement full-text search. As application architectures become more complex (i.e., multiple servers that need to share indexed data), Solr provided a scalable solution: it is an open-source, API-based search engine built on Lucene. After using these libraries, I moved to Elasticsearch, which is now the market leader. It comes with a free on-premise version, is also available in the cloud, and scales from simple installations to huge environments. So the question is: why not?
The Sample Application
I have used Elasticsearch successfully in production and in my open-source headless CMS, RawCMS. To show how Elasticsearch works, I created a sample application that implements the two main features:
- Creating an index and adding data into it
- Reading data using full-text queries
The example code is available on GitHub. To run and test it, just download, compile, and execute:
dotnet ElasticSearchTest.dll create -f divina_commedia.txt -h http://localhost:9300
> Index created in 30338ms with 14006 elements.
dotnet ElasticSearchTest.dll search -i divinacommediatxt
-h http://localhost:9300 -q "dante AND virgi*"
> Searching for $dante AND virgi*
> "Dante, perché Virgilio se ne vada,
> "Dante, perché Virgilio se ne vada,
The console application uses the CommandLineParser library to parse input in a human-friendly way. So I just added two classes with verb attributes and a mapping between the console verbs and the code to run.
private static void Main(string[] args)
{
    CommandLine.Parser.Default.ParseArguments<CreateOptions, SearchOptions>(args)
        .MapResult(
            (CreateOptions opts) => DoCreate(opts),
            (SearchOptions opts) => DoSearch(opts),
            errs => 1);
}
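The CreateOptions and SearchOptions classes are not shown in the snippet above. A minimal sketch of what they could look like, assuming the CommandLineParser attribute conventions and the flag names used in the command-line examples (the property and help-text names are my own):

```csharp
using CommandLine;

// Hypothetical sketch of the verb classes consumed by ParseArguments.
[Verb("create", HelpText = "Create an index from a text file.")]
public class CreateOptions
{
    [Option('f', "file", Required = true, HelpText = "Input text file.")]
    public string File { get; set; }

    [Option('h', "host", Required = true, HelpText = "Elasticsearch endpoint.")]
    public string Host { get; set; }
}

[Verb("search", HelpText = "Run a full-text query against an index.")]
public class SearchOptions
{
    [Option('i', "index", Required = true, HelpText = "Index name.")]
    public string Index { get; set; }

    [Option('h', "host", Required = true, HelpText = "Elasticsearch endpoint.")]
    public string Host { get; set; }

    [Option('q', "query", Required = true, HelpText = "Raw query string.")]
    public string Query { get; set; }
}
```

Each `[Verb]` class becomes a sub-command, and `MapResult` dispatches to the handler matching whichever verb was parsed.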
Based on the verb the user entered (search or create), the corresponding procedure is launched with the parsed arguments.
How to Write Documents
The writing part is quite easy. For Elasticsearch, there are two options. Using the low-level client, you get a thin wrapper over the Elasticsearch REST API; this spares you from composing JSON payloads and binding responses by hand. However, if you want to play with typed data, a better option is NEST. NEST is the high-level client and, if you have experience with ORMs, it should hold no surprises.
You just have to create classes for the documents you want to save, define how properties are stored using annotations, and invoke the save API. This is not too complex, and it feels like an ordinary routine. Here is the code snippet for the class definition. In this example, each document has just one field, which contains a line of text.
public class LogDocument
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public string Body { get; set; }
}
The next step is to create the index. In this step, we associate the index with the class. This can be done manually, specifying storage settings per field, or using automapping. In our example, the “Id” field is automatically mapped to the unique identifier of the document.
client.Indices.Create(indexName, c => c
    .Map<LogDocument>(m => m
        .AutoMap<LogDocument>()
    )
);
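If automapping is not enough (for example, to choose how each field is analyzed), the mapping can also be written explicitly. A minimal sketch, assuming the same LogDocument class and a NEST 7.x client; the analyzer choice here is only illustrative:

```csharp
// Explicit mapping instead of AutoMap: Id is stored as an exact-match
// keyword, while Body is a full-text "text" field with the standard analyzer.
client.Indices.Create(indexName, c => c
    .Map<LogDocument>(m => m
        .Properties(p => p
            .Keyword(k => k.Name(d => d.Id))
            .Text(t => t
                .Name(d => d.Body)
                .Analyzer("standard")
            )
        )
    )
);
```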
Finally, we have the most trivial part of the code: writing data. Because we want to save all the poem’s lines as separate documents, one per row, it is just an iteration.
string[] lines = File.ReadAllLines(filepath);
int items = 0;
Parallel.ForEach(lines, (line) =>
{
    if (!string.IsNullOrWhiteSpace(line))
    {
        client.CreateDocument<LogDocument>(new LogDocument()
        {
            Body = line.Trim()
        });
        // items is shared between threads, so increment it atomically.
        Interlocked.Increment(ref items);
    }
});
Please note that, because each call is an independent request to an external system, we can use a parallel construct to improve performance.
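That said, one network round-trip per line is still expensive. When indexing time matters, a single bulk request is usually much faster than many parallel single-document calls. A sketch using NEST’s IndexMany, assuming the same client, lines array, and index name as above:

```csharp
// Build the documents in memory, then send them in one bulk request.
var docs = lines
    .Where(l => !string.IsNullOrWhiteSpace(l))
    .Select(l => new LogDocument { Body = l.Trim() })
    .ToList();

var bulkResponse = client.IndexMany(docs, indexName);
if (bulkResponse.Errors)
{
    // Each failed item carries its own error reason.
    foreach (var item in bulkResponse.ItemsWithErrors)
    {
        Console.WriteLine($"Failed to index {item.Id}: {item.Error.Reason}");
    }
}
```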
How to Query Documents
This part is quite simple and, I hope, very clear. The basic way to access documents is with the regular fluent, LINQ-like syntax. Since that is the basic usage, I prefer to focus this demo on the not-so-well-documented use case of searching full-text data.
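For completeness, the basic fluent usage could look like this: a simple match query against the Body field (a sketch, assuming the LogDocument class and client defined earlier; the search term is only an example):

```csharp
// Ordinary strongly-typed query: find lines mentioning "virgilio".
var response = client.Search<LogDocument>(s => s
    .Index(indexName)
    .Size(10)
    .Query(q => q
        .Match(m => m
            .Field(f => f.Body)
            .Query("virgilio")
        )
    )
);

foreach (var doc in response.Documents)
{
    Console.WriteLine(doc.Body);
}
```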
Elasticsearch also allows using a raw query string to search over field data. To do this, you can use the right overload of the Search method, configuring the raw query:
var searchResponse = client.Search<LogDocument>(s => s
    .Size(10)
    .Query(q => q.QueryString(
        qs => qs.Query(searchStr)
            .AllowLeadingWildcard(true)
    ))
);
var docs = searchResponse.Documents;
What to Take Home
Elasticsearch is the leading search engine solution. It provides applications with rich features such as full-text search and document indexing. It can be used as a service or on-premise, and in either case it is quite simple to configure for basic usage.
The NEST framework lets us store and query Elasticsearch as if it were a simple database, through a LINQ-like syntax, and this makes everything very simple.
For complex scenarios where you want to let end users write their own queries, you can use a raw query and map the results to classes.
History
- 6th May, 2020: Initial version