
MongoDB 3.2, C# MongoClient 2.x, Text Search: How To Deal With...

roks nicolas


31 May 2016

Some advice for those who work with text search in MongoDB 3.2 using the new C# driver.

Introduction

Have you ever tried to use text search in MongoDB 3.2 with the C# driver 2.x? Well, it's not as easy as it sounds... Here is some advice.

Constructing Text Index

First, for any text search, you MUST have an index of type Text on each collection you want to search.

The following call creates a text index named "MyFieldTextIndex" on a string field named "MyField":

MyCollection.Indexes.CreateOne(
              Builders<BsonDocument>.IndexKeys.Text("MyField"),
              new CreateIndexOptions() {DefaultLanguage = "french",Name="MyFieldTextIndex"});

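The following call creates a text index named "TextIndex" on ALL string fields in "MyCollection". A collection can have only one text index, so use this in place of the previous one, not in addition to it: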

MyCollection.Indexes.CreateOne(
              Builders<BsonDocument>.IndexKeys.Text("$**"), 
              new CreateIndexOptions() {DefaultLanguage = "french",Name="TextIndex"});

The "DefaultLanguage" option tells MongoDB which language's stop word list and stemming rules to use for the index. The default is "english".
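Because a collection can hold only one text index, moving from the single-field index to the wildcard index means dropping the first one before creating the second. Here is a minimal sketch of that switch, assuming the index names used above; the class and method names are hypothetical, and the collection is passed in rather than created here:

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

static class TextIndexSketch
{
    // Hypothetical helper: replaces the single-field text index
    // with a wildcard text index covering all string fields.
    public static void SwitchToWildcardIndex(IMongoCollection<BsonDocument> collection)
    {
        // Drop the existing text index by name.
        // This throws if no index with that name exists.
        collection.Indexes.DropOne("MyFieldTextIndex");

        // Create the wildcard text index, as in the article's second example.
        collection.Indexes.CreateOne(
            Builders<BsonDocument>.IndexKeys.Text("$**"),
            new CreateIndexOptions { DefaultLanguage = "french", Name = "TextIndex" });
    }
}
```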

Query Data

Now you can query your collection using either the aggregation pipeline or a standard Find query.

How to query and sort by Score:

MyCollection.Aggregate()
            .Match(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

In this query, MongoDB searches for "firstword" OR "secondword" in all of MyCollection's documents.
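The search string follows MongoDB's $text syntax, so besides the default OR between words it also accepts an exact phrase (wrapped in escaped double quotes) and excluded words (prefixed with a hyphen). The sketch below illustrates the three forms; the class and method names are hypothetical, and the collection is assumed to already have a text index:

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

static class TextSearchSketches
{
    // Matches documents containing "firstword" OR "secondword".
    public static List<BsonDocument> AnyWord(IMongoCollection<BsonDocument> collection)
    {
        return collection
            .Find(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .ToList();
    }

    // Escaped double quotes ask the server for the exact phrase.
    public static List<BsonDocument> ExactPhrase(IMongoCollection<BsonDocument> collection)
    {
        return collection
            .Find(Builders<BsonDocument>.Filter.Text("\"firstword secondword\""))
            .ToList();
    }

    // A leading hyphen excludes documents that contain that word.
    public static List<BsonDocument> WithoutWord(IMongoCollection<BsonDocument> collection)
    {
        return collection
            .Find(Builders<BsonDocument>.Filter.Text("firstword -secondword"))
            .ToList();
    }
}
```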

What If You Want to Get the Document Score

MyCollection.Aggregate()
             .Match(Builders<BsonDocument>.Filter.Text("firstword secondword"))
             .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
             .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
             .ToList();

In the aggregation pipeline, the Project stage writes the score into a "textScore" field of a new document and keeps only the "_id" field of the matched document...

If you want to get the result document AND the score, use a standard query with Find instead:

MyCollection.Find(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

With Find, the "Project" method adds a "textScore" field of type double at the end of each returned BsonDocument.
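To show how the score can be read back from the results, here is a minimal sketch of a complete program. The connection string, the database name "test" and the class name are placeholders, not part of the original article; "textScore" is simply the field name passed to MetaTextScore:

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

class TextScoreExample
{
    static void Main()
    {
        // Placeholder connection string, database and collection names.
        var client = new MongoClient("mongodb://localhost:27017");
        var MyCollection = client.GetDatabase("test")
                                 .GetCollection<BsonDocument>("MyCollection");

        var results = MyCollection
            .Find(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

        foreach (var doc in results)
        {
            // The projected score sits next to the document's own fields.
            double score = doc["textScore"].ToDouble();
            Console.WriteLine("{0} => {1}", doc["_id"], score);
        }
    }
}
```

Since the results are sorted by score, the best matches should be printed first.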

You can combine a text filter with other field filters by using an And or Or filter, but you can't put more than one text filter in a query:

MyCollection.Find(
         Builders<BsonDocument>.Filter.And(Builders<BsonDocument>.Filter.Text("firstword secondword"),
                                        Builders<BsonDocument>.Filter.Eq("AnotherField","fieldvalue")))
         .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
         .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
         .ToList();

This query is OK: it returns all documents that match the text filter AND where "AnotherField" equals "fieldvalue".

MyCollection.Find(
         Builders<BsonDocument>.Filter.And
         (Builders<BsonDocument>.Filter.Text("firstword secondword"),
                     Builders<BsonDocument>.Filter.Text("fieldvalue")))
         .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
         .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
         .ToList();

This query is wrong: it compiles (IntelliSense reports no error), but it crashes at run time because it contains two text filters...
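One possible workaround is to merge all the words into a single text filter. Note that this is not equivalent to the And attempt above: the words are searched with OR, as in the earlier examples. The sketch below assumes that behaviour is acceptable; the class and method names are hypothetical:

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

static class SingleTextFilterSketch
{
    // Hypothetical helper: builds ONE text filter from several words,
    // so the query never contains more than one text filter.
    public static List<BsonDocument> SearchAnyOf(
        IMongoCollection<BsonDocument> collection, params string[] words)
    {
        // "firstword secondword fieldvalue" matches any of the words (OR).
        var search = string.Join(" ", words);

        return collection
            .Find(Builders<BsonDocument>.Filter.Text(search))
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();
    }
}
```

It could be called as SearchAnyOf(MyCollection, "firstword", "secondword", "fieldvalue").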

Points of Interest

That's all folks! I hope these little tips will help you with text search in MongoDB.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
