Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

High Performance Search using MongoDB and ASP.NET MVC

0.00/5 (No votes)
29 Dec 2015 1  
High Performance search using MongoDB and ASP.NET MVC

Introduction

In this post, we are going to discuss a solution for replacing a SQL Server based search with one based on MongoDB. We will build upon the Footloose Financial Services ASP.NET MVC application that was introduced in earlier posts: http://www.codeproject.com/Articles/770341/Embedded-Application-Identity-Part-Basic-Identit.

MongoDB is an implementation of an object-oriented document database which is a flavor of NoSQL databases. NoSQL is an alternative to the traditional relational DMBS that solves several limitations of relational databases although usually at the cost of normalization or referential integrity at the DBMS level. Document databases are a type of NoSQL databases that pair a key with a complex data structure known as a document. A document can contain one or more key-value pairs. NoSQL databases have simpler structures and do not have the overhead of enforcing referential integrity. They are more scalable than relational databases and provide superior performance for searching. 
For the rest of the article, we will be referring to the Footloose Financial Services Visual Studio 2013 project at the following github address: https://github.com/pcarrasco23/FootlooseFinancialServices.

We will be using the code in the commit labeled "Modifications for MongoDB".

MongoDB Installation and Server Architecture

MongoDB should be installed on a server on a network that is accessible by the web server that hosts the ASP.NET application. On my personal installation I installed MongoDB on Ubuntu 14.0.4 LTS 64-bit running in a VMWarePlayer virtual that is on the same virtual network as that of the Windows domain controller and IIS servers.

The MongoDB online manual (http://docs.mongodb.org/manual) has pretty good step-by-step instructions for installing MongoDB on your computer.

Because this particular MongoDB instance is not on the same server as the ASP.NET web server, we will need to configure the instance to allow remote connections. By default, the MongoDB server process will only allow connections from the local loopback. On Linux, the MongoDB server configuration file is /etc/mongodb.conf. Open the file and comment the line that specifies the bind_ip value. This value will filter by IP address the remote connections that MongoDB will permit.

#bind_ip = 127.0.0.1
#port = 27017

For now, by commenting out the bind_ip value, we will permit all remote connections to the MongoDB server. In a subsequent article, I will discuss how to add security to our MongoDB instance including specifying the IP address of the ASP.NET web server in the bind_ip value.

Data Architecture

In the Visual Studio project, we will be changing the back-end data source on the client search page from SQL-Server to MongoDB. However the very first step is deciding on the schema of the document in the MongoDB database. We want to improve performance of the search significantly so that even complex searches with large resultsets return data from the server with sub-second response times. In the SQL Server database, a client (person) has several attributes such as name and email address. This person can have one or more phone numbers, one or more addresses, and one or more accounts. All of this data exists across multiple tables. For the person search we are only concerned with the client's main phone number, address, and account number. Therefore to take advantage of MongoDB's strength (which is searching attributes within a flat document), we will compress each person's hierarchical data into a single record in the MongoDB document. In place of the hierarchical schema of the relational database, we will have a flat document as illustrated below.

This flatted document in a MongoDB database will provide improved search performance for our client search page. This will be done in the ETL process which will be discussed in more detail later.

MongoDB C# Driver and the Service Layer

The MongoDB C# driver can be downloaded from within Visual Studio using NuGet. With the C# driver, you can connect to a MongoDb instance and manipulate data within a document store using C# and LINQ.

The next step after deciding on the schema is to create a physical model based on the schema of the document. In my case, I created a PersonDocument in PersonDocument.cs that I will use as a DTO (Data transfer object) between the service layer code and the MongoDB C# driver. The driver will serialize this object into the BSON format of the MongoDB document store.

public class PersonDocument
 {
 public int PersonID { get; set; }
 public string FirstName { get; set; }
 public string LastName { get; set; }
 public string EmailAddress { get; set; }
 public string PhoneNumber { get; set; }
 public string StreetAddress { get; set; }
 public string City { get; set; }
 public string County { get; set; }
 public string State { get; set; }
 public string Zip { get; set; }
 }

Then I implemented the Unit of Work pattern for the MongoDB database and the Repository<T> interface for MongoDB documents in FootlooseFSDocUnitOfWork.cs and DocumentRepository.cs respectively so that client code can perform CRUD operations on the Person document store in the MongoDB database using the familiar Repository interface.

The Search service method uses the IQueryable interface for searching for Person documents in the SearchPersonDocument method of FootlooseFSService.cs.

IQueryable<PersonDocument> personsQueryable = unitOfWork.Persons.GetQueryable();

The ETL process uses the AddBatch method of the DocumentRepository to add records to the Person document store.

var docUnitOfWork = new FootlooseFSDocUnitOfWork();
 docUnitOfWork.Persons.AddBatch(personDocuments);

Going back to the unit of work class for the MongoDB database, there are a couple of application configuration keys that are necessary for any client (ASP.NET web and ETL process) that uses the MongoDB unit of work and DocumentRepository. For the ASP.NET web application, they would be in the web.config and for the ETL stand-alone program they would be in App.config. The MongoDBConnectionString key contains the MongoDB connection string which is essentially the IP address or DNS of the MongoDB host computer on the network. The MongoDBDatabaseName is the name of the MongoDB database that contains one or more document stores.

<add key="MongoDBConectionString" value="mongodb://192.168.1.6" />
<add key="MongoDBDatabaseName" value="footloosefs"/>

The unit of work class for MongoDB has a private MongoDatabase variable that contains the connection to the database and connection initialization code where we utilize the configuration values. The DocRepository<PersonDocument> object has the code to perform CRUD operations on the person document store. In the Init method, we indicate the fields of PersonDocument that we want serialized to the person MongoDB document. By default, MongoDB will create an object ID field in each document to serve as the primary key unless we indicate otherwise. The MapIdProperty field allows us to specify a field in the DTO as the primary key as I have done with the PersonID field.

public class FootlooseFSDocUnitOfWork
{
        private MongoDatabase _database;

        protected DocRepository<PersonDocument> _persons;

        public FootlooseFSDocUnitOfWork()
        {
            var connectionString = ConfigurationManager.AppSettings["MongoDBConectionString"];
            var client = new MongoClient(connectionString);
            var server = client.GetServer();
            var databaseName = ConfigurationManager.AppSettings["MongoDBDatabaseName"];
            _database = server.GetDatabase(databaseName);            
        }

        public IRepository<PersonDocument> Persons
        {
            get
            {
                if (_persons == null)
                    _persons = new DocRepository<PersonDocument>(_database, "persons");

                return _persons;
            }
        }        

        public static void Init()
        {
            BsonClassMap.RegisterClassMap<PersonDocument>(cm =>
            {
                cm.MapIdProperty(p => p.PersonID);
                cm.MapProperty(p => p.FirstName);
                cm.MapProperty(p => p.LastName);
                cm.MapProperty(p => p.EmailAddress);
                cm.MapProperty(p => p.PhoneNumber);
                cm.MapProperty(p => p.StreetAddress);
                cm.MapProperty(p => p.City);
                cm.MapProperty(p => p.County);
                cm.MapProperty(p => p.State);
                cm.MapProperty(p => p.Zip);
            });
        }
}

ETL Process

Now that we have decided on the layout of the document and add a MongoDB interface to the service layer, we need to populate the MongoDB person document collection with the corresponding data from SQL Server. ETL stands for Extract-Transform-Load. The FootlooseFSDocDBETL project contains code for the process that will extract the necessary data from the Person and associated tables (PersonAccount, PersonAddress, Phone, Address, Account, Login) for each person (represented by a record in dbo.Person) and transform the data into a single object and load the object into the Person document store in the MongoDB data source.

Next Steps

After running the program in the FootlooseFSDocDBETL project, we will have a fully loaded Person document against which our ASP.NET application can run queries. If you have gone this far, you will note that the performance of the MongoDB queries are significantly faster than the SQL Server queries. With a Person document store with 1 million records, searches, paging, and sorting perform within a second. Now imagine 100 or possibly thousands of users searching at the same time! MongoDB can be scaled horizontally to meet the needs of the user demand very easily.

The next step - which is not part of this article - would be to create a process that updates the Person document store in MongoDB whenever a Person or associated record is updated in SQL through the application. This would most likely involve a message with the update written to a queue (such as MSMQ) that is picked up by a Windows service polling the queue and updates the Person document store.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here