In my previous blog post, Introduction to NoSQL and Polyglot Persistence, I wrote about what NoSQL databases are, how they can be used, and what the benefits are of using one in your system. I also mentioned the different types of NoSQL databases and their most popular representatives. One of those types is document databases. Of all the databases in the NoSQL ecosystem, document databases are probably the closest to relational databases. They store documents in collections, a lot like relational databases store rows in tables. But we’ll get more into that later. One of the most popular document databases, and one of the leading NoSQL databases overall, is certainly MongoDB.
Core MongoDB Concepts
Before we see the different ways of using MongoDB, let’s go through some of the basic concepts of this database. The people at MongoDB are very proud of what they call the Nexus Architecture. This architecture combines proven concepts and abilities of relational databases with NoSQL innovations.
MongoDB is a document-oriented database, and as already mentioned, it has certain similarities to relational databases. Instead of rows, MongoDB has documents. All the data for a given record is stored in a single document, unlike relational databases, where the information for a given record is often spread across many tables. Under the hood, documents are stored as BSON, a binary-encoded serialization of JSON-like documents. Nevertheless, from a programmer’s point of view, MongoDB works with plain JSON documents. For example, we can represent a user like this:
{
  "_id" : ObjectId("58e28d41b1ad7d0c5cd27549"),
  "name" : "Nikola Zivkovic",
  "blog" : "rubikscode.net",
  "numberOfArticles" : 10,
  "address" : {
    "street" : "some street",
    "city" : "Novi Sad",
    "country" : "Serbia"
  },
  "company" : "Vega IT Sourcing",
  "expertise" : [".NET", "JavaScript", "NoSQL", "Node.js"]
}
As you can notice, there is an _id field at the beginning of this JSON document. This field acts as the document’s primary key: it must be unique within its collection, and MongoDB generates it automatically when you don’t supply one. This way, MongoDB has kept one of the important concepts of relational databases: every record can be uniquely identified.
Documents are stored inside collections. Collections are groups of related documents, but these documents don’t need to have the same structure. Here lies one of the biggest benefits of MongoDB. Developers don’t need to know the schema of the database beforehand but can modify the schema dynamically during development. This is especially great in systems where we can’t get the schema quite right in the beginning, or where there are plenty of edge cases to cover. Also, this way, the entire problem of impedance mismatch is avoided, i.e., the object-relational mapping layer is eliminated. What does this look like?
Well, let’s say that the previous document is stored in the collection called “users”. Then we could add another document to that collection, one that contains fields the previous document doesn’t have and/or lacks fields the previous document has. For example, we could add the next document into the collection:
{
  "_id" : ObjectId("58e28da0b1ad7d0c5cd2754a"),
  "name" : "Vladimir Pecanac",
  "blog" : "code-maze.com",
  "address" : {
    "street" : "some street",
    "city" : "Novi Sad",
    "country" : "Serbia"
  },
  "company" : "Vega IT Sourcing",
  "expertise" : [".NET", "Continuous Integration", "REST"],
  "location" : [45, 19]
}
These documents are similar, but not the same. The collection groups them and gives you the ability to add indexes to these documents. Indexes are one of the concepts that MongoDB inherited from relational databases in essentially the same form.
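To make the "similar, but not the same" point concrete, here is a small sketch in plain JavaScript. It only models a collection as an array of objects (it does not talk to MongoDB at all), and the helper name optionalFields is made up for this illustration.

```javascript
// A collection modeled as a plain array of documents.
// This is only an illustration of document shapes, not MongoDB's storage.
const users = [
  { name: "Nikola Zivkovic", blog: "rubikscode.net", numberOfArticles: 10 },
  { name: "Vladimir Pecanac", blog: "code-maze.com", location: [45, 19] }
];

// Returns the fields that appear in some documents but not in all of them.
function optionalFields(docs) {
  const allFields = new Set(docs.flatMap(doc => Object.keys(doc)));
  return [...allFields]
    .filter(field => !docs.every(doc => field in doc))
    .sort();
}

console.log(optionalFields(users)); // [ 'location', 'numberOfArticles' ]
```

Both documents live happily in the same collection, even though each one carries a field the other lacks.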
It’s important to emphasize some of the other differences between MongoDB and relational databases. Firstly, MongoDB doesn’t have foreign keys. But it has a feature that looks quite like that: references. Basically, any document can hold a reference to some other document by storing its _id, but this is not automatically updated, and it’s up to the application to keep track of these connections.
It is done this way because once a foreign key is introduced in a relational database, it can be hard to unwind. Thanks to the document data model, and because all the necessary information for one “record” is stored inside one document, classic joins are not provided in MongoDB. However, a similar mechanism, the $lookup aggregation stage, is available. Among other differences, it should be mentioned that MongoDB originally had no multi-document transactions (they arrived in version 4.0); writes to a single document have always been atomic.
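Here is a rough sketch of both ways of following a reference, written for the mongo shell. The articles collection and its authorId and title fields are hypothetical names I picked for the example, and the functions take db as a parameter only so that the snippet can be pasted as a unit and called with the shell’s own db object.

```javascript
// Manual reference: every article stores the _id of its author, and the
// application runs a second query to resolve it.
// "articles", "authorId" and "title" are made-up names for this sketch.
function articlesWithAuthors(db) {
  return db.articles.find().toArray().map(function (article) {
    return {
      title: article.title,
      author: db.users.findOne({ _id: article.authorId })
    };
  });
}

// The same relationship resolved on the server with the $lookup stage.
function articlesWithAuthorsLookup(db) {
  return db.articles.aggregate([
    {
      $lookup: {
        from: "users",           // collection to join with
        localField: "authorId",  // field in the articles documents
        foreignField: "_id",     // field in the users documents
        as: "author"             // array field that receives the matches
      }
    }
  ]).toArray();
}
```

In the shell, articlesWithAuthors(db) would run against the current database. Note that nothing stops an authorId from pointing at a user that no longer exists, which is exactly the bookkeeping the application has to do itself.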
That is quite a lot of theory, so let’s see how this looks in practice.
Installing and Running MongoDB
MongoDB is free and open-source. It can be downloaded from the official MongoDB download page, after picking your operating system.
After installation, a user can run the MongoDB server by using the command:
mongod --dbpath PATH_TO_THE_DIR
After doing so, the server prints its startup log to the shell.
By default, the MongoDB server listens on port 27017. This can be changed by using the additional parameter --port:
mongod --port PORT_NUMBER --dbpath PATH_TO_THE_DIR
CRUD Operations From the Shell
Once the MongoDB server is up and running, one can connect to it using the Mongo shell client. All the functionalities that are provided in the shell client are provided through MongoDB drivers too. Drivers are provided for all popular programming languages, which means that all the features shown in this chapter can (and should) be used from code. However, for demonstration purposes, the shell client is used here.
To start the shell client, the command mongo is used (newer MongoDB releases ship the shell as mongosh).
At the top of the hierarchy in a MongoDB server is the database entity. There can be multiple databases on a single server. To create a database, the command use DATABASE_NAME is provided. This command is also used when a user wants to switch from one database to another: it creates a new database if it doesn’t exist, otherwise it switches to the existing one. (Strictly speaking, a new database is only persisted once data is stored in it.) To check which database the client is connected to, the db command displays that information. So, let’s create our first database with this command.
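As a sketch of what this looks like: use and db are typed directly at the shell prompt, while scripts can get the same thing through db.getSiblingDB(). The database name blog and the function name openDatabase are assumptions made for this example.

```javascript
// At the interactive prompt you would simply type:
//   use blog
//   db
//
// `use` is a shell helper, not JavaScript, so inside a script or function
// the closest equivalent is getSiblingDB(). It returns a handle to the
// named database without changing the shell's global `db`.
function openDatabase(db, name) {
  const target = db.getSiblingDB(name);
  return target.getName(); // the name that typing `db` would show after `use`
}
```

Calling openDatabase(db, "blog") in the shell would hand back the name of the database we just selected.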
Once the database is created and connected to, a user can manipulate the data inside of that database. This is done by using one of the many options of the db entity. All the options are listed by calling the db.help() method.
Now, the first thing a user would need to do is create a collection. This is achieved by using the db.createCollection(COLLECTION_NAME, OPTIONS) method. The list of existing collections can be displayed by using the show collections command, and existing collections can be removed with the db.COLLECTION_NAME.drop() function. So let’s create and drop one collection, and use show collections to verify the result.
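A minimal version of that round trip could look like the snippet below. Since show collections only works at the prompt, the sketch uses db.getCollectionNames(), which returns the same list as an array. The wrapper function and its name are my own additions so the steps can be pasted and run together.

```javascript
// Creates the "users" collection, then drops it again, capturing the list
// of collection names after each step.
function createAndDropCollection(db) {
  db.createCollection("users");
  const afterCreate = db.getCollectionNames(); // scriptable `show collections`

  db.users.drop();
  const afterDrop = db.getCollectionNames();

  return { afterCreate: afterCreate, afterDrop: afterDrop };
}
```

On an empty database, afterCreate should contain users and afterDrop should not.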
Once a collection is created, a user can insert, read, update and remove documents from it. For adding a document to the collection, the db.COLLECTION_NAME.insert(JSON_DATA) method is provided. Alternatively, the db.COLLECTION_NAME.save(JSON_DATA) function can be used.
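For instance, inserting a trimmed-down version of our first user might look like this. The function wrapper is just a convenience I added, and I left out _id on purpose so that MongoDB generates one.

```javascript
// Inserts one user document into the "users" collection.
// No _id is supplied, so MongoDB generates an ObjectId for it.
// Newer shells also offer insertOne() and insertMany() for the same job.
function insertUser(db) {
  return db.users.insert({
    name: "Nikola Zivkovic",
    blog: "rubikscode.net",
    numberOfArticles: 10,
    company: "Vega IT Sourcing",
    expertise: [".NET", "JavaScript", "NoSQL", "Node.js"]
  });
}
```

Inserting into a collection that doesn’t exist yet creates it implicitly, so calling createCollection first is optional.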
If a user wants to query documents from the database, the functions db.COLLECTION_NAME.find(QUERY) and db.COLLECTION_NAME.findOne(QUERY) are provided. Also, by appending the pretty() method, data is presented in a formatted manner. A query that is passed into the function is a JSON object too, and MongoDB has an additional query language that gives a user the ability to do something similar to the where clause. For example, if we wanted to read all the users that have the blog name rubikscode.net, we would pass { blog: "rubikscode.net" } as the query.
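Here is a small sketch of two such queries, one with plain equality and one with the $gt comparison operator. The function names are mine, and I call toArray() so the functions return plain arrays rather than cursors.

```javascript
// Equality match: roughly WHERE blog = 'rubikscode.net' in SQL terms.
function usersByBlog(db, blogName) {
  return db.users.find({ blog: blogName }).toArray();
}

// Comparison operator: roughly WHERE numberOfArticles > minArticles.
function usersWithMoreArticlesThan(db, minArticles) {
  return db.users.find({ numberOfArticles: { $gt: minArticles } }).toArray();
}

// At the prompt, the formatted variant from the text would be:
//   db.users.find({ blog: "rubikscode.net" }).pretty()
```

findOne() accepts the same kind of query object but returns only the first matching document instead of a cursor.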
Every document in the database can be updated. This is achieved by using db.COLLECTION_NAME.update(SELECTION_CRITERIA, UPDATED_DATA). To increase the number of articles of our document, for instance, the additional $set operator is used, which specifies exactly which fields inside the document should be updated.
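A sketch of such an update is shown below. The function name and its parameters are made up for the example; the important part is the shape of the two arguments.

```javascript
// Sets numberOfArticles on the user with the given name and leaves all
// other fields of that document untouched.
function setArticleCount(db, name, count) {
  return db.users.update(
    { name: name },                        // selection criteria
    { $set: { numberOfArticles: count } }  // fields to change
  );
}
```

Calling setArticleCount(db, "Nikola Zivkovic", 11) would bump our example user to 11 articles. Be careful with the second argument: if you pass a plain document without an operator such as $set, update() replaces the whole matched document instead of modifying it. Newer shells offer updateOne() and updateMany() for the same purpose.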
If a user wants to remove a document from the collection, one can do it with the db.COLLECTION_NAME.remove(QUERY) method.
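As a quick sketch, removing a user by name could look like this (again, the function wrapper and its name are only there for illustration):

```javascript
// Removes every document in "users" whose name matches.
// remove() deletes all matching documents unless told otherwise, so the
// query should be as specific as you need it to be.
// Newer shells offer deleteOne() and deleteMany() for the same purpose.
function removeUserByName(db, name) {
  return db.users.remove({ name: name });
}
```

The query object follows the same rules as the one passed to find(), so anything you can select you can also remove.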
To achieve better performance, MongoDB provides indexes. They are not different from traditional indexes in relational databases. Without them, MongoDB must scan every document in a collection to select those documents that match the query statement, and this scan becomes highly inefficient for large volumes of data. Indexes are data structures that store a small portion of the collection’s data in a form that MongoDB can search quickly. In order to create an index on a collection, one should run db.COLLECTION_NAME.createIndex({ FIELD_NAME: 1 }), where 1 stands for ascending order.
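For example, a sketch of indexing our users by blog name might look like this. The choice of the blog field and the function name are assumptions made for the example.

```javascript
// Creates an ascending index on the "blog" field of "users" and returns
// the collection's index definitions afterwards.
function indexUsersByBlog(db) {
  db.users.createIndex({ blog: 1 }); // 1 = ascending, -1 = descending
  return db.users.getIndexes();
}
```

The returned list also includes the index on _id that MongoDB maintains for every collection. To check whether a particular query actually uses the new index, you can append explain() to the find() call and inspect the reported plan.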
There are many other functionalities of the MongoDB query language that I haven’t covered here, but these should be enough for a start. For more details on the MongoDB API itself, you can check the official MongoDB documentation.
MongoDB Compass
To make this database more user-friendly, the people at MongoDB have provided a GUI application called Compass. This tool gives a simplified presentation of the data and lets a user query data without knowing the details of the MongoDB query language. This way, the user is able to see databases, collections, and documents in a more human-readable way.
Another very nice feature of this tool is that it’s able to show the performance of the server.
Alternatively, a tool called Robo 3T (formerly Robomongo) can be used for similar purposes.
Conclusion
By using the document data model in the right way, MongoDB was able to balance good relational database features with the innovations that NoSQL databases provided. It is no wonder that it’s one of the most popular NoSQL databases out there. In this article, we just scratched the surface of Mongo’s possibilities. Also, we were more focused on using MongoDB on the document level, so to say. We saw how we can use CRUD operations from the shell, and how to use Compass. In part 2 of MongoDB basics, I covered some of the other MongoDB features, such as replica sets and sharding.