MongoDB Series - Part 1
Introducing MongoDB
MongoDB is an open source document database. What does that mean? Well, a document database is a specific type of database that uses the document as its basic unit of data. Everything you need to know about a particular entity is aggregated into a single document. MongoDB is also cross-platform; whether you’re a Linux or a Windows user, you can install it on your operating system and work in the environment that you’re already most comfortable with.
MongoDB is a cross-platform, open-source document database.
It’s classified as a NoSQL database. NoSQL has been around for a few years, and it’s a departure from the traditional relational database management system. First and foremost, we don’t have tables and columns, and we don’t have tables that are related to each other with foreign keys and primary keys. And we don’t use Structured Query Language, or SQL, to communicate with the database. Instead, MongoDB uses a document data model. As mentioned, a document is a single aggregate that brings together all the information relevant to a particular entity. Each record is a document. Now take a look at a somewhat trivial example: an employee. In a single document, we have their name, their age, and their employee type.
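As a minimal sketch, that employee document could be written as a Python dictionary and printed in the JSON-like notation commonly used for MongoDB documents. The name and age values come from the example discussed in this section; the `employee_type` field name and its value are assumptions made for illustration.

```python
import json

# One document holds everything about one employee.
# "employee_type" and its value are illustrative assumptions; the
# transcript names the field but does not give its value.
employee = {
    "name": "John",
    "age": 47,
    "employee_type": "full-time",
}

# Serialize the document to JSON text, one field: value pair per line.
employee_json = json.dumps(employee, indent=2)
print(employee_json)
```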
Now of course, a real employee document might have a great deal more information than this. For example, phone number, office, e-mail, hire date, educational background, current pay rate – all those sorts of things. But the important point is it’s all in a single document. So if you simply retrieve that document, you have all the information needed for that particular employee. Now an employee is just one example. Certainly you can store any type of data you wish in a document database.
MongoDB is classified as a NoSQL database. A record is a document. In the example record, each line is a field: value pair.
A document is composed of field-value pairs, and that’s what you see on the screen here. The field is name, the value is John. The next field is age, the value is 47, and so forth. The values in the example are simple data types: a string or an integer. But they could be other documents. In other words, you could have a field called, perhaps, office, whose value is another document. That document gives all the details of that office: what building it’s in, its address, the hours it’s open, that sort of thing. Fields can also have values that are arrays of data: an array of integers, an array of strings, that sort of thing. In fact, a field can have a value that’s an array of other documents. Particularly with that last example, an array of documents, these things can get very complex. It’s a very robust data format. There are some advantages to using the document approach.
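The value shapes just described (an embedded document, an array of simple values, and an array of documents) can be sketched with a plain Python dictionary. Every field name and value other than name and age is a hypothetical chosen for illustration.

```python
# A richer employee document; all office, skills, and education values
# below are made-up illustrations, not data from the course.
employee = {
    "name": "John",
    "age": 47,
    # A value can be another (embedded) document.
    "office": {
        "building": "Building A",
        "address": "1 Example Street",
        "hours": "09:00-17:00",
    },
    # A value can be an array of simple values.
    "skills": ["Python", "SQL"],
    # A value can be an array of documents.
    "education": [
        {"degree": "BSc", "year": 1999},
        {"degree": "MSc", "year": 2001},
    ],
}

# Nested values are reached by walking down through fields and indexes.
office_building = employee["office"]["building"]
first_degree = employee["education"][0]["degree"]
print(office_building, first_degree, len(employee["skills"]))
```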
First of all, and probably most importantly, you reduce or eliminate the need for expensive joins. (By expensive, I’m not talking about monetary expense; I mean processor time, memory, and other resources. If you’re used to an RDBMS approach, then you are probably accustomed to data being distributed across multiple tables.)
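To make the contrast concrete, here is a small sketch using Python’s built-in `sqlite3` module as a stand-in relational database. The table names, column names, and office data are assumptions for illustration. The relational layout needs a join to reassemble one employee, while the document layout already holds the same information in one place.

```python
import sqlite3

# Relational layout: employee data is split across two tables linked by a key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offices (id INTEGER PRIMARY KEY, building TEXT);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT,
        age INTEGER,
        office_id INTEGER REFERENCES offices(id)
    );
    INSERT INTO offices VALUES (1, 'Building A');
    INSERT INTO employees VALUES (1, 'John', 47, 1);
""")

# Reassembling one employee requires a join across both tables.
joined_row = conn.execute(
    """
    SELECT e.name, e.age, o.building
    FROM employees AS e
    INNER JOIN offices AS o ON o.id = e.office_id
    WHERE e.id = ?
    """,
    (1,),
).fetchone()
conn.close()

# Document layout: the same information sits in a single aggregate,
# so reading the employee needs no join at all.
employee_document = {
    "name": "John",
    "age": 47,
    "office": {"building": "Building A"},
}

print(joined_row)
print(employee_document["office"]["building"])
```

With two tiny tables the join costs almost nothing; the point is only the shape of the work, which grows as more tables and rows are involved.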
Values can include other documents, arrays, and arrays of documents. An advantage is the reduced need for expensive joins.
And when you want data, for example on an employee, you may have to do inner joins, outer joins, left joins, right joins, or all sorts of other joins that bring data from multiple tables together. Those tend to be resource intensive, and that means they slow down queries. The document approach reduces, or in many cases eliminates, the need for them, and thus improves response time.
There’s also a dynamic schema. Unlike the relational database model, where you have to establish your schema prior to entering data, the document approach lets you add new documents with more or fewer field-value pairs at any point. The schema is flexible and can be determined on the fly as you’re working with the data.
The document model also corresponds more directly to the native data types of many programming languages; it looks like regular data. Relational database management systems tend to have rather artificial-looking data: the underlying structure in the database doesn’t really match what we work with in the real world. Because a document database puts everything together in a single aggregate that represents an entity, it corresponds more closely to real-world data usage.
Other advantages include the use of a dynamic schema and correspondence to native data types.
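The dynamic schema idea can be sketched without a database at all. In the snippet below, a plain Python list stands in for a collection of documents (an assumption for illustration; no MongoDB server or driver is involved), and the second and third employees are hypothetical.

```python
# A plain list stands in for a collection of employee documents.
employees = []

# The first document has two fields.
employees.append({"name": "John", "age": 47})

# A later document can carry more fields; nothing has to be declared first.
employees.append({
    "name": "Maria",
    "age": 35,
    "email": "maria@example.com",
    "hire_date": "2020-03-01",
})

# Or fewer fields.
employees.append({"name": "Lee"})

# Each document carries its own set of fields.
field_sets = [sorted(doc.keys()) for doc in employees]
for fields in field_sets:
    print(fields)

# Code that reads such data has to allow for fields that may be absent.
emails = [doc.get("email") for doc in employees]
```

The flexibility has a cost on the reading side: because no schema guarantees that a field exists, lookups such as `doc.get("email")` need a sensible fallback.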