MongoDB: Introduction

MongoDB is a powerful, open-source NoSQL database designed to manage large-scale data with high performance, scalability, and flexibility.

It uses a document-oriented data model, storing information in BSON (Binary JSON) format. This allows for a dynamic, "schema-less" structure and supports a rich set of data types, making it a popular choice for modern applications.

The MongoDB Data Model: Core Concepts

MongoDB's structure is built around four key components that work together to store and organize data.

  • Database: The outermost container, which holds a group of collections. A single MongoDB server can host multiple databases.

  • Collection: A grouping of related documents, analogous to a table in a relational database system (RDBMS). Collections do not enforce a strict schema.

  • Document: The basic unit of data in MongoDB, equivalent to a row in an RDBMS. Documents are BSON objects composed of field-value pairs.

  • Field: A key-value pair within a document, similar to a column in an RDBMS.

Example of a MongoDB Document: This document contains various fields, including a nested object (contact) and an array (skills).

javascript
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "Alice",
  "age": 25,
  "skills": ["Python", "MongoDB"],
  "contact": {
    "email": "alice@example.com",
    "phone": "9876543210"
  }
}

Document-Oriented Approach

Document-based NoSQL systems, like MongoDB, CouchDB, and RethinkDB, store data in self-contained, structured documents. This model offers significant advantages in flexibility and ease of use.

Key Characteristics:

  • Flexible Schema: Each document can have a unique structure. Fields can be added or removed on the fly without affecting other documents in the same collection.

  • Self-Describing Data: Because each document stores field names (keys) alongside their values, the data's structure is immediately understandable.

  • Indexing on Any Field: The system can create indexes on any element within a document, enabling fast and efficient queries.

  • Logical Grouping: Documents are organized into collections, providing a logical way to group similar data.
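
The characteristics above can be sketched in plain JavaScript, with no database involved. The `users` array is a hypothetical in-memory stand-in for a collection, and the field names are made up for illustration; it only shows that documents in one group may differ in shape and that each carries its own field names.

```javascript
// Hypothetical stand-in for a collection: an array of plain objects.
const users = [
  { _id: 1, name: "Alice", age: 25 },
  { _id: 2, name: "Bob", skills: ["Python", "MongoDB"] },            // no age, has skills
  { _id: 3, name: "Carol", contact: { email: "carol@example.com" } } // nested document
];

// Adding a field to one document leaves the others untouched.
users[0].city = "Pune";

// Each document is self-describing: its field names travel with it.
const fieldNames = users.map((doc) => Object.keys(doc));
console.log(fieldNames);

// Code that reads the data must tolerate missing fields.
const withSkills = users.filter((doc) => Array.isArray(doc.skills));
console.log(withSkills.map((doc) => doc.name));
```

The flexibility moves some responsibility to the application: readers of the data have to handle fields that are absent.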

Key Features of MongoDB

MongoDB provides a rich set of features that make it a robust and versatile database solution.

  1. Dynamic Schema & Document-Oriented Storage: MongoDB stores documents in BSON with no predefined schema. Each document is a structure of key-value pairs that can contain nested documents and arrays, so complex, hierarchical data can live in a single record.

  2. High Performance: The database is optimized for high-speed read and write operations. Performance is further improved by comprehensive indexing support and by keeping frequently used data in memory.

  3. Horizontal Scalability (Sharding): MongoDB achieves horizontal scalability through sharding, a process that distributes large collections across multiple servers (or "shards"). This "scale-out" architecture lets you increase capacity by adding more machines.

  4. High Availability (Replication): High availability is provided by replica sets, which are groups of MongoDB servers that maintain copies of the same data. One server acts as the primary node and handles all writes; the others are secondary nodes that replicate from it, can optionally serve reads, and take over through automatic failover if the primary becomes unavailable. This provides data redundancy and system uptime.

  5. Powerful Indexing: Any field in a document can be indexed to improve query performance. MongoDB supports a wide variety of index types, including single-field, compound, text, geospatial, and hashed indexes.

  6. Rich Query Language: MongoDB provides a robust query language for filtering, sorting, and projecting data. It supports a wide range of query types, including queries by field, range queries, and pattern matching.

  7. Advanced Aggregation Framework: The aggregation framework allows for powerful data transformation and analysis through a multi-stage pipeline. It can perform complex operations similar to SQL's GROUP BY and HAVING clauses, so results are processed and computed directly within the database.
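
To make the multi-stage pipeline idea from item 7 concrete, here is a minimal plain-JavaScript sketch of what a `$match` stage followed by a `$group` stage with `$sum` computes. It does not talk to a database; the `orders` data and its field names are hypothetical, and the equivalent mongosh pipeline appears only as a comment.

```javascript
// Roughly equivalent mongosh pipeline (assuming a hypothetical "orders" collection):
//   db.orders.aggregate([
//     { $match: { status: "paid" } },
//     { $group: { _id: "$customer", total: { $sum: "$amount" } } }
//   ])

const orders = [
  { customer: "alice", status: "paid", amount: 40 },
  { customer: "bob", status: "paid", amount: 15 },
  { customer: "alice", status: "pending", amount: 99 },
  { customer: "alice", status: "paid", amount: 10 }
];

// Stage 1 ($match): keep only the paid orders.
const matched = orders.filter((o) => o.status === "paid");

// Stage 2 ($group with $sum): add up the amounts per customer.
const totals = new Map();
for (const o of matched) {
  totals.set(o.customer, (totals.get(o.customer) ?? 0) + o.amount);
}

// Shape the result like aggregation output: one document per group key.
const result = [...totals].map(([customer, total]) => ({ _id: customer, total }));
console.log(result);
```

Each stage consumes the output of the previous one, which is the same idea as filtering with WHERE before grouping with GROUP BY in SQL.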

Supported Data Types (BSON)

MongoDB leverages BSON to support a wider range of data types than standard JSON.

| Data Type | Description | Example |
|---|---|---|
| String | UTF-8 encoded text | "name": "Alice" |
| Number | 32-bit/64-bit integer or double-precision floating-point | "age": 30 |
| Boolean | true or false value | "isStudent": false |
| Array | Ordered list of values of any type | "skills": ["Java", "C++"] |
| Object | An embedded document | "address": {"city": "New York"} |
| Null | Represents a null or non-existent value | "spouse": null |
| ObjectId | A 12-byte unique identifier, generated automatically for _id when none is supplied | "_id": ObjectId("...") |
| Date | A date and time, stored as milliseconds since the Unix epoch (UTC) | "joined": ISODate("2025-07-01") |
| Timestamp | A 64-bit value used internally by MongoDB | Used internally for replication and sharding |
| Binary Data | Stores binary data, such as an image or a file | "image": BinData(...) |
| Code | Stores JavaScript code as a string | "myFunction": Code("function() { ... }") |
| Decimal128 | A 128-bit decimal value for exact decimal arithmetic, such as monetary calculations | "price": NumberDecimal("99.99") |
| RegExp | Stores a regular expression | "email": /@example\.com$/ |
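
One reason the Decimal128 type exists is that double-precision numbers cannot represent most decimal fractions exactly. The plain-JavaScript sketch below shows the problem and one common alternative, storing money as integer minor units. It does not use MongoDB, and the variable names are illustrative.

```javascript
// Doubles cannot represent 0.1 or 0.2 exactly, so the sum is slightly off.
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false
console.log(sum);         // 0.30000000000000004

// Alternative when no decimal type is available: keep money in integer cents.
const priceCents = 9999;             // represents 99.99
const quantity = 3;
const totalCents = priceCents * quantity;

// Format the integer back into a decimal string for display.
const formatted = `${Math.floor(totalCents / 100)}.${String(totalCents % 100).padStart(2, "0")}`;
console.log(formatted); // 299.97
```

A decimal type such as Decimal128 avoids this by storing the decimal digits themselves rather than a binary approximation.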

Working with MongoDB

1. Collections

In MongoDB, a collection is a grouping of MongoDB documents. It is the equivalent of a table in a relational database system.

Creating a Collection

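You can create a collection explicitly with the db.createCollection() method. This is mainly needed when you want to set options; otherwise MongoDB creates a collection automatically the first time you insert a document into it.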

javascript
db.createCollection("project", { capped: true, size: 1310720, max: 500 })
  • "project" is the name of the new collection.
  • The second argument is an optional object specifying the collection's properties:
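    • capped: true: This makes the collection a capped collection, which is a fixed-size collection that automatically overwrites its oldest entries when it reaches its maximum size.
    • size: 1310720: This specifies the maximum size of the collection in bytes (1,310,720 bytes, which is 1.25 MiB or about 1.3 MB). It is required for capped collections.
    • max: 500: This specifies the maximum number of documents the collection can hold. The size limit takes precedence: if the collection reaches its size limit before 500 documents, older documents are still removed.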
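
The overwrite behaviour of a capped collection can be illustrated with a small in-memory model. This is a sketch, not how MongoDB implements it: the hypothetical `CappedList` class below limits only the document count, whereas a real capped collection is also bounded by its size in bytes.

```javascript
// Hypothetical in-memory model of a capped collection (count limit only).
class CappedList {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, evict the oldest document.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }
}

const log = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  log.insert({ seq: i });
}
console.log(log.docs.map((d) => d.seq)); // [ 3, 4, 5 ]
```

Documents stay in insertion order and the oldest ones disappear first, which is why capped collections suit logs and other rolling data.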

2. Documents and the _id Field

Each document within a collection is required to have a unique _id field, which acts as its primary key.

The Unique _id Identifier

  • Uniqueness: Every document must have a unique _id.
  • Generation: The _id can be either user-defined or automatically generated by MongoDB if not provided.
  • Indexing: The _id field is automatically indexed, which allows for fast retrieval of documents.

Structure of a System-Generated _id:

A system-generated _id is a 12-byte ObjectId value. In current MongoDB versions it is composed of:

  • 4-byte timestamp: The time the ObjectId was generated, in seconds since the Unix epoch.
  • 5-byte random value: Generated once per process, so it is unique to the machine and process. (Older versions split these bytes into a 3-byte machine identifier and a 2-byte process ID.)
  • 3-byte counter: An incrementing counter that starts with a random value.
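
Because the first four bytes are a timestamp, the creation time can be read back out of an ObjectId. The sketch below does this in plain JavaScript on the 24-character hex string from the earlier example document; `parseObjectId` is a hypothetical helper written for this illustration, not part of any MongoDB API.

```javascript
// Split a 24-character hex ObjectId string into its three byte ranges.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: Unix time in seconds
  return {
    seconds,
    createdAt: new Date(seconds * 1000),         // Date expects milliseconds
    middle: hex.slice(8, 18),                    // bytes 4-8: machine/process portion
    counter: parseInt(hex.slice(18), 16)         // bytes 9-11: counter
  };
}

const parts = parseObjectId("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString()); // 2012-10-17T21:13:27.000Z
console.log(parts.middle, parts.counter);
```

Since the timestamp comes first, ObjectIds generated later generally sort after earlier ones, so sorting by _id gives a rough creation order.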

3. Schema Design

Unlike relational databases, MongoDB collections do not enforce a strict schema by default (validation rules are optional). This flexibility allows documents with different fields and structures to be stored in the same collection.

You can choose between two primary design patterns: normalized and denormalized.

Normalized Design

In a normalized design, related data is stored in separate collections and linked using references, typically by storing the _id of one document in another. This is similar to the way relational databases link tables through primary and foreign keys.

Denormalized (Embedded) Design

In a denormalized design, related data is embedded directly within a single document. This approach is often preferred in MongoDB as it allows for faster data retrieval by avoiding the need for joins.

In this example, the user's address, skills, and hobbies are embedded directly within the user document.

javascript
{
  "_id": 1,
  "name": "Alice",
  "age": 21,
  "email": "Alice@example.com",
  "isStudent": false,

  // Array (multivalued field)
  "skills": ["MongoDB", "Node.js"],

  // Embedded document (composite field)
  "address": {
    "street": "123 Alice Street",
    "city": "Alice City",
    "zip": "560054"
  },

  // Array (multivalued field)
  "hobbies": ["acting", "reading"],

  // Date
  "registeredOn": ISODate("2025-07-01T10:00:00Z"),

  // Decimal number
  "salary": NumberDecimal("75000.00")
}
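
For comparison, the following plain-JavaScript sketch contrasts the two designs using in-memory arrays as hypothetical stand-ins for collections. The `addressId` field and the `cityOf` helper are assumptions made for this illustration.

```javascript
// Normalized: the address lives in its own collection and is referenced by _id.
const addresses = [
  { _id: 100, street: "123 Alice Street", city: "Alice City", zip: "560054" }
];
const usersNormalized = [{ _id: 1, name: "Alice", addressId: 100 }];

// Reading a user's city takes a second lookup (an application-side join).
function cityOf(user) {
  const address = addresses.find((a) => a._id === user.addressId);
  return address ? address.city : undefined;
}

// Denormalized: the address is embedded, so one read returns everything.
const usersEmbedded = [
  {
    _id: 1,
    name: "Alice",
    address: { street: "123 Alice Street", city: "Alice City", zip: "560054" }
  }
];

console.log(cityOf(usersNormalized[0]));    // Alice City
console.log(usersEmbedded[0].address.city); // Alice City
```

The embedded form is simpler to read, while the normalized form avoids duplicating an address that many documents share.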
