Categories of NoSQL Systems
NoSQL databases emerged to address the scalability, performance, and flexibility limitations of traditional relational databases, especially when dealing with big data, semi-structured/unstructured data, and real-time processing needs.
NoSQL systems are classified based on data models, storage structures, and query capabilities.
The four primary categories are:
1. Document-Based NoSQL Systems
These systems store data as collections of similar, self-describing documents, often using formats like JSON (JavaScript Object Notation) or BSON (Binary JSON). They are schemaless, which makes them well suited to evolving data structures.
A major difference from object or XML systems is that there is no requirement to specify a schema; documents can have different data elements. Users can specify a partial schema to improve storage efficiency, but it's not mandatory.
Documents are accessible via a unique document ID (e.g., _id in MongoDB), and can also be accessed rapidly through secondary indexes based on document content.
Examples: MongoDB, CouchDB
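The document model described above can be illustrated with a small in-memory sketch. It is not how MongoDB or CouchDB are implemented; the class `DocumentCollection` and its methods are hypothetical names chosen for this example. It shows documents with different fields living in one collection, lookup by `_id`, and lookup through a secondary index on a content field (indexed values are assumed to be hashable).

```python
import uuid
from collections import defaultdict


class DocumentCollection:
    """Toy in-memory collection (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}     # _id -> document (a plain dict)
        self._indexes = {}  # field name -> {value -> set of _id}

    def insert(self, doc):
        # Copy the document and assign a unique _id if none was supplied.
        doc = dict(doc)
        doc.setdefault("_id", uuid.uuid4().hex)
        self._docs[doc["_id"]] = doc
        # Keep every existing secondary index up to date.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc["_id"])
        return doc["_id"]

    def create_index(self, field):
        # Build a secondary index over documents that contain the field.
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def get(self, doc_id):
        # Primary access path: the unique document ID.
        return self._docs.get(doc_id)

    def find_by(self, field, value):
        # Use the secondary index when one exists, otherwise scan everything.
        if field in self._indexes:
            ids = self._indexes[field].get(value, set())
            return [self._docs[i] for i in sorted(ids)]
        return [d for d in self._docs.values() if d.get(field) == value]


books = DocumentCollection()
# No schema is declared: the second document has a field the first lacks.
books.insert({"_id": "b1", "title": "Dune", "author": "Herbert"})
books.insert({"_id": "b2", "title": "Emma", "author": "Austen", "year": 1815})
books.create_index("author")
```

Here `books.get("b1")` uses the primary ID, while `books.find_by("author", "Austen")` goes through the secondary index instead of scanning every document.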
2. Key-Value Stores
These systems use a simple data model based on fast access to a value given its unique key. The key is a unique identifier, and the value can be a string of bytes, a record, an object, or even a document.
They prioritize high performance, availability, and horizontal scalability. Many key-value stores do not have a query language, providing instead a set of operations via a programming API.
Use Case: High-speed lookups, session storage, caching.
Examples: DynamoDB, Apache Cassandra, Redis, Oracle NoSQL Database
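A minimal sketch of the operation-based API mentioned above, assuming a single-process, in-memory store. The class `KeyValueStore` and its `put`/`get`/`delete` methods are hypothetical and do not correspond to the client API of any of the listed products. The store treats each value as an opaque string of bytes, so the application does its own serialization (JSON here) for the session-storage use case.

```python
import json


class KeyValueStore:
    """Toy key-value store: no query language, only key-based operations."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        # The store does not interpret the value; it is just bytes.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        return self._data.pop(key, None) is not None


sessions = KeyValueStore()
payload = json.dumps({"user": "ada", "cart": [3, 7]}).encode("utf-8")
sessions.put("session:42", payload)

# Reading back: fetch by key, then deserialize in the application.
raw = sessions.get("session:42")
session = json.loads(raw.decode("utf-8"))
```

Because every operation is addressed by key, there is no way to ask "which sessions contain item 7" without scanning; that trade-off is what keeps lookups fast and the model easy to partition.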
3. Column-Based or Wide-Column Stores
These systems partition a table by column into column families, where each column family is stored in its own files. This is a form of vertical partitioning.
They typically use a multidimensional key consisting of components like table name, row key, column family, column qualifier, and timestamp. The data model includes storage-related concepts and allows for versioning of data values with timestamps.
Scalability: Optimized for large-scale, write-heavy workloads.
Examples: Apache HBase, Google Bigtable, Apache Cassandra
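The multidimensional key and timestamp-based versioning can be sketched as follows. This is an illustrative in-memory model only, not the storage format of HBase, Bigtable, or Cassandra; `WideColumnTable` and its methods are hypothetical names. Each column family gets its own container (standing in for "its own files"), and each cell keeps several timestamped versions.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide-column table keyed by (row key, family, qualifier, timestamp)."""

    def __init__(self, name, column_families):
        self.name = name
        # One separate container per column family (vertical partitioning).
        self._families = {cf: defaultdict(list) for cf in column_families}

    def put(self, row_key, family, qualifier, value, timestamp):
        versions = self._families[family][(row_key, qualifier)]
        versions.append((timestamp, value))
        # Keep versions ordered newest first.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row_key, family, qualifier, timestamp=None):
        # Return the newest version, or the newest one at or before `timestamp`.
        versions = self._families[family].get((row_key, qualifier), [])
        for ts, value in versions:
            if timestamp is None or ts <= timestamp:
                return value
        return None


users = WideColumnTable("users", ["profile", "activity"])
users.put("u1", "profile", "email", "old@example.com", timestamp=100)
users.put("u1", "profile", "email", "new@example.com", timestamp=200)

latest = users.get("u1", "profile", "email")
as_of_150 = users.get("u1", "profile", "email", timestamp=150)
```

A read without a timestamp returns the most recent version of the cell, while passing a timestamp retrieves the value as it stood at that point.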
4. Graph-Based NoSQL Systems
Data is represented as a graph, consisting of nodes (vertices) and directed edges (relationships). Both nodes and edges can have labels (or types) and properties (data items) associated with them.
Related nodes can be found by traversing edges using path expressions.
Use Case: Ideal for highly connected data, such as social networks, recommendation systems, and network analysis.
Querying: Uses path traversal or graph-specific languages like Cypher.
Examples: Neo4j, Amazon Neptune, GraphBase
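The graph model above (labeled nodes and directed edges, both carrying properties, queried by following paths) can be sketched in a few lines. This is a toy structure, not Neo4j's storage engine or the Cypher language; `PropertyGraph` and `follow` are hypothetical names, and the "path expression" is simplified to a list of edge labels.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy property graph with labeled nodes and directed, labeled edges."""

    def __init__(self):
        self.nodes = {}                # node id -> (label, properties)
        self._out = defaultdict(list)  # node id -> [(edge label, target, properties)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def add_edge(self, source, label, target, **props):
        # Edges are directed: they are stored only on the source node.
        self._out[source].append((label, target, props))

    def neighbors(self, node_id, edge_label):
        return [t for lbl, t, _ in self._out.get(node_id, []) if lbl == edge_label]

    def follow(self, start, path):
        """Follow a sequence of edge labels from `start`; return reached nodes."""
        frontier = [start]
        for edge_label in path:
            next_frontier = []
            for node in frontier:
                for target in self.neighbors(node, edge_label):
                    if target not in next_frontier:
                        next_frontier.append(target)
            frontier = next_frontier
        return frontier


g = PropertyGraph()
g.add_node("alice", "Person", age=30)
g.add_node("bob", "Person", age=25)
g.add_node("carol", "Person", age=41)
g.add_edge("alice", "FOLLOWS", "bob", since=2020)
g.add_edge("bob", "FOLLOWS", "carol", since=2021)

# Two hops along FOLLOWS edges: who do the people Alice follows follow?
second_degree = g.follow("alice", ["FOLLOWS", "FOLLOWS"])
```

Each hop only touches the edges stored on the current nodes, which is why traversal suits highly connected data better than repeated joins.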
Other Categories of NoSQL Systems
In addition to the four main types, some systems span multiple categories or support specialized models:
5. Hybrid NoSQL Systems
Combine features of two or more types (e.g., document + graph).
Example: OrientDB supports document and graph models.
6. Object-Oriented Databases
Store data as objects, closely aligning with object-oriented programming. Support object identities, inheritance, and encapsulation.
Examples: db4o, ObjectDB, Versant
7. XML Databases
Designed to store, query, and manage XML documents. Useful when XML is used as a standard data interchange format.
Examples: BaseX, eXist-db
Google developed a proprietary NoSQL system known as Bigtable, and Apache HBase is an open source NoSQL system based on similar concepts. These systems led to the category of NoSQL systems known as column-based or wide-column stores.
Amazon developed a NoSQL system called DynamoDB that is available through Amazon's cloud services. This innovation led to the category known as key-value data stores, sometimes called key-tuple or key-object data stores. Other systems in this category include the Oracle key-value store and Redis, a key-value cache and store.
Facebook developed a NoSQL system called Cassandra, which is now open source and known as Apache Cassandra. This NoSQL system uses concepts from both key-value stores and column-based systems.
Summary of NoSQL Categories
Category | Data Model | Best Use Case | Examples |
---|---|---|---|
Document-Based | JSON, BSON | Content management, e-commerce | MongoDB, CouchDB |
Key-Value Store | Key-value pairs | Caching, session management | Redis, DynamoDB, Riak |
Column-Based | Column families | Data warehousing, big data analytics | HBase, Cassandra, Bigtable |
Graph-Based | Nodes and Edges | Social graphs, fraud detection, recommendation engines | Neo4j, GraphBase |
Hybrid Systems | Mixed | Flexible data handling | OrientDB |
Object Databases | Objects | OOP persistence, complex applications | db4o, ObjectDB |
XML Databases | XML structure | Document-centric applications | BaseX, eXist-db |