Course/Data Storage/NoSQL Database Types

NoSQL Database Types

Four NoSQL families: document stores (MongoDB), key-value (Redis), wide-column (Cassandra), and graph databases (Neo4j). When to use each.

15 min readHigh interview weight

The NoSQL Landscape

"NoSQL" is an umbrella term for databases that don't use the relational table model. The term is misleading — these databases are not defined by what they *lack*, but by different data models optimized for specific access patterns. There are four major families, each with fundamentally different strengths.

1. Key-Value Stores

The simplest NoSQL model: a distributed hash map. Every record is a key (string) mapped to a value (blob, string, or structured data). Redis and DynamoDB are the most common. Lookups are O(1) and extremely fast. The trade-off: you can only query by key — there is no concept of filtering by value fields.

Database	Value Types	Best For	Persistence
Redis	Strings, lists, sets, hashes, sorted sets, streams	Caching, sessions, leaderboards, rate limiting	Optional (RDB snapshots + AOF logs)
DynamoDB	JSON documents with partition + sort key	High-throughput, serverless, single-table designs	Always durable (SSD-backed)
etcd	Bytes (strings)	Distributed configuration, service discovery (Kubernetes)	Durable (Raft consensus)

2. Document Stores

Document databases store self-describing JSON (or BSON) documents. Unlike key-value stores, you can query and index any field inside the document. MongoDB is the canonical example. Documents can be nested — an order document can embed its line items — which enables fetching a complete entity in a single read without joins.

json

{
  "_id": "order_7821",
  "user_id": "user_42",
  "status": "shipped",
  "created_at": "2024-03-15T10:30:00Z",
  "line_items": [
    { "sku": "WIDGET-A", "qty": 2, "price": 9.99 },
    { "sku": "GADGET-B", "qty": 1, "price": 24.99 }
  ],
  "shipping_address": {
    "city": "New York",
    "zip": "10001"
  }
}

The document model shines when your data is hierarchical and the entity boundary is clear. The key risk is denormalization drift — if line items also appear in an inventory document, updating a price requires updating multiple documents (no foreign key enforcement).

3. Wide-Column Stores

Wide-column stores (Cassandra, HBase, Bigtable) organize data into rows and dynamic columns. Unlike relational tables where every row has the same columns, wide-column stores allow each row to have a different set of columns. The key design concept is the partition key — all data for a partition key lives on the same node, making single-partition queries extremely fast.

Loading diagram...

Cassandra wide-column: rows share a partition key, columns vary per row

Cassandra is optimized for write-heavy, append-heavy workloads with high cardinality. It uses a log-structured merge tree (LSM) internally, making writes extremely fast (sequential disk writes). The trade-off: reads can be slower, and cross-partition queries are expensive or impossible without secondary indexes.

⚠️

Model Your Queries, Not Your Data

In Cassandra, you design tables to match your query patterns — the opposite of relational normalization. If you need to fetch user activity by user and by date, you may need two separate tables. Changing access patterns later requires data migration.

4. Graph Databases

Graph databases (Neo4j, Amazon Neptune) model data as nodes (entities) and edges (relationships), with properties on both. They excel at queries that traverse relationships: "friends of friends", "shortest path between users", "all products bought by customers who also bought X". These queries are extremely difficult in SQL (recursive CTEs) and essentially impossible in document stores.

cypher

// Neo4j Cypher: find friends-of-friends who like the same band
MATCH (me:User {id: 'user_42'})
      -[:FOLLOWS]->(:User)
      -[:FOLLOWS]->(fof:User)
      -[:LIKES]->(b:Band {name: 'Radiohead'})
WHERE NOT (me)-[:FOLLOWS]->(fof)
RETURN fof.name, COUNT(*) AS mutual_friends
ORDER BY mutual_friends DESC
LIMIT 10;

Quick Reference: Choosing a NoSQL Type

Use Case	Best NoSQL Type	Example
User sessions, rate limiting	Key-Value (Redis)	Store JWT token → user data
Product catalog, content CMS	Document (MongoDB)	Flexible schema per product type
Time-series events, IoT telemetry	Wide-Column (Cassandra)	Sensor readings partitioned by device
Social graph, recommendation	Graph (Neo4j)	Friend suggestions, fraud rings
Shopping cart (single user)	Key-Value or Document	Cart keyed by user_id

💡

Interview Tip

Interviewers love the question 'Why not just use MongoDB for everything?' A sharp answer: document stores penalize you when entities reference each other heavily (social feeds, multi-tenant permissions). You end up re-implementing joins in application code. Use the right tool — key-value for ephemeral lookups, wide-column for time-series and write-heavy append logs, graph when relationship traversal is the primary query pattern.

Relational Databases & ACID

Database Sharding & Partitioning