Index Table Pattern
Create secondary indexes as separate tables: enabling efficient queries on non-primary-key fields in NoSQL and sharded databases.
The Problem: Querying Without a Primary Key
In relational databases, a secondary index is a first-class feature managed automatically by the database engine. But in NoSQL databases (DynamoDB, Cassandra, Redis) and sharded databases, lookups by non-primary-key attributes are expensive because data is partitioned by the primary key. Fetching a user by email when your primary key is `user_id` requires scanning all partitions.
The Index Table pattern solves this by manually creating a separate table whose primary key is the attribute you want to query. This secondary table stores the mapping from the query attribute to the primary key, enabling O(1) lookups without full table scans.
Index Table Structure
# Main table: users (primary key = user_id)
# DynamoDB put:
users_table.put_item(Item={
"user_id": "u-123",
"email": "alice@example.com",
"name": "Alice"
})
# Index table: users_by_email (primary key = email)
# Maintained manually alongside the main table
users_by_email_table.put_item(Item={
"email": "alice@example.com",
"user_id": "u-123"
})
# Lookup by email:
def get_user_by_email(email: str) -> dict:
# Step 1: Resolve primary key via index table
index_row = users_by_email_table.get_item(
Key={"email": email}
).get("Item")
if not index_row:
return None
# Step 2: Fetch full record from main table
user = users_table.get_item(
Key={"user_id": index_row["user_id"]}
).get("Item")
return userThree Index Table Variants
| Variant | What It Stores | Trade-off |
|---|---|---|
| Key-only index | Secondary key → primary key mapping only | Two reads per lookup; minimal storage overhead |
| Covering index (denormalized) | Secondary key + all commonly queried attributes | One read per lookup; data duplication; must keep in sync |
| Composite index | Multi-attribute secondary key (e.g., city + last_name) | Enables compound queries; less flexible for partial filters |
Keeping Index Tables Consistent
The index table must always reflect the current state of the main table. Writes become two-phase operations: write to both the main table and the index table. This introduces two consistency challenges:
- Atomicity: A crash after updating the main table but before the index table leaves them out of sync. Use a transactional outbox or two-phase commit if your datastore supports it (DynamoDB TransactWriteItems).
- Key updates: If the indexed attribute changes (e.g., a user changes their email), you must delete the old index entry and insert a new one. Failure here leaves orphaned index entries.
DynamoDB Global Secondary Indexes
DynamoDB's managed Global Secondary Indexes (GSIs) implement the Index Table pattern automatically. Under the hood, DynamoDB maintains a separate partition for each GSI and asynchronously propagates writes. The trade-off: GSI reads are eventually consistent by default, and GSI write throughput is charged separately.
Index Table vs Full-Text Search
Index tables work well for exact-match and range queries on structured attributes. For full-text search, prefix matching, or fuzzy matching, a dedicated search engine (Elasticsearch, Algolia, Typesense) is more appropriate. The index table and search engine patterns are complementary — many systems use both.
Interview Tip
Index tables come up in interviews whenever you propose a NoSQL database and the interviewer asks 'but how do you look up users by email?'. The answer is: 'I'd maintain a secondary index table mapping email to user_id. Every write updates both tables transactionally.' Then address the consistency concern proactively. This demonstrates that you think about operational correctness, not just the happy path.