DynamoDB Single-Table Design: A Comprehensive Modeling Guide

Master DynamoDB single-table design with practical patterns for modeling relationships, choosing between GSI and LSI, optimizing with DAX, and avoiding common pitfalls in production NoSQL systems.

Abstract

Single-table design represents a fundamental shift in how we model data for DynamoDB. This comprehensive guide explores when to use single-table patterns, how to model one-to-one, one-to-many, and many-to-many relationships, the trade-offs between Global and Local Secondary Indexes, DAX caching integration, and practical query optimization techniques. You'll find working TypeScript examples, real-world cost analyses, and battle-tested patterns for avoiding hot partitions and throttling issues.

Why Single-Table Design Matters

Working with DynamoDB taught me that thinking in terms of relational tables causes more problems than it solves. The typical approach - creating separate tables for Users, Orders, Products - leads to multiple round-trips, complex application logic, and unpredictable costs.

Single-table design stores multiple entity types in one table using generic partition and sort keys. Instead of fetching a customer from one table and their orders from another, you retrieve everything in a single request. This isn't just about performance; it fundamentally changes how you approach data modeling.

Core Principles:

  1. Access patterns first: Document every query before designing the schema
  2. Data locality: Store related data together using the same partition key
  3. Generic keys: Use PK and SK instead of entity-specific names
  4. Item collections: Group related items with shared partition keys
  5. Attribute overloading: Same attributes serve different purposes across entity types

When to Use Single-Table Design

Single-table design excels when you need to retrieve related data together. Here's what I've learned about when it works well:

Good Use Cases:

  • E-commerce systems where you fetch customers with their orders
  • Social platforms retrieving posts with comments and likes
  • Multi-tenant SaaS applications with tenant isolation
  • Content management systems with hierarchical data
  • IoT platforms collecting sensor readings by device

When to Avoid:

Working with teams has shown me that single-table design isn't always the right choice:

  • Team lacks DynamoDB expertise (the learning curve is real)
  • Simple CRUD applications with minimal relationships
  • Ad-hoc reporting and data warehouse scenarios
  • Strong consistency required across all entity types
  • Different entities have completely different access patterns

Rick Houlihan's 2024 update emphasizes: "What is accessed together should be stored together." Don't force unrelated data into a single table just to follow a pattern.

Partition Key and Sort Key Strategies

The foundation of single-table design lies in understanding how to structure your keys.

Partition Key Patterns

typescript
// Entity type prefix - most common pattern
PK: "CUSTOMER#123"
PK: "ORDER#456"
PK: "PRODUCT#789"

// Composite partition key for multi-tenant
PK: "TENANT#acme#USER#123"

// High-cardinality user-specific key (template literal, so userId is interpolated)
PK: `USER#${userId}` // Each user gets a unique partition key

Anti-Pattern to Avoid:

typescript
// Low-cardinality status key - creates hot partitions
PK: "STATUS#active" // All active users in ONE partition
// This will throttle when you hit the 3,000 RCU limit per partition

Sort Key Patterns

Sort keys enable range queries and hierarchical organization:

typescript
// Hierarchical sort key - enables prefix queries
SK: "US#CA#SanFrancisco#94102"
// Query: begins_with(SK, "US#CA") returns all California items

// Timestamp-based chronological ordering
SK: "2024-01-15T10:30:00#EVENT#123"

// Version control pattern (v0 always holds a copy of the latest version)
SK: "v0_item123" // Current version
SK: "v1_item123" // First revision
SK: "v2_item123" // Second revision (higher number = newer)

// Composite for relationships
SK: "ORDERITEM#PRODUCT#789#2024-01-15"

Best Practices:

  • Use high-cardinality partition keys (userId, orderId, productId)
  • Design sort keys to support range queries with begins_with and BETWEEN
  • Include timestamps for chronological ordering
  • Maintain consistent prefixes across entity types
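
The key conventions above can be wrapped in a few small builder functions so that prefixes stay consistent across the codebase. This is a minimal sketch: the helper names (`pk`, `orderSk`, `orderDateRange`) are hypothetical, and it assumes ISO-8601 dates and IDs that do not contain `#`.

```typescript
// Hypothetical helpers; the prefixes mirror the examples in this section.
type EntityPrefix = 'CUSTOMER' | 'ORDER' | 'PRODUCT' | 'USER';

const pk = (prefix: EntityPrefix, id: string): string => `${prefix}#${id}`;

// ISO-8601 dates sort lexicographically, so a date-first sort key
// comes back from a Query in chronological order.
const orderSk = (orderDate: string, orderId: string): string =>
  `ORDER#${orderDate}#${orderId}`;

// Inclusive bounds for 'SK BETWEEN :lo AND :hi'.
// '$' is the ASCII character right after '#', so `ORDER#<to>$` sorts
// after every key that starts with `ORDER#<to>#`.
const orderDateRange = (from: string, to: string) => ({
  lo: `ORDER#${from}`,
  hi: `ORDER#${to}$`,
});
```

The returned `lo` and `hi` values would be passed as `:lo` and `:hi` in `ExpressionAttributeValues` alongside the partition key condition.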

Modeling Relationships

Let me show you how to model each relationship type with working examples.

One-to-One Relationships

Store related data in the same item collection with different sort keys:

typescript
interface User {
  PK: string;
  SK: string;
  EntityType: "User";
  email: string;
  name: string;
}

interface UserPreferences {
  PK: string;
  SK: string;
  EntityType: "UserPreferences";
  theme: "dark" | "light";
  language: string;
}

// Stored as:
{
  PK: "USER#123",
  SK: "METADATA",
  EntityType: "User",
  email: "john.doe@example.com",
  name: "John Doe"
}
{
  PK: "USER#123",
  SK: "PREFERENCES",
  EntityType: "UserPreferences",
  theme: "dark",
  language: "en"
}

// Single query retrieves both
const params = {
  TableName: 'MainTable',
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': 'USER#123'
  }
};

One-to-Many Relationships

Item collections make one-to-many relationships straightforward:

typescript
interface Customer {
  PK: string;
  SK: string;
  EntityType: "Customer";
  name: string;
  email: string;
}

interface Order {
  PK: string; // Same as customer PK
  SK: string; // Chronological sort key
  EntityType: "Order";
  orderId: string;
  total: number;
  status: string;
  orderDate: string;
}

// Customer
{
  PK: "CUSTOMER#123",
  SK: "METADATA",
  EntityType: "Customer",
  name: "John Doe",
  email: "john.doe@example.com"
}

// Orders for this customer
{
  PK: "CUSTOMER#123",
  SK: "ORDER#2024-01-15#456",
  EntityType: "Order",
  orderId: "456",
  total: 99.99,
  status: "delivered",
  orderDate: "2024-01-15"
}
{
  PK: "CUSTOMER#123",
  SK: "ORDER#2024-01-20#457",
  EntityType: "Order",
  orderId: "457",
  total: 149.99,
  status: "pending",
  orderDate: "2024-01-20"
}

// Get customer and ALL orders in one query
const getCustomerWithOrders = async (customerId: string) => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: {
      ':pk': `CUSTOMER#${customerId}`
    }
  });

  return {
    customer: result.Items?.find(item => item.SK === 'METADATA'),
    orders: result.Items?.filter(item => item.SK.startsWith('ORDER#'))
  };
};

// Get only pending orders
const getPendingOrders = async (customerId: string) => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    FilterExpression: '#status = :status',
    ExpressionAttributeNames: {
      '#status': 'status'
    },
    ExpressionAttributeValues: {
      ':pk': `CUSTOMER#${customerId}`,
      ':sk': 'ORDER#',
      ':status': 'pending'
    }
  });

  return result.Items;
};

Many-to-Many Relationships

Use the adjacency list pattern to model many-to-many relationships:

typescript
interface ProductCategory {
  PK: string;
  SK: string;
  // Forward items use "ProductCategory", reverse items use "CategoryProduct"
  EntityType: "ProductCategory" | "CategoryProduct";
  productName?: string;
  categoryName?: string;
}

// Product belongs to multiple categories
// Forward relationships
{
  PK: "PRODUCT#789",
  SK: "CATEGORY#Electronics",
  EntityType: "ProductCategory",
  categoryName: "Electronics"
}
{
  PK: "PRODUCT#789",
  SK: "CATEGORY#Gadgets",
  EntityType: "ProductCategory",
  categoryName: "Gadgets"
}

// Reverse relationships (write both for bidirectional queries)
{
  PK: "CATEGORY#Electronics",
  SK: "PRODUCT#789",
  EntityType: "CategoryProduct",
  productName: "Wireless Headphones"
}
{
  PK: "CATEGORY#Gadgets",
  SK: "PRODUCT#789",
  EntityType: "CategoryProduct",
  productName: "Wireless Headphones"
}

// Query all categories for a product
const getProductCategories = async (productId: string) => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `PRODUCT#${productId}`,
      ':sk': 'CATEGORY#'
    }
  });

  return result.Items;
};

// Query all products in a category
const getCategoryProducts = async (categoryName: string) => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `CATEGORY#${categoryName}`,
      ':sk': 'PRODUCT#'
    }
  });

  return result.Items;
};

Trade-off: Writing both directions doubles write operations but enables efficient queries in both directions without Scan operations.
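
Because the two directions must stay in sync, it can help to generate both items from one function and write them together. The sketch below is one way to do that; `buildLinkItems` and `toTransactItems` are hypothetical names, and the output is shaped like the `TransactItems` array of a DynamoDB transactional write so that both items succeed or fail together.

```typescript
// Hypothetical helper: builds both sides of one product<->category link.
interface LinkItem {
  PK: string;
  SK: string;
  EntityType: 'ProductCategory' | 'CategoryProduct';
  productName?: string;
  categoryName?: string;
}

const buildLinkItems = (
  productId: string,
  productName: string,
  categoryName: string
): [LinkItem, LinkItem] => [
  // Forward: product -> category
  {
    PK: `PRODUCT#${productId}`,
    SK: `CATEGORY#${categoryName}`,
    EntityType: 'ProductCategory',
    categoryName,
  },
  // Reverse: category -> product
  {
    PK: `CATEGORY#${categoryName}`,
    SK: `PRODUCT#${productId}`,
    EntityType: 'CategoryProduct',
    productName,
  },
];

// Wraps items as Put operations for a transactional write request.
const toTransactItems = (tableName: string, items: LinkItem[]) =>
  items.map(Item => ({ Put: { TableName: tableName, Item } }));
```

Removing a link works the same way in reverse: two Delete operations, one per direction, in a single transaction.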

Denormalization Pattern

Sometimes you need frequently accessed data without extra queries:

typescript
interface Order {
  PK: string;
  SK: string;
  EntityType: "Order";
  orderId: string;
  customerId: string;
  // Denormalized customer data
  customerName: string;
  customerEmail: string;
  total: number;
}

{
  PK: "ORDER#456",
  SK: "METADATA",
  EntityType: "Order",
  orderId: "456",
  customerId: "123",
  customerName: "John Doe",              // Copied from customer record
  customerEmail: "john.doe@example.com", // Copied from customer record
  total: 99.99
}

Trade-off Analysis:

  • Faster reads: No need to fetch customer details separately
  • More complex writes: Updating a customer's name means updating every one of their orders
  • Storage overhead: Customer data is duplicated across orders
  • Stale copies: Until a background job propagates the change, orders can show outdated customer data

Use denormalization when read frequency significantly exceeds write frequency and data rarely changes.

GSI vs LSI: Making the Right Choice

Understanding the differences between Global Secondary Indexes and Local Secondary Indexes is critical for efficient access patterns.

Local Secondary Index (LSI)

LSI shares the partition key with the base table but uses a different sort key:

typescript
// Table structure
interface Order {
  PK: string;              // Partition key
  SK: string;              // Sort key (order date)
  orderId: string;
  status: string;
  total: number;
}

// Base table queries by date
{
  PK: "CUSTOMER#123",
  SK: "ORDER#2024-01-15#456",
  orderId: "456",
  status: "delivered",
  total: 99.99
}

// LSI enables querying by status with strong consistency
const lsiDefinition = {
  IndexName: "LSI-Status",
  KeySchema: [
    { AttributeName: "PK", KeyType: "HASH" },     // Same as table
    { AttributeName: "status", KeyType: "RANGE" } // Different sort key
  ],
  Projection: {
    ProjectionType: "ALL"
  }
};

// Query customer's pending orders with strong consistency
const params = {
  TableName: 'Orders',
  IndexName: 'LSI-Status',
  KeyConditionExpression: 'PK = :pk AND #status = :status',
  ExpressionAttributeNames: {
    '#status': 'status'
  },
  ExpressionAttributeValues: {
    ':pk': 'CUSTOMER#123',
    ':status': 'pending'
  },
  ConsistentRead: true // Only possible with LSI
};

LSI Characteristics:

  • Must be defined at table creation (cannot add later)
  • Shares partition key with base table
  • Supports strongly consistent reads
  • Shares throughput capacity with base table
  • 10GB limit per partition key value
  • Maximum 5 LSIs per table
  • No additional capacity planning needed

Global Secondary Index (GSI)

GSI uses different partition and sort keys, enabling completely new access patterns:

typescript
// Table structure
interface Order {
  PK: string;
  SK: string;
  EntityType: string;
  orderId: string;
  orderDate: string;
  status: string;
  GSI1PK: string;  // For date-based queries
  GSI1SK: string;
}

{
  PK: "CUSTOMER#123",
  SK: "ORDER#456",
  EntityType: "Order",
  orderId: "456",
  orderDate: "2024-01-15",
  status: "delivered",
  GSI1PK: "2024-01-15",        // Group by date
  GSI1SK: "ORDER#456"
}

// GSI Definition
const gsiDefinition = {
  IndexName: "GSI1",
  KeySchema: [
    { AttributeName: "GSI1PK", KeyType: "HASH" },
    { AttributeName: "GSI1SK", KeyType: "RANGE" }
  ],
  Projection: {
    ProjectionType: "INCLUDE",
    NonKeyAttributes: ["orderId", "status", "total"]
  }
};

// Query ALL orders by date (across all customers)
const getOrdersByDate = async (date: string) => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :date',
    ExpressionAttributeValues: {
      ':date': date
    }
  });

  return result.Items;
};

GSI Characteristics:

  • Can be added or removed after table creation
  • Different partition and sort keys from base table
  • Eventually consistent only (no strong consistency)
  • Independent throughput in provisioned mode
  • No size limits
  • Maximum 20 GSIs per table (increased from 5)
  • Enables cross-partition queries

Decision Matrix

| Requirement | LSI | GSI |
|---|---|---|
| Strong consistency needed | Yes | No |
| Different partition key needed | No | Yes |
| Create after table exists | No | Yes |
| Same partition, different sort order | Yes | Yes |
| Independent capacity planning | No | Yes (provisioned) |
| Cross-partition queries | No | Yes |
| Item collection larger than 10GB per partition key | No | Yes |

When to Use LSI:

  • Strong consistency is required
  • Querying same partition with alternative sort order
  • Small datasets (< 10GB per partition)
  • Access patterns known at table creation time

When to Use GSI:

  • Need different partition key for access pattern
  • Cross-partition queries required
  • Adding new access patterns to existing table
  • Eventually consistent reads are acceptable
  • Large datasets

Sparse Index Pattern

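A GSI contains only the items that actually have the index's key attributes. Leaving those attributes off the items you never query through the index keeps it small and reduces storage costs: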

typescript
interface User {
  PK: string;
  SK: string;
  EntityType: "User";
  status: "active" | "inactive";
  GSI1PK?: string; // Only set for active users
  GSI1SK?: string;
}

// Active user - indexed
{
  PK: "USER#123",
  SK: "METADATA",
  EntityType: "User",
  status: "active",
  email: "user123@example.com",
  GSI1PK: "ACTIVE_USERS",    // Included in GSI
  GSI1SK: "USER#123"
}

// Inactive user - NOT indexed
{
  PK: "USER#456",
  SK: "METADATA",
  EntityType: "User",
  status: "inactive",
  email: "user456@example.com"
  // No GSI1PK/GSI1SK - not in index, saves storage
}

// Query only active users
const getActiveUsers = async () => {
  const result = await dynamodb.query({
    TableName: 'MainTable',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :type',
    ExpressionAttributeValues: {
      ':type': 'ACTIVE_USERS'
    }
  });

  return result.Items;
};

Cost Savings: If only 10% of users are active, sparse indexing reduces GSI storage by 90%.
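
A sparse index only stays sparse if the index attributes are removed when an item stops qualifying. The sketch below builds the update parameters for a status change; `buildStatusUpdate` is a hypothetical helper, and the attribute names follow the example above. It relies on the fact that a single `UpdateExpression` can combine `SET` and `REMOVE` clauses.

```typescript
type UserStatus = 'active' | 'inactive';

interface StatusUpdate {
  UpdateExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>;
}

// Hypothetical helper: activating a user adds the GSI keys,
// deactivating removes them so the item drops out of the index.
const buildStatusUpdate = (userId: string, status: UserStatus): StatusUpdate =>
  status === 'active'
    ? {
        UpdateExpression: 'SET #status = :status, GSI1PK = :gpk, GSI1SK = :gsk',
        ExpressionAttributeNames: { '#status': 'status' },
        ExpressionAttributeValues: {
          ':status': 'active',
          ':gpk': 'ACTIVE_USERS',
          ':gsk': `USER#${userId}`,
        },
      }
    : {
        UpdateExpression: 'SET #status = :status REMOVE GSI1PK, GSI1SK',
        ExpressionAttributeNames: { '#status': 'status' },
        ExpressionAttributeValues: { ':status': 'inactive' },
      };
```

The result would be merged with `TableName` and `Key` and passed to an update call.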

DynamoDB Accelerator (DAX) Integration

DAX provides microsecond response times for read-heavy workloads through in-memory caching.

When to Use DAX

typescript
// Scenario 1: Read-heavy workload
// E-commerce product catalog
// - 95% reads, 5% writes
// - Expected cache hit rate: 90%+
// - Benefit: 10x latency improvement + reduced costs

// Scenario 2: Hot key pattern
// Flash sale - single product receives 10,000 reads/second
// Without DAX: Potential throttling, high costs
// With DAX: Offloads reads from DynamoDB

// Scenario 3: Repeated reads
// Analytics dashboard querying same data repeatedly
// DAX caches entire working set in memory

When NOT to Use DAX

typescript
// Anti-pattern 1: Write-heavy workload
// Real-time analytics with frequent updates
// DAX adds overhead without benefit

// Anti-pattern 2: Strong consistency required
// Financial transactions needing immediate consistency
// DAX only caches eventually consistent reads; strongly consistent
// reads pass through to DynamoDB and gain nothing from the cache

// Anti-pattern 3: Low cache hit rate
// Ad-hoc queries with random access patterns
// Cache hit rate < 50% means poor ROI

// Anti-pattern 4: Low traffic
// Application with < 100 requests/second
// DAX cost exceeds savings

DAX Implementation

typescript
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
// NOTE: 'amazon-dax-client' targets AWS SDK v2 and cannot be wrapped with
// DynamoDBDocument.from(). With SDK v3, the DAX client is published as
// '@amazon-dax-sdk/lib-dax'; verify the package and version for your setup.
import { DaxDocument } from '@amazon-dax-sdk/lib-dax';

// Without DAX - direct DynamoDB
const directClient = new DynamoDB({
  region: 'us-east-1'
});
const dynamodb = DynamoDBDocument.from(directClient);

// With DAX
const dax = new DaxDocument({
  endpoints: ['my-cluster.dax.us-east-1.amazonaws.com:8111'],
  region: 'us-east-1'
});

// Same document-style API for reads
const getProduct = async (productId: string) => {
  const params = {
    TableName: 'Products',
    Key: {
      PK: `PRODUCT#${productId}`,
      SK: 'METADATA'
    }
  };

  // First call: DynamoDB query (5ms)
  // Subsequent calls: DAX cache (500μs)
  const result = await dax.get(params);
  return result.Item;
};

// Query operations also cached
const getProductReviews = async (productId: string) => {
  const params = {
    TableName: 'Products',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `PRODUCT#${productId}`,
      ':sk': 'REVIEW#'
    }
  };

  const result = await dax.query(params);
  return result.Items;
};

DAX Performance Metrics

typescript
const benchmarks = {
  dynamodb: {
    getItem: '3-5ms',
    query: '5-10ms',
    cost: '$0.155 per million reads'  // Updated Nov 2024
  },
  dax: {
    getItem: '200-500μs (cache hit)',
    query: '300-700μs (cache hit)',
    cacheMiss: '4-6ms (DynamoDB + overhead)',
    cost: '$0.11/hour per t3.small node'
  }
};

Cost Analysis

typescript
// Scenario: 1,000 req/sec with 95% cache hit rate

// Without DAX:
// - Requests: 2.59B/month
// - DynamoDB cost: $401/month (on-demand at $0.155/million)

// With DAX (3-node t3.medium cluster):
// - DAX cluster: $482/month
// - Cache hits (95%): 2.46B reads (served by DAX)
// - Cache misses (5%): 130M reads from DynamoDB
// - DynamoDB cost: $20/month
// - Total: $502/month
// - Net: about $100/month MORE than without DAX at this scale;
//   the benefit here is the ~10x latency improvement

// Break-even point with these prices: roughly 1,300 req/sec at a 90-95% cache hit rate
// Primary value: Latency reduction, not just cost savings
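
The arithmetic behind this comparison can be written down directly. The sketch below uses the article's own figures ($482/month for the cluster, $0.155 per million reads) as inputs and assumes a 30-day month with one read unit per request; the function names are hypothetical, and real prices vary by region and node type.

```typescript
const SECONDS_PER_MONTH = 30 * 24 * 3600; // 2,592,000

// Monthly on-demand read cost for a steady request rate.
const monthlyReadCost = (reqPerSec: number, pricePerMillion: number): number =>
  ((reqPerSec * SECONDS_PER_MONTH) / 1e6) * pricePerMillion;

// Request rate at which the reads DAX absorbs pay for the cluster.
const daxBreakEvenReqPerSec = (
  clusterCostPerMonth: number,
  cacheHitRate: number,
  pricePerMillion: number
): number =>
  clusterCostPerMonth / (cacheHitRate * (SECONDS_PER_MONTH / 1e6) * pricePerMillion);

console.log(monthlyReadCost(1000, 0.155).toFixed(2));            // 401.76
console.log(Math.round(daxBreakEvenReqPerSec(482, 0.95, 0.155))); // 1263
```

Under these assumptions the cluster only pays for itself in read costs above roughly 1,260 requests per second, so at lower volumes the case for DAX rests on latency rather than cost.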

Query Optimization Techniques

The difference between Query and Scan operations dramatically impacts performance and cost.

Query vs Scan

typescript
// ANTI-PATTERN: Scan operation
const findUserByEmail = async (email: string) => {
  const result = await dynamodb.scan({
    TableName: 'Users',
    FilterExpression: 'email = :email',
    ExpressionAttributeValues: {
      ':email': email
    }
  });

  return result.Items?.[0];
};
// - Reads the entire table; the filter is applied only after the read
// - 1M items = ~1M RCUs consumed (assuming ~4KB items, strongly consistent)
// - Latency: 5-60 seconds
// - Cost: ~$0.155 per scan (1M read units at $0.155/million)

// BEST PRACTICE: Query with GSI
const findUserByEmailOptimized = async (email: string) => {
  const result = await dynamodb.query({
    TableName: 'Users',
    IndexName: 'GSI-Email',
    KeyConditionExpression: 'GSI1PK = :email',
    ExpressionAttributeValues: {
      ':email': `EMAIL#${email}`
    }
  });

  return result.Items?.[0];
};
// - Direct partition access
// - 1 item = 0.5 RCU (eventually consistent)
// - Latency: 5-10ms
// - Cost: ~$0.00000008 per query (about 2,000,000x cheaper)

Real-World Impact

Working on systems with millions of users taught me the cost difference is dramatic:

typescript
// Finding user by email in 1M user table

// Scan approach:
// - Read capacity: 1,000,000 RCUs
// - Time: 45 seconds (with pagination)
// - Cost per query: ~$0.155

// Query with GSI:
// - Read capacity: 0.5 RCU
// - Time: 8ms
// - Cost per query: ~$0.00000008

// Result: ~2,000,000x cost reduction, ~5,000x faster
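
The read-capacity numbers follow from how DynamoDB meters Query and Scan: the total bytes read are rounded up to the next 4KB, and an eventually consistent read costs half. A minimal estimator is sketched below; `readCapacityUnits` is a hypothetical name and the 1KB item size is an illustrative assumption. The 1,000,000 RCU figure above corresponds to items near 4KB read with strong consistency.

```typescript
const READ_UNIT_BYTES = 4096;

// Estimated RCUs for one Query/Scan page that reads `bytesRead` bytes.
const readCapacityUnits = (bytesRead: number, stronglyConsistent = false): number => {
  const units = Math.ceil(bytesRead / READ_UNIT_BYTES);
  return stronglyConsistent ? units : units / 2;
};

const ITEM_BYTES = 1024; // assumed average item size
console.log(readCapacityUnits(1_000_000 * ITEM_BYTES)); // 125000 (scan of 1M items)
console.log(readCapacityUnits(ITEM_BYTES));             // 0.5 (query returning one item)
```

Smaller items narrow the gap in absolute terms, but the scan still costs in proportion to table size while the query costs in proportion to the result.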

Projection Optimization

typescript
// ANTI-PATTERN: Project all attributes
const gsiAll = {
  IndexName: 'GSI1',
  Projection: {
    ProjectionType: 'ALL' // Duplicates entire item
  }
};
// Storage: 2x table size (table + full GSI copy)
// Cost: High

// BEST PRACTICE: Project only needed attributes
const gsiInclude = {
  IndexName: 'GSI1',
  Projection: {
    ProjectionType: 'INCLUDE',
    NonKeyAttributes: ['name', 'email', 'status']
  }
};
// Storage: Minimal (keys + specified attributes)
// Cost: Optimized

// OPTIMAL: Keys only when fetching full item anyway
const gsiKeys = {
  IndexName: 'GSI1',
  Projection: {
    ProjectionType: 'KEYS_ONLY'
  }
};
// Use pattern: Query GSI for IDs, then BatchGetItem for details

Batch Operations

typescript
// ANTI-PATTERN: Sequential GetItem calls
const getUsersSequential = async (userIds: string[]) => {
  const users = [];

  for (const userId of userIds) {
    const result = await dynamodb.get({
      TableName: 'Users',
      Key: {
        PK: `USER#${userId}`,
        SK: 'METADATA'
      }
    });
    users.push(result.Item);
  }

  return users;
};
// 100 users = 100 round trips = 500-1000ms

// BEST PRACTICE: BatchGetItem
import { BatchGetCommand } from '@aws-sdk/lib-dynamodb';

const getUsersBatch = async (userIds: string[]) => {
  const command = new BatchGetCommand({
    RequestItems: {
      Users: {
        Keys: userIds.map(id => ({
          PK: `USER#${id}`,
          SK: 'METADATA'
        }))
      }
    }
  });

  const result = await dynamodb.send(command);
  // In production, also retry result.UnprocessedKeys: a batch can return partial results
  return result.Responses?.Users || [];
};
// 100 users = 1 request (up to 100 items) = 50-100ms
// 5-10x faster
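
Since a single BatchGetItem request accepts at most 100 keys, longer ID lists need to be split first. The sketch below shows the splitting step only; `chunk` and `toKeyBatches` are hypothetical helper names, and each resulting batch would be sent as its own request.

```typescript
const BATCH_GET_LIMIT = 100; // maximum keys per BatchGetItem request

// Splits an array into consecutive slices of at most `size` elements.
const chunk = <T>(items: T[], size: number): T[][] => {
  if (size < 1) throw new RangeError('size must be >= 1');
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
};

// Builds the key batches for a list of user IDs.
const toKeyBatches = (userIds: string[]) =>
  chunk(
    userIds.map(id => ({ PK: `USER#${id}`, SK: 'METADATA' })),
    BATCH_GET_LIMIT
  );
```

For 250 user IDs this yields three batches of 100, 100, and 50 keys.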

Composite Sort Key for Filtering

typescript
// ANTI-PATTERN: Query + FilterExpression
const getPendingOrdersWrong = async (customerId: string) => {
  const result = await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'PK = :pk',
    FilterExpression: '#status = :status',
    ExpressionAttributeNames: {
      '#status': 'status'
    },
    ExpressionAttributeValues: {
      ':pk': `CUSTOMER#${customerId}`,
      ':status': 'pending'
    }
  });

  return result.Items;
};
// Reads ALL orders, filters after
// Consumes RCUs for all orders, returns only pending

// BEST PRACTICE: Composite sort key
// SK format: "STATUS#pending#ORDER#456"
const getPendingOrdersOptimized = async (customerId: string) => {
  const result = await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :status)',
    ExpressionAttributeValues: {
      ':pk': `CUSTOMER#${customerId}`,
      ':status': 'STATUS#pending'
    }
  });

  return result.Items;
};
// Reads ONLY pending orders
// Consumes RCUs only for matching items
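
One consequence of putting status into the sort key is that a status change can no longer be a plain update, because primary key attributes cannot be modified in place. The sketch below builds the delete-plus-put pair for such a transition, shaped like the operations of a transactional write; the function and type names are hypothetical.

```typescript
type OrderStatus = 'pending' | 'shipped' | 'delivered';

interface DeleteOp {
  Delete: { TableName: string; Key: { PK: string; SK: string } };
}
interface PutOp {
  Put: { TableName: string; Item: Record<string, unknown> };
}

const statusSk = (status: OrderStatus, orderId: string): string =>
  `STATUS#${status}#ORDER#${orderId}`;

// Moves an order from one status to another by replacing the item
// under a new sort key. Both operations should run in one transaction.
const buildStatusChange = (
  tableName: string,
  customerId: string,
  orderId: string,
  from: OrderStatus,
  to: OrderStatus,
  attributes: Record<string, unknown>
): [DeleteOp, PutOp] => {
  const PK = `CUSTOMER#${customerId}`;
  return [
    { Delete: { TableName: tableName, Key: { PK, SK: statusSk(from, orderId) } } },
    {
      Put: {
        TableName: tableName,
        Item: { ...attributes, PK, SK: statusSk(to, orderId), status: to },
      },
    },
  ];
};
```

This extra write cost is the price of the cheaper read, so the pattern fits best when status changes are rare relative to status queries.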

Preventing Hot Partitions

Each DynamoDB partition supports 3,000 RCUs and 1,000 WCUs. Exceeding these limits causes throttling.

Hot Partition Scenarios

typescript
// ANTI-PATTERN 1: Low-cardinality partition key
{
  PK: "STATUS#active",  // Only 2-3 unique values
  SK: "USER#123"
}
// All active users in ONE partition
// Easily exceeds 3,000 RCU limit

// ANTI-PATTERN 2: Celebrity problem
{
  PK: "USER#celebrity",
  SK: "FOLLOWER#456"
}
// Millions of followers in one partition
// Exceeds 10GB and throughput limits

// ANTI-PATTERN 3: Time-based key without sharding
{
  PK: "2024-01-15",  // All today's data
  SK: "EVENT#123"
}
// Hot partition during peak hours

Prevention Strategy 1: Write Sharding

typescript
// Add random suffix to distribute writes
const SHARD_COUNT = 10;

const writeWithSharding = async (userId: string, data: any) => {
  const shardId = Math.floor(Math.random() * SHARD_COUNT);

  await dynamodb.put({
    TableName: 'Users',
    Item: {
      PK: `STATUS#active#${shardId}`, // Distributes across 10 partitions
      SK: `USER#${userId}`,
      ...data
    }
  });
};

// Reading requires querying all shards
const getActiveUsers = async () => {
  const promises = [];

  for (let i = 0; i < SHARD_COUNT; i++) {
    promises.push(
      dynamodb.query({
        TableName: 'Users',
        KeyConditionExpression: 'PK = :pk',
        ExpressionAttributeValues: {
          ':pk': `STATUS#active#${i}`
        }
      })
    );
  }

  const results = await Promise.all(promises);
  return results.flatMap(r => r.Items || []);
};

// Distributes writes across 10 partitions
// Throughput: 10,000 WCUs (1,000 * 10)

Prevention Strategy 2: Composite High-Cardinality Keys

typescript
// Combine low-cardinality with high-cardinality
{
  PK: `STATUS#active#USER#${userId}`, // Unique per user
  SK: "METADATA"
}

// Or use GSI for low-cardinality queries
{
  PK: `USER#${userId}`,
  SK: "METADATA",
  status: "active",
  GSI1PK: "STATUS#active",  // GSI for status queries
  GSI1SK: `USER#${userId}`
}

Prevention Strategy 3: Deterministic Sharding

typescript
import crypto from 'crypto';

const getShardId = (entityId: string, shardCount: number): number => {
  const hash = crypto.createHash('md5').update(entityId).digest('hex');
  return parseInt(hash.substring(0, 8), 16) % shardCount;
};

const shardId = getShardId(userId, 10);
const partitionKey = `USERS#${shardId}`;

This approach ensures the same entity always lands in the same shard, so a lookup by entity ID can recompute the shard and read from that one partition. Listing every entity still requires querying all shards.
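
The two read paths can be sketched without any dependencies. Note that this version swaps the MD5 hash for a small FNV-1a hash over UTF-16 code units purely to keep the example self-contained; that substitution, and the helper names, are assumptions for illustration, and any stable hash serves the same purpose.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units (fine for ASCII IDs).
const fnv1a = (input: string): number => {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash >>> 0;
};

// Same entity ID always maps to the same shard.
const shardFor = (entityId: string, shardCount: number): number =>
  fnv1a(entityId) % shardCount;

// Point lookup: one partition key, computed from the ID.
const shardedPk = (prefix: string, entityId: string, shardCount: number): string =>
  `${prefix}#${shardFor(entityId, shardCount)}`;

// Listing: every shard's partition key, one query each.
const allShardPks = (prefix: string, shardCount: number): string[] =>
  Array.from({ length: shardCount }, (_, i) => `${prefix}#${i}`);
```

Changing the shard count later remaps most entities, so the count is best chosen with headroom up front.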

Cost Optimization Strategies

Provisioned vs On-Demand

typescript
const pricing = {
  provisioned: {
    rcu: '$0.00013 per hour',
    wcu: '$0.00065 per hour',
    storage: '$0.25 per GB-month'
  },
  onDemand: {
    readRequest: '$0.155 per million reads',   // Updated Nov 2024
    writeRequest: '$0.78 per million writes',  // Updated Nov 2024
    storage: '$0.25 per GB-month'
  }
};

// Example: 100M reads/month steady traffic
// Provisioned: ~38.5 RCU * $0.00013 * 730 hours = $3.65/month
// On-Demand: 100M / 1M * $0.155 = $15.50/month
// Difference: 4.25x more expensive with on-demand

When to Use Provisioned:

  • Predictable traffic patterns
  • High volume (>1M requests/day)
  • 24/7 production applications
  • Budget-conscious scenarios
  • Can commit to reserved capacity (54% 1-year or 77% 3-year savings)

When to Use On-Demand:

  • Unpredictable traffic
  • New applications with unknown load
  • Development/testing environments
  • Spiky workloads (10x variance)
  • Small-scale applications (less than 1M requests/day)
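
To compare the two modes for a given workload, the pricing above can be turned into a pair of small functions. This is a rough sketch: the prices are the article's figures and vary by region, the function names are hypothetical, and the provisioned side assumes capacity is fully utilized with no auto-scaling headroom.

```typescript
interface Prices {
  rcuHour: number;
  wcuHour: number;
  readPerMillion: number;
  writePerMillion: number;
}

// Figures quoted in this article; check current regional pricing before relying on them.
const ARTICLE_PRICES: Prices = {
  rcuHour: 0.00013,
  wcuHour: 0.00065,
  readPerMillion: 0.155,
  writePerMillion: 0.78,
};

const HOURS_PER_MONTH = 730;

const provisionedMonthly = (rcu: number, wcu: number, p: Prices = ARTICLE_PRICES): number =>
  (rcu * p.rcuHour + wcu * p.wcuHour) * HOURS_PER_MONTH;

const onDemandMonthly = (reads: number, writes: number, p: Prices = ARTICLE_PRICES): number =>
  (reads / 1e6) * p.readPerMillion + (writes / 1e6) * p.writePerMillion;

console.log(provisionedMonthly(38.5, 0).toFixed(2)); // 3.65
console.log(onDemandMonthly(100e6, 0).toFixed(2));   // 15.50
```

Spiky traffic erodes the provisioned advantage, because capacity must be sized for the peak while on-demand bills only for requests actually made.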

Sparse Index Savings

typescript
// Without sparse index: All 1M users indexed
// Table: 10GB
// GSI with ALL projection: +10GB
// Total: 20GB * $0.25 = $5/month storage

// With sparse index: Only 100K active users indexed
// Table: 10GB
// GSI with sparse index: +1GB (10% of users)
// Total: 11GB * $0.25 = $2.75/month storage
// Savings: 45%

Single-Table Cost Benefits

typescript
// Multi-table approach:
// - Users: 5 RCU, 5 WCU
// - Orders: 10 RCU, 10 WCU
// - Products: 15 RCU, 5 WCU
// - OrderItems: 20 RCU, 20 WCU
// Total: 50 RCU, 40 WCU = ~$23.73/month
//   (50 * $0.00013 + 40 * $0.00065) * 730 hours

// Single-table approach:
// - MainTable: 30 RCU, 25 WCU
// Total: ~$14.71/month
// Savings: ~38% + simplified management

Common Pitfalls and Solutions

Pitfall 1: Not Documenting Access Patterns First

Experience has shown that designing tables before understanding queries leads to redesigns:

typescript
// WRONG approach:
// 1. Create tables based on entities
// 2. Add GSIs when queries don't work
// 3. End up with 5+ GSIs, still inefficient

// RIGHT approach:
const accessPatterns = [
  'Get customer and all their orders',
  'Get order and all its items',
  'Get all orders by date range',
  'Get customer by email',
  'Get product and all reviews',
  'Get customer reviews'
];

// Design table and indexes to support ALL patterns efficiently

Pitfall 2: Missing Error Handling for Throttling

typescript
// Production issue: No retry logic during traffic spike
// Result: 50% error rate

// Solution: Exponential backoff
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

const client = new DynamoDBClient({
  region: 'us-east-1',
  maxAttempts: 3,
  retryMode: 'adaptive' // Exponential backoff plus client-side rate limiting
});

// Better: Implement circuit breaker pattern
// Best: Design to avoid hot partitions
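
When retries need to wrap a larger unit of work than a single SDK call, the same idea can be applied at the application level. The sketch below uses capped exponential backoff with full jitter; the function names, default delays, and the list of error names treated as throttling are assumptions to adjust for your setup.

```typescript
// Random delay in [0, min(capMs, baseMs * 2^attempt)): "full jitter".
const backoffDelayMs = (
  attempt: number,
  baseMs = 50,
  capMs = 2000,
  random: () => number = Math.random
): number => Math.floor(random() * Math.min(capMs, baseMs * 2 ** attempt));

// Assumed set of throttling error names; extend as needed.
const THROTTLE_ERRORS = ['ProvisionedThroughputExceededException', 'ThrottlingException'];

const isThrottle = (err: unknown): boolean =>
  typeof err === 'object' &&
  err !== null &&
  THROTTLE_ERRORS.indexOf(String((err as { name?: unknown }).name)) !== -1;

// Retries `op` on throttling errors only; other errors propagate immediately.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(r => setTimeout(r, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (!isThrottle(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Usage would look like `await withRetry(() => dynamodb.query(params))`. Jitter matters here because many clients retrying on the same schedule tend to hit the hot partition again at the same moment.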

Pitfall 3: Ignoring Item Size Limits

typescript
// Problem: 400KB limit per item
// Learned: Storing 500 line items in one order exceeded limit

// Solution 1: Pagination pattern
{
  PK: "ORDER#456",
  SK: "ITEMS#PAGE#1",
  items: [...] // First 100 items
}
{
  PK: "ORDER#456",
  SK: "ITEMS#PAGE#2",
  items: [...] // Next 100 items
}

// Solution 2: Individual items (preferred)
{
  PK: "ORDER#456",
  SK: "ITEM#1",
  productId: "789",
  quantity: 2
}
// Each line item as separate DynamoDB item

Pitfall 4: Not Planning LSIs at Table Creation

You cannot add LSIs after table creation. This requires data migration if you need them later:

typescript
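// Plan LSIs upfront, even if not immediately needed
const tableDefinition = {
  TableName: 'Orders',
  // Every key attribute used by the table or an index must be declared here
  AttributeDefinitions: [
    { AttributeName: 'PK', AttributeType: 'S' },
    { AttributeName: 'SK', AttributeType: 'S' },
    { AttributeName: 'status', AttributeType: 'S' }
  ],
  KeySchema: [
    { AttributeName: 'PK', KeyType: 'HASH' },
    { AttributeName: 'SK', KeyType: 'RANGE' }
  ],
  LocalSecondaryIndexes: [
    {
      IndexName: 'LSI-Status',
      KeySchema: [
        { AttributeName: 'PK', KeyType: 'HASH' },
        { AttributeName: 'status', KeyType: 'RANGE' }
      ],
      Projection: { ProjectionType: 'ALL' }
    }
  ],
  BillingMode: 'PAY_PER_REQUEST'
};

// GSIs can be added later, LSIs cannot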

Pitfall 5: Incorrect Capacity Mode

typescript
// Started production with on-demand
// Traffic: Steady 50 read/sec + 20 write/sec, 24/7
// Monthly cost: ~$45/month (AWS Calculator)
// 129.6M reads + 51.84M writes

// Switched to provisioned with auto-scaling
// 100 RCU + 40 WCU (eventually consistent reads)
// Monthly cost: ~$28/month
// Savings: ~38% ($17/month)

// Lesson: Provisioned capacity more cost-effective for predictable traffic
// On-demand preferred for variable spiky workloads

When NOT to Use Single-Table Design

Single-table design isn't always the right choice. Here's when to avoid it:

Use Separate Tables When:

  • Different entity types have vastly different consistency requirements
  • Team lacks DynamoDB expertise (learning curve impacts velocity)
  • Simple CRUD with minimal relationships (overhead not justified)
  • Ad-hoc reporting needs (data warehouse patterns better suited)
  • Service boundaries in microservices (separate tables per service)
  • Different entities have no access pattern overlap

Rick Houlihan's 2024 guidance: "Don't mix configuration and operational data in single table. Don't maintain single table across service boundaries."

Key Takeaways

Here's what working with DynamoDB single-table design has taught me:

  1. Access patterns first: Document all queries before designing schema
  2. Data locality: Store related data together with same partition key
  3. Query not Scan: Always design for Query operations (orders of magnitude cheaper than Scan on large tables)
  4. GSI vs LSI: GSIs for flexibility, LSIs for strong consistency
  5. Hot partitions: Use high-cardinality partition keys, implement sharding when needed
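  6. DAX ROI: With the prices used here, cost break-even sits around 1,300 req/sec at a 90%+ cache hit rate; below that, the payoff is latency
  7. Cost optimization: Provisioned mode for steady traffic (from ~38% savings up to ~4x cheaper in the examples above)
  8. Sparse indexes: Cut index storage by indexing only the subset you query (45% total storage savings in the example above)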
  9. Projection optimization: Use INCLUDE or KEYS_ONLY instead of ALL
  10. Know limits: 400KB per item, 10GB per LSI partition, 3,000 RCU per partition

Type-Safe Implementation Libraries

When implementing single-table design patterns with TypeScript, using type-safe libraries instead of raw AWS SDK increases development velocity and prevents runtime errors. Two popular choices:

DynamoDB Toolbox

DynamoDB Toolbox is a modern TypeScript library compatible with AWS SDK v3:

  • Type Safety: Automatic TypeScript types from entity definitions
  • Schema Validation: Runtime data validation
  • Query Builder: Type-safe query and update expressions
  • Single-Table Support: Built-in support for GSI and composite key patterns
  • AWS SDK v3: Full compatibility with the latest SDK version

Instead of complex AttributeValue mappings with raw AWS SDK, you can use clean, maintainable entity definitions. Check out the DynamoDB Toolbox guide for detailed implementation examples and production best practices.

OneTable

OneTable is an alternative library specifically designed for single-table design:

  • Schema-Driven: Model definition with JSON schema
  • Migration Support: Built-in schema migration support
  • TypeScript Generation: Automatic type generation from schemas
  • Developer Experience: Minimal boilerplate, intuitive API
  • Validation: Robust validation with JSON Schema standard

OneTable provides powerful tools for schema evolution and migration needs, especially in large and complex single-table designs.

Which to Choose?

Choose DynamoDB Toolbox if:

  • You're migrating to AWS SDK v3 or starting fresh
  • You want tighter integration with AWS ecosystem
  • You'll use more AWS-native patterns
  • Detailed guide available on this site

Choose OneTable if:

  • You perform schema migrations frequently
  • You prefer JSON Schema standard
  • You want more abstraction and convention over configuration
  • You're doing rapid prototyping

Both libraries are production-ready and actively maintained. The choice depends on team preference and project requirements.

Single-table design represents a paradigm shift from relational thinking. The key is understanding your access patterns first, designing keys to support efficient queries, and choosing the right indexes for your workload. Start with simpler patterns, measure performance, and evolve your design based on actual usage.