
AWS CDK Link Shortener Part 4: Production Deployment & Optimization

Multi-environment deployment strategies, performance optimization at scale, and cost management. Production insights and lessons learned with proper monitoring and incident response patterns.

Production Deployment & Optimization

Production optimization requires more than making things fast: it demands predictable performance under any load condition. When traffic spikes unexpectedly, infrastructure that works perfectly in staging can reveal scaling bottlenecks in production.

The most common oversight? Database provisioning for steady-state traffic rather than peak loads. A DynamoDB table optimized for normal operations can become a bottleneck when traffic increases 10x during campaigns or product launches.
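To make that concrete, here is a back-of-the-envelope sizing sketch. The 10x multiplier and 20% headroom are illustrative assumptions from the scenario above, not AWS guidance:

```typescript
// Hypothetical sizing helper: provisioned read capacity needed to survive
// a traffic spike. The multipliers are illustrative assumptions.
function capacityForSpike(
  steadyStateRps: number,
  spikeMultiplier: number,
  headroom = 1.2 // 20% buffer on top of the projected peak
): number {
  return Math.ceil(steadyStateRps * spikeMultiplier * headroom);
}

// A table sized for 100 reads/sec steady state...
const steady = capacityForSpike(100, 1);
// ...needs ten times that during a campaign spike
const peak = capacityForSpike(100, 10);

console.log(`steady: ${steady} RCU, campaign peak: ${peak} RCU`);
```

If you size only for `steady`, the campaign peak throttles; auto-scaling helps, but it reacts in minutes, not seconds.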

In Parts 1-3, we built the foundation, core functionality, and security. Now let's make it bulletproof for production deployment and optimization.

Multi-Environment Deployment: Beyond Dev and Prod

Most tutorials show you dev and prod environments. In practice, you need at least four: dev, staging, pre-prod, and production. Here's why and how to build them:

```typescript
#!/usr/bin/env node
// bin/link-shortener.ts - The app entry point that got us through launch day
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { LinkShortenerStack } from '../lib/link-shortener-stack';
import { DatabaseStack } from '../lib/database-stack';
import { MonitoringStack } from '../lib/monitoring-stack';

const app = new cdk.App();

// Environment configuration that scales with your team
const environments = {
  dev: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: 'us-west-2', // Cheaper region for dev
    stage: 'dev',
    domain: 'dev-links.yourcompany.com',
    customDomain: false,
    monitoring: {
      detailedMetrics: false,
      logRetention: 7, // Days
      alerting: false,
    },
    database: {
      billingMode: 'PAY_PER_REQUEST',
      pointInTimeRecovery: false,
      backupRetention: 7,
    },
    lambda: {
      reservedConcurrency: 10, // Limit dev costs
      memorySize: 512,
      timeout: 30,
    },
  },

  staging: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: 'us-east-1',
    stage: 'staging',
    domain: 'staging-links.yourcompany.com',
    customDomain: true,
    monitoring: {
      detailedMetrics: true,
      logRetention: 14,
      alerting: true,
    },
    database: {
      billingMode: 'PAY_PER_REQUEST',
      pointInTimeRecovery: true,
      backupRetention: 14,
    },
    lambda: {
      reservedConcurrency: 50,
      memorySize: 1024,
      timeout: 30,
    },
  },

  'pre-prod': {
    account: process.env.CDK_PREPROD_ACCOUNT,
    region: 'us-east-1',
    stage: 'pre-prod',
    domain: 'pp-links.yourcompany.com',
    customDomain: true,
    monitoring: {
      detailedMetrics: true,
      logRetention: 30,
      alerting: true,
    },
    database: {
      billingMode: 'PROVISIONED', // Match production patterns
      readCapacity: 100,
      writeCapacity: 50,
      pointInTimeRecovery: true,
      backupRetention: 30,
    },
    lambda: {
      reservedConcurrency: 200,
      memorySize: 1024,
      timeout: 30,
    },
  },

  production: {
    account: process.env.CDK_PROD_ACCOUNT,
    region: 'us-east-1',
    stage: 'prod',
    domain: 'go.yourcompany.com',
    customDomain: true,
    monitoring: {
      detailedMetrics: true,
      logRetention: 90,
      alerting: true,
      dashboard: true,
    },
    database: {
      billingMode: 'PROVISIONED',
      readCapacity: 500, // Start conservative, auto-scale up
      writeCapacity: 200,
      pointInTimeRecovery: true,
      backupRetention: 90,
      globalTables: true, // Multi-region disaster recovery
    },
    lambda: {
      reservedConcurrency: 1000,
      memorySize: 1024,
      timeout: 30,
      provisionedConcurrency: 10, // Keep some functions warm
    },
  },
};

const stage = app.node.tryGetContext('stage') || 'dev';
const config = environments[stage as keyof typeof environments];

if (!config) {
  throw new Error(`Invalid stage: ${stage}. Available stages: ${Object.keys(environments).join(', ')}`);
}

// Deploy in logical order with dependencies
const databaseStack = new DatabaseStack(app, `LinkShortener-Database-${stage}`, {
  env: { account: config.account, region: config.region },
  stage,
  config: config.database,
});

const appStack = new LinkShortenerStack(app, `LinkShortener-App-${stage}`, {
  env: { account: config.account, region: config.region },
  stage,
  config,
  database: databaseStack.database,
});

// Only deploy monitoring in staging+ environments
if (stage !== 'dev') {
  new MonitoringStack(app, `LinkShortener-Monitoring-${stage}`, {
    env: { account: config.account, region: config.region },
    stage,
    config: config.monitoring,
    appStack,
  });
}
```

Why four environments? Each serves a specific purpose:

  • Dev: Development isolation with cost controls for experimentation
  • Staging: Integration testing with production-like data and configurations
  • Pre-prod: Production replica for load testing and final validation
  • Production: Live environment with full monitoring and redundancy

Performance Optimization: Lambda Cold Starts and Beyond

Here are the optimizations that actually made a difference:

1. Lambda Configuration That Matters

```typescript
// lib/constructs/optimized-lambda.ts
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as nodejs from 'aws-cdk-lib/aws-lambda-nodejs';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

export interface OptimizedLambdaProps {
  entry: string;
  stage: string;
  reservedConcurrency?: number;
  provisionedConcurrency?: number;
  memorySize?: number;
}

export class OptimizedLambda extends Construct {
  public readonly function: nodejs.NodejsFunction;

  constructor(scope: Construct, id: string, props: OptimizedLambdaProps) {
    super(scope, id);

    this.function = new nodejs.NodejsFunction(this, 'Function', {
      entry: props.entry,
      handler: 'handler',
      runtime: lambda.Runtime.NODEJS_20_X,

      // Memory configuration affects CPU - sweet spot for most workloads
      memorySize: props.memorySize || 1024,

      // Timeout aggressive enough to fail fast
      timeout: cdk.Duration.seconds(30),

      // X-Ray tracing for performance insights
      tracing: lambda.Tracing.ACTIVE,

      // Environment variables for optimization
      environment: {
        NODE_OPTIONS: '--enable-source-maps',
        AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1', // Reuse TCP connections
        POWERTOOLS_SERVICE_NAME: 'link-shortener',
        POWERTOOLS_METRICS_NAMESPACE: 'LinkShortener',
      },

      // Bundle optimization
      bundling: {
        minify: true,
        sourceMap: true,
        target: 'es2022',
        format: nodejs.OutputFormat.ESM,
        banner: 'import { createRequire } from "module"; const require = createRequire(import.meta.url);',
        externalModules: [
          '@aws-sdk/*', // Don't bundle AWS SDK - it's in the runtime
        ],
        esbuildArgs: {
          '--tree-shaking': 'true',
          '--platform': 'node',
          '--target': 'node20',
        },
      },

      // Reserved concurrency to prevent one function from eating all capacity
      reservedConcurrentExecutions: props.reservedConcurrency,

      // VPC configuration only if you need it (adds 1-2s to cold starts)
      // vpc: props.stage === 'prod' ? vpc : undefined,
    });

    // Provisioned concurrency for production critical paths
    if (props.provisionedConcurrency && props.stage === 'prod') {
      new lambda.Alias(this, 'ProductionAlias', {
        aliasName: 'prod',
        version: this.function.currentVersion,
        provisionedConcurrentExecutions: props.provisionedConcurrency,
      });
    }
  }
}
```

2. Connection Pooling That Actually Works

Creating new DynamoDB connections on every invocation was a major performance bottleneck. Here's a connection manager that helps:

```typescript
// src/utils/dynamodb-connection.ts
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
import { NodeHttpHandler } from '@smithy/node-http-handler';
import { Agent } from 'https';

// Global connection pool - survives between Lambda invocations
let dynamoClient: DynamoDBDocumentClient | null = null;

export function getDynamoClient(): DynamoDBDocumentClient {
  if (!dynamoClient) {
    const client = new DynamoDBClient({
      region: process.env.AWS_REGION,
      maxAttempts: 3,

      // Optimize the HTTP layer for the Lambda runtime
      requestHandler: new NodeHttpHandler({
        connectionTimeout: 1000, // 1s connect timeout
        requestTimeout: 5000,    // 5s total request timeout

        // Connection pooling
        httpsAgent: new Agent({
          maxSockets: 10, // Reduced from default 50
          keepAlive: true,
          keepAliveMsecs: 30000,
        }),
      }),

      // Credentials come from Lambda's default provider chain
      // (execution role env vars) - no need to pass them explicitly
    });

    dynamoClient = DynamoDBDocumentClient.from(client, {
      marshallOptions: {
        convertEmptyValues: false,
        removeUndefinedValues: true,
        convertClassInstanceToMap: false,
      },
      unmarshallOptions: {
        wrapNumbers: false,
      },
    });

    // Log connection creation for debugging
    console.log('DynamoDB connection pool initialized');
  }

  return dynamoClient;
}

// Performance monitoring wrapper
export async function withPerformanceLogging<T>(
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  const start = Date.now();

  try {
    const result = await fn();
    const duration = Date.now() - start;

    console.log(JSON.stringify({
      operation,
      duration,
      success: true,
      timestamp: new Date().toISOString(),
    }));

    return result;
  } catch (error) {
    const duration = Date.now() - start;

    console.error(JSON.stringify({
      operation,
      duration,
      success: false,
      error: error instanceof Error ? error.message : String(error),
      timestamp: new Date().toISOString(),
    }));

    throw error;
  }
}
```

3. Production-Optimized Redirect Handler

```typescript
// src/handlers/redirect-optimized.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { GetCommand, UpdateCommand } from '@aws-sdk/lib-dynamodb';
import { getDynamoClient, withPerformanceLogging } from '../utils/dynamodb-connection';

// Declare cold start tracking outside the handler
let isColdStart = true;

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const startTime = Date.now();
  const coldStart = isColdStart;
  isColdStart = false;

  // Extract short code from path
  const shortCode = event.pathParameters?.proxy || event.pathParameters?.shortCode;

  if (!shortCode) {
    return createErrorResponse(404, 'Short code not found');
  }

  try {
    const dynamodb = getDynamoClient();

    // Optimized DynamoDB query with projection
    const result = await withPerformanceLogging(
      'GetShortUrl',
      () => dynamodb.send(new GetCommand({
        TableName: process.env.URLS_TABLE_NAME!,
        Key: { shortCode },
        ProjectionExpression: 'originalUrl, expiresAt, clickCount',
        ConsistentRead: false, // Eventually consistent is fine for redirects
      }))
    );

    if (!result.Item) {
      // Log 404 for analytics but don't block
      logAnalyticsAsync('404', shortCode, event).catch(console.error);
      return createErrorResponse(404, 'Link not found');
    }

    const { originalUrl, expiresAt } = result.Item;

    // Check expiration
    if (expiresAt && Date.now() > expiresAt) {
      logAnalyticsAsync('EXPIRED', shortCode, event).catch(console.error);
      return createErrorResponse(410, 'Link has expired');
    }

    // Update click count asynchronously (fire-and-forget)
    updateClickCountAsync(shortCode).catch(console.error);

    // Log successful redirect
    logAnalyticsAsync('SUCCESS', shortCode, event).catch(console.error);

    const responseTime = Date.now() - startTime;

    // Structured logging for monitoring
    console.log(JSON.stringify({
      event: 'redirect_success',
      shortCode,
      responseTime,
      coldStart,
      userAgent: event.headers['User-Agent']?.substring(0, 100),
      referer: event.headers['Referer']?.substring(0, 100),
      timestamp: new Date().toISOString(),
    }));

    return {
      statusCode: 301, // Permanent redirect for caching
      headers: {
        Location: originalUrl,
        'Cache-Control': 'public, max-age=300, s-maxage=3600', // 5min browser, 1hr CDN
        'X-Response-Time': responseTime.toString(),
        'X-Cold-Start': coldStart.toString(),
      },
      body: '',
    };
  } catch (error) {
    const responseTime = Date.now() - startTime;

    console.error(JSON.stringify({
      event: 'redirect_error',
      shortCode,
      error: error instanceof Error ? error.message : String(error),
      responseTime,
      coldStart,
      timestamp: new Date().toISOString(),
    }));

    return createErrorResponse(500, 'Internal server error');
  }
};

async function updateClickCountAsync(shortCode: string): Promise<void> {
  try {
    const dynamodb = getDynamoClient();

    await dynamodb.send(new UpdateCommand({
      TableName: process.env.URLS_TABLE_NAME!,
      Key: { shortCode },
      UpdateExpression: 'ADD clickCount :inc SET lastClickAt = :timestamp',
      ExpressionAttributeValues: {
        ':inc': 1,
        ':timestamp': Date.now(),
      },
    }));
  } catch (error) {
    // Don't fail the redirect if the analytics update fails
    console.error('Failed to update click count:', error);
  }
}

async function logAnalyticsAsync(
  eventType: string,
  shortCode: string,
  event: APIGatewayProxyEvent
): Promise<void> {
  // Implementation for async analytics logging
  // This would typically write to a separate analytics table or queue
}

function createErrorResponse(statusCode: number, message: string): APIGatewayProxyResult {
  return {
    statusCode,
    headers: {
      'Content-Type': 'text/html',
      'Cache-Control': 'no-cache',
    },
    body: `
      <!DOCTYPE html>
      <html>
        <head><title>Error</title></head>
        <body style="font-family: Arial, sans-serif; text-align: center; margin-top: 100px;">
          <h1>${statusCode}</h1>
          <p>${message}</p>
        </body>
      </html>
    `,
  };
}
```

Cost Optimization: Learning from Expensive Mistakes

Cost optimization becomes critical when traffic patterns change unexpectedly. Understanding how different AWS services scale and bill helps prevent budget surprises during high-traffic periods:
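Billing mode is the biggest lever. Here's a rough comparison of monthly DynamoDB read costs under the two modes; the unit prices are ballpark figures for illustration only, not current AWS list prices, so check the pricing page for your region before acting on them:

```typescript
// Illustrative DynamoDB read-cost comparison. The unit prices below are
// rough ballpark figures, NOT current AWS list prices.
const ON_DEMAND_PER_MILLION_READS = 0.25; // USD, illustrative
const PROVISIONED_PER_RCU_HOUR = 0.00013; // USD, illustrative
const HOURS_PER_MONTH = 730;

function onDemandReadCost(readsPerMonth: number): number {
  return (readsPerMonth / 1_000_000) * ON_DEMAND_PER_MILLION_READS;
}

function provisionedReadCost(rcu: number): number {
  return rcu * PROVISIONED_PER_RCU_HOUR * HOURS_PER_MONTH;
}

// 100 eventually-consistent reads/sec, steady all month (~263M reads):
const readsPerMonth = 100 * 3600 * HOURS_PER_MONTH;
console.log('on-demand:   $' + onDemandReadCost(readsPerMonth).toFixed(2));
// 100 reads/sec at 0.5 RCU each (eventually consistent) = 50 RCU provisioned:
console.log('provisioned: $' + provisionedReadCost(50).toFixed(2));
```

The takeaway: steady, predictable traffic strongly favors provisioned capacity, while spiky or unknown traffic favors on-demand, which is exactly why the stacks below switch modes by stage.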

1. DynamoDB Optimization Strategy

```typescript
// lib/database-stack-optimized.ts
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class OptimizedDatabaseStack extends Construct {
  public readonly linksTable: dynamodb.Table;

  constructor(scope: Construct, id: string, props: {
    stage: string;
    expectedReadsPerSecond: number;
    expectedWritesPerSecond: number;
  }) {
    super(scope, id);

    this.linksTable = new dynamodb.Table(this, 'LinksTable', {
      partitionKey: {
        name: 'shortCode',
        type: dynamodb.AttributeType.STRING,
      },

      // Start with on-demand, switch to provisioned when you understand patterns
      billingMode: props.stage === 'prod'
        ? dynamodb.BillingMode.PROVISIONED
        : dynamodb.BillingMode.PAY_PER_REQUEST,

      // Provisioned capacity for production (20% headroom over expected load)
      ...(props.stage === 'prod' && {
        readCapacity: Math.max(5, Math.ceil(props.expectedReadsPerSecond * 1.2)),
        writeCapacity: Math.max(5, Math.ceil(props.expectedWritesPerSecond * 1.2)),
      }),

      pointInTimeRecovery: props.stage === 'prod',
      deletionProtection: props.stage === 'prod',

      // Encryption for compliance
      encryption: dynamodb.TableEncryption.AWS_MANAGED,

      stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES, // For analytics
    });

    // Auto-scaling for production
    if (props.stage === 'prod') {
      this.setupAutoScaling();
    }

    // Global Secondary Index for analytics queries
    this.linksTable.addGlobalSecondaryIndex({
      indexName: 'UserIndex',
      partitionKey: {
        name: 'userId',
        type: dynamodb.AttributeType.STRING,
      },
      sortKey: {
        name: 'createdAt',
        type: dynamodb.AttributeType.NUMBER,
      },
      projectionType: dynamodb.ProjectionType.KEYS_ONLY, // Minimize costs

      // Same billing mode as the main table; the GSI absorbs every
      // base-table write, so its write capacity matches the table's
      ...(props.stage === 'prod' && {
        readCapacity: Math.max(5, Math.ceil(props.expectedReadsPerSecond * 0.1)),
        writeCapacity: Math.max(5, Math.ceil(props.expectedWritesPerSecond * 1.0)),
      }),
    });
  }

  private setupAutoScaling(): void {
    // Read capacity auto-scaling
    const readScaling = this.linksTable.autoScaleReadCapacity({
      minCapacity: 5,
      maxCapacity: 1000, // Reasonable ceiling
    });

    readScaling.scaleOnUtilization({
      targetUtilizationPercent: 70, // Conservative target
      scaleInCooldown: cdk.Duration.minutes(5),
      scaleOutCooldown: cdk.Duration.minutes(1),
    });

    // Write capacity auto-scaling
    const writeScaling = this.linksTable.autoScaleWriteCapacity({
      minCapacity: 5,
      maxCapacity: 500,
    });

    writeScaling.scaleOnUtilization({
      targetUtilizationPercent: 70,
      scaleInCooldown: cdk.Duration.minutes(5),
      scaleOutCooldown: cdk.Duration.minutes(1),
    });
  }
}
```

2. CloudFront Configuration for Maximum Cost Efficiency

```typescript
// lib/cdn-stack-optimized.ts
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class OptimizedCDNStack extends Construct {
  public readonly distribution: cloudfront.Distribution;

  constructor(scope: Construct, id: string, props: {
    apiGateway: apigateway.RestApi;
    stage: string;
  }) {
    super(scope, id);

    this.distribution = new cloudfront.Distribution(this, 'Distribution', {
      defaultBehavior: {
        origin: new origins.RestApiOrigin(props.apiGateway),

        // Caching policy optimized for redirects
        cachePolicy: new cloudfront.CachePolicy(this, 'RedirectCachePolicy', {
          cachePolicyName: `link-shortener-${props.stage}`,
          defaultTtl: cdk.Duration.minutes(5),
          maxTtl: cdk.Duration.hours(24),
          minTtl: cdk.Duration.minutes(1),

          // Cache based on path only (ignore query strings and headers)
          queryStringBehavior: cloudfront.CacheQueryStringBehavior.none(),
          headerBehavior: cloudfront.CacheHeaderBehavior.none(),
          cookieBehavior: cloudfront.CacheCookieBehavior.none(),
        }),

        // Compression saves bandwidth costs
        compress: true,

        // Only allow GET/HEAD requests for redirects
        allowedMethods: cloudfront.AllowedMethods.ALLOW_GET_HEAD,
        cachedMethods: cloudfront.CachedMethods.CACHE_GET_HEAD,

        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
      },

      // Additional behavior for API endpoints (no caching)
      additionalBehaviors: {
        '/api/*': {
          origin: new origins.RestApiOrigin(props.apiGateway),
          cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
          viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
          allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
        },
      },

      // Use the cheapest price class for non-critical applications
      priceClass: cloudfront.PriceClass.PRICE_CLASS_100, // US, Canada, Europe

      // Error handling
      errorResponses: [
        {
          httpStatus: 404,
          responseHttpStatus: 404,
          responsePagePath: '/404.html',
          ttl: cdk.Duration.minutes(5), // Cache 404s to prevent hammering origin
        },
        {
          httpStatus: 500,
          responseHttpStatus: 500,
          responsePagePath: '/500.html',
          ttl: cdk.Duration.minutes(1), // Short cache for server errors
        },
      ],

      // Enable logging for analytics (additional cost but necessary for insights)
      ...(props.stage === 'prod' && {
        enableLogging: true,
        logBucket: s3.Bucket.fromBucketName(this, 'LogsBucket', `cloudfront-logs-${props.stage}`),
        logFilePrefix: 'link-shortener/',
      }),
    });
  }
}
```
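The TTLs in that cache policy trade freshness for origin offload. A quick sketch of what that buys you, where the hit ratio is an assumption you would measure from CloudFront's `CacheHitRate` metric:

```typescript
// Rough origin-offload estimate for a CDN in front of API Gateway.
// hitRatio is an assumed measurement, not a guarantee.
function originRequestsPerSecond(edgeRps: number, hitRatio: number): number {
  return edgeRps * (1 - hitRatio);
}

// 1,000 req/s at the edge with a 90% hit ratio leaves roughly
// 100 req/s actually reaching API Gateway and Lambda
console.log(originRequestsPerSecond(1000, 0.9));
```

Since you pay for API Gateway requests, Lambda invocations, and DynamoDB reads only on cache misses, every point of hit ratio directly cuts the per-redirect cost.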

3. Cost Monitoring and Alerts

```typescript
// lib/cost-monitoring-stack.ts
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import * as sns from 'aws-cdk-lib/aws-sns';
import * as subscriptions from 'aws-cdk-lib/aws-sns-subscriptions';
import * as actions from 'aws-cdk-lib/aws-cloudwatch-actions';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class CostMonitoringStack extends Construct {
  constructor(scope: Construct, id: string, props: {
    stage: string;
    alertEmail: string;
    monthlyBudget: number;
  }) {
    super(scope, id);

    // SNS topic for cost alerts
    const alertTopic = new sns.Topic(this, 'CostAlerts', {
      displayName: `Link Shortener Cost Alerts - ${props.stage}`,
    });

    alertTopic.addSubscription(
      new subscriptions.EmailSubscription(props.alertEmail)
    );

    // DynamoDB cost monitoring
    const dynamoReadAlarm = new cloudwatch.Alarm(this, 'DynamoReadUnitsHigh', {
      metric: new cloudwatch.Metric({
        namespace: 'AWS/DynamoDB',
        metricName: 'ConsumedReadCapacityUnits',
        dimensionsMap: {
          TableName: 'LinksTable', // Replace with actual table name
        },
        statistic: 'Sum',
        period: cdk.Duration.minutes(5),
      }),
      threshold: 1000, // Adjust based on your budget
      evaluationPeriods: 2,
      alarmDescription: 'DynamoDB read capacity usage is high',
    });

    dynamoReadAlarm.addAlarmAction(
      new actions.SnsAction(alertTopic)
    );

    // Lambda invocation cost monitoring
    const lambdaInvocationsAlarm = new cloudwatch.Alarm(this, 'LambdaInvocationsHigh', {
      metric: new cloudwatch.Metric({
        namespace: 'AWS/Lambda',
        metricName: 'Invocations',
        dimensionsMap: {
          FunctionName: 'redirect-handler', // Replace with actual function name
        },
        statistic: 'Sum',
        period: cdk.Duration.hours(1),
      }),
      threshold: 100000, // 100k invocations per hour
      evaluationPeriods: 1,
      alarmDescription: 'Lambda invocations are unusually high',
    });

    lambdaInvocationsAlarm.addAlarmAction(
      new actions.SnsAction(alertTopic)
    );

    // Create cost dashboard
    new cloudwatch.Dashboard(this, 'CostDashboard', {
      dashboardName: `LinkShortener-Costs-${props.stage}`,
      widgets: [
        [
          new cloudwatch.GraphWidget({
            title: 'DynamoDB Read Capacity Units',
            left: [dynamoReadAlarm.metric],
            width: 12,
          }),
        ],
        [
          new cloudwatch.GraphWidget({
            title: 'Lambda Invocations',
            left: [lambdaInvocationsAlarm.metric],
            width: 12,
          }),
        ],
        [
          new cloudwatch.GraphWidget({
            title: 'CloudFront Requests',
            left: [
              new cloudwatch.Metric({
                namespace: 'AWS/CloudFront',
                metricName: 'Requests',
                statistic: 'Sum',
                period: cdk.Duration.hours(1),
              }),
            ],
            width: 12,
          }),
        ],
      ],
    });
  }
}
```

Production Monitoring: Beyond "It Works"

Here's the monitoring approach that helped during production incidents:

1. Custom Metrics That Matter

```typescript
// src/utils/metrics.ts
import { CloudWatchClient, PutMetricDataCommand } from '@aws-sdk/client-cloudwatch';

const cloudwatch = new CloudWatchClient({ region: process.env.AWS_REGION });

export class MetricsCollector {
  private namespace = 'LinkShortener/Production';
  private metrics: Array<{
    MetricName: string;
    Value: number;
    Unit: string;
    Timestamp: Date;
    Dimensions?: Array<{ Name: string; Value: string }>;
  }> = [];

  async recordRedirectSuccess(shortCode: string, responseTime: number, coldStart: boolean): Promise<void> {
    this.metrics.push(
      {
        MetricName: 'RedirectResponseTime',
        Value: responseTime,
        Unit: 'Milliseconds',
        Timestamp: new Date(),
        Dimensions: [
          { Name: 'ColdStart', Value: coldStart.toString() },
        ],
      },
      {
        MetricName: 'RedirectCount',
        Value: 1,
        Unit: 'Count',
        Timestamp: new Date(),
        Dimensions: [
          { Name: 'Status', Value: 'Success' },
        ],
      }
    );

    await this.flush();
  }

  async recordDatabaseLatency(operation: string, latency: number): Promise<void> {
    this.metrics.push({
      MetricName: 'DatabaseLatency',
      Value: latency,
      Unit: 'Milliseconds',
      Timestamp: new Date(),
      Dimensions: [
        { Name: 'Operation', Value: operation },
      ],
    });

    await this.flush();
  }

  async recordError(errorType: string, shortCode?: string): Promise<void> {
    this.metrics.push({
      MetricName: 'ErrorCount',
      Value: 1,
      Unit: 'Count',
      Timestamp: new Date(),
      Dimensions: [
        { Name: 'ErrorType', Value: errorType },
        ...(shortCode ? [{ Name: 'ShortCode', Value: shortCode }] : []),
      ],
    });

    await this.flush();
  }

  private async flush(): Promise<void> {
    if (this.metrics.length === 0) return;

    try {
      await cloudwatch.send(new PutMetricDataCommand({
        Namespace: this.namespace,
        MetricData: this.metrics,
      }));

      this.metrics = []; // Clear after successful send
    } catch (error) {
      console.error('Failed to send metrics:', error);
      // Don't throw - metrics failures shouldn't break the main functionality
    }
  }
}

// Singleton instance
export const metrics = new MetricsCollector();
```
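The core pattern here is buffer-then-flush. A self-contained sketch of that pattern with the CloudWatch client stubbed out (so the batching logic runs without AWS credentials; `MiniCollector` and the chunk size of 20 are illustrative choices, not part of the collector above):

```typescript
// Self-contained sketch of the buffer-then-flush pattern. The sender is a
// stub; real code would call PutMetricDataCommand with each batch.
type Datum = { MetricName: string; Value: number; Unit: string };

class MiniCollector {
  private buffer: Datum[] = [];
  constructor(private send: (batch: Datum[]) => Promise<void>) {}

  record(name: string, value: number, unit = 'Count'): void {
    this.buffer.push({ MetricName: name, Value: value, Unit: unit });
  }

  // Flush in conservative chunks of 20 data points per call
  // (check the current PutMetricData request limits for your SDK)
  async flush(): Promise<void> {
    while (this.buffer.length > 0) {
      const batch = this.buffer.splice(0, 20);
      await this.send(batch);
    }
  }
}

// Usage: stub sender that just counts API calls
let apiCalls = 0;
const collector = new MiniCollector(async () => { apiCalls++; });
for (let i = 0; i < 45; i++) collector.record('RedirectCount', 1);
collector.flush().then(() => console.log(`flushed in ${apiCalls} calls`));
```

Buffering more aggressively (e.g. flushing once at the end of the invocation instead of after every record) cuts PutMetricData call volume, at the cost of losing metrics if the function crashes mid-invocation.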

2. Load Testing That Simulates Reality

```typescript
// tests/load-test.ts - Load testing that helps catch scaling issues
import { performance } from 'perf_hooks';

interface LoadTestConfig {
  baseUrl: string;
  concurrentUsers: number;
  testDurationMs: number;
  rampUpMs: number;
  shortCodes: string[];
}

interface LoadTestResult {
  totalRequests: number;
  successfulRequests: number;
  failedRequests: number;
  averageResponseTime: number;
  p50ResponseTime: number;
  p95ResponseTime: number;
  p99ResponseTime: number;
  errorsPerSecond: number;
  requestsPerSecond: number;
}

export async function runLoadTest(config: LoadTestConfig): Promise<LoadTestResult> {
  const results: Array<{
    success: boolean;
    responseTime: number;
    timestamp: number;
    error?: string;
  }> = [];

  const startTime = performance.now();
  const endTime = startTime + config.testDurationMs;

  // Create a promise for each concurrent user
  const userPromises = Array.from({ length: config.concurrentUsers }, async (_, userIndex) => {
    // Stagger user start times during ramp-up
    const userStartDelay = (config.rampUpMs * userIndex) / config.concurrentUsers;
    await sleep(userStartDelay);

    while (performance.now() < endTime) {
      const requestStart = performance.now();

      try {
        // Random short code selection
        const shortCode = config.shortCodes[Math.floor(Math.random() * config.shortCodes.length)];
        const url = `${config.baseUrl}/${shortCode}`;

        const response = await fetch(url, {
          method: 'GET',
          redirect: 'manual', // Don't follow redirects - we just want timing
        });

        const responseTime = performance.now() - requestStart;

        results.push({
          success: response.status >= 200 && response.status < 400,
          responseTime,
          timestamp: performance.now(),
        });
      } catch (error) {
        const responseTime = performance.now() - requestStart;

        results.push({
          success: false,
          responseTime,
          timestamp: performance.now(),
          error: error instanceof Error ? error.message : String(error),
        });
      }

      // Wait before the next request (adjust for desired load)
      await sleep(100 + Math.random() * 200); // 100-300ms between requests per user
    }
  });

  // Wait for all users to complete
  await Promise.all(userPromises);

  // Calculate statistics
  const successfulResults = results.filter(r => r.success);
  const responseTimes = successfulResults.map(r => r.responseTime);
  responseTimes.sort((a, b) => a - b);

  const totalDurationSec = (performance.now() - startTime) / 1000;

  return {
    totalRequests: results.length,
    successfulRequests: successfulResults.length,
    failedRequests: results.length - successfulResults.length,
    averageResponseTime: responseTimes.reduce((a, b) => a + b, 0) / responseTimes.length,
    p50ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.5)],
    p95ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.95)],
    p99ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.99)],
    errorsPerSecond: (results.length - successfulResults.length) / totalDurationSec,
    requestsPerSecond: results.length / totalDurationSec,
  };
}

async function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Example usage - run this before every deployment
async function validatePerformance() {
  console.log('Running pre-deployment load test...');

  const testConfig: LoadTestConfig = {
    baseUrl: 'https://staging-links.yourcompany.com',
    concurrentUsers: 50,
    testDurationMs: 60 * 1000, // 1 minute
    rampUpMs: 10 * 1000,       // 10 second ramp-up
    shortCodes: ['test1', 'test2', 'test3', 'popular-link', 'campaign-2024'],
  };

  const results = await runLoadTest(testConfig);

  // Performance assertions
  const maxAcceptableP95 = 500; // 500ms P95 response time
  const maxAcceptableErrorRate = 0.01; // 1% error rate

  if (results.p95ResponseTime > maxAcceptableP95) {
    throw new Error(`P95 response time too high: ${results.p95ResponseTime}ms > ${maxAcceptableP95}ms`);
  }

  const errorRate = results.failedRequests / results.totalRequests;
  if (errorRate > maxAcceptableErrorRate) {
    throw new Error(`Error rate too high: ${(errorRate * 100).toFixed(2)}% > ${(maxAcceptableErrorRate * 100)}%`);
  }

  console.log('Load test passed:', results);
}
```

Blue-Green Deployments: Deploy Without Fear

A deployment strategy that reduces deployment anxiety:

typescript
// deployment/blue-green-deploy.ts
import * as aws from '@aws-sdk/client-route53';
import * as lambda from '@aws-sdk/client-lambda';

interface DeploymentConfig {
  stage: 'blue' | 'green';
  domainName: string;
  hostedZoneId: string;
  healthCheckUrl: string;
}

export class BlueGreenDeployment {
  private route53 = new aws.Route53Client({});
  private lambdaClient = new lambda.LambdaClient({});

  async deployNewVersion(config: DeploymentConfig): Promise<void> {
    console.log(`Starting ${config.stage} deployment...`);

    // Step 1: Deploy new infrastructure
    await this.deployCDKStack(config.stage);

    // Step 2: Warm up the new environment
    await this.warmUpEnvironment(config);

    // Step 3: Run health checks
    await this.runHealthChecks(config.healthCheckUrl);

    // Step 4: Gradually shift traffic
    await this.shiftTraffic(config, [10, 25, 50, 100]);

    console.log(`${config.stage} deployment completed successfully`);
  }

  private async deployCDKStack(stage: string): Promise<void> {
    // This would typically use CDK CLI or AWS SDK to deploy
    console.log(`Deploying CDK stack for ${stage}...`);

    // Example: exec CDK deploy command
    const { spawn } = await import('child_process');

    return new Promise((resolve, reject) => {
      const deploy = spawn('npx', ['cdk', 'deploy', '--all', '--context', `stage=${stage}`], {
        stdio: 'inherit',
      });

      deploy.on('close', (code) => {
        if (code === 0) {
          resolve();
        } else {
          reject(new Error(`CDK deploy failed with code ${code}`));
        }
      });
    });
  }

  private async warmUpEnvironment(config: DeploymentConfig): Promise<void> {
    console.log('Warming up Lambda functions...');

    // Get all Lambda functions for this stage
    const functions = await this.lambdaClient.send(new lambda.ListFunctionsCommand({
      Marker: undefined,
      MaxItems: 100,
    }));

    const stageFunctions = functions.Functions?.filter(fn =>
      fn.FunctionName?.includes(config.stage)
    ) || [];

    // Warm up each function
    const warmUpPromises = stageFunctions.map(async (fn) => {
      if (!fn.FunctionName) return;

      try {
        await this.lambdaClient.send(new lambda.InvokeCommand({
          FunctionName: fn.FunctionName,
          Payload: JSON.stringify({
            source: 'warm-up',
            warmUp: true,
          }),
        }));

        console.log(`Warmed up ${fn.FunctionName}`);
      } catch (error) {
        console.warn(`⚠️  Failed to warm up ${fn.FunctionName}:`, error);
      }
    });

    await Promise.all(warmUpPromises);
  }

  private async runHealthChecks(healthCheckUrl: string): Promise<void> {
    console.log('Running health checks...');

    const checks = [
      { name: 'Basic redirect', path: '/test-redirect' },
      { name: 'API health', path: '/api/health' },
      { name: '404 handling', path: '/non-existent-link' },
    ];

    for (const check of checks) {
      const url = `${healthCheckUrl}${check.path}`;
      const response = await fetch(url);

      // Different expectations for different endpoints
      const expectedStatus = check.path === '/non-existent-link' ? 404 : 200;

      if (response.status !== expectedStatus) {
        throw new Error(`Health check failed for ${check.name}: ${response.status}`);
      }

      console.log(`${check.name} health check passed`);
    }
  }

  private async shiftTraffic(
    config: DeploymentConfig,
    trafficPercentages: number[]
  ): Promise<void> {
    for (const percentage of trafficPercentages) {
      console.log(`Shifting ${percentage}% traffic to ${config.stage}...`);

      // Update Route53 weighted routing
      await this.updateRoute53WeightedRecord(config, percentage);

      // Wait for DNS propagation and monitoring
      await this.sleep(120000); // 2 minutes

      // Check error rates during traffic shift
      await this.monitorErrorRates(config);

      console.log(`${percentage}% traffic shifted successfully`);
    }
  }

  private async updateRoute53WeightedRecord(
    config: DeploymentConfig,
    weight: number
  ): Promise<void> {
    const oppositeWeight = 100 - weight;
    const oppositeStage = config.stage === 'blue' ? 'green' : 'blue';

    // Update current stage weight
    await this.route53.send(new aws.ChangeResourceRecordSetsCommand({
      HostedZoneId: config.hostedZoneId,
      ChangeBatch: {
        Changes: [{
          Action: 'UPSERT',
          ResourceRecordSet: {
            Name: config.domainName,
            Type: 'CNAME',
            SetIdentifier: config.stage,
            Weight: weight,
            TTL: 60, // Short TTL for quick changes
            ResourceRecords: [{
              Value: `${config.stage}-api.example.com`
            }],
          },
        }],
      },
    }));

    // Update opposite stage weight
    await this.route53.send(new aws.ChangeResourceRecordSetsCommand({
      HostedZoneId: config.hostedZoneId,
      ChangeBatch: {
        Changes: [{
          Action: 'UPSERT',
          ResourceRecordSet: {
            Name: config.domainName,
            Type: 'CNAME',
            SetIdentifier: oppositeStage,
            Weight: oppositeWeight,
            TTL: 60,
            ResourceRecords: [{
              Value: `${oppositeStage}-api.example.com`
            }],
          },
        }],
      },
    }));
  }

  private async monitorErrorRates(config: DeploymentConfig): Promise<void> {
    // This would integrate with CloudWatch to check error rates
    // and automatically roll back if error rates exceed threshold

    console.log('Monitoring error rates...');

    // Example: Check CloudWatch metrics
    // If error rate > 1%, rollback
    // If response time P95 > 500ms, rollback

    await this.sleep(30000); // Monitor for 30 seconds
  }

  async rollback(config: DeploymentConfig): Promise<void> {
    console.log(`🔄 Rolling back ${config.stage} deployment...`);

    // Shift all traffic back to stable version
    const stableStage = config.stage === 'blue' ? 'green' : 'blue';
    await this.updateRoute53WeightedRecord({
      ...config,
      stage: stableStage,
    }, 100);

    console.log('Rollback completed');
  }

  private async sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
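One invariant worth pinning down before trusting a deployment tool like this: the two weighted records must always sum to 100, or Route53 serves a skewed split between the stages. A minimal sketch of that weight math, extracted as a pure function (the function name is hypothetical, not part of the class above):

```typescript
// Sketch of the weight invariant behind shiftTraffic: at every step the
// new stage receives `percentage` and the stable stage the remainder,
// so the two Route53 weighted records always sum to 100.
type Stage = 'blue' | 'green';
type Weights = Record<Stage, number>;

function weightsForStep(newStage: Stage, percentage: number): Weights {
  if (percentage < 0 || percentage > 100) {
    throw new Error(`Invalid traffic percentage: ${percentage}`);
  }
  const stable: Stage = newStage === 'blue' ? 'green' : 'blue';
  return {
    [newStage]: percentage,
    [stable]: 100 - percentage,
  } as Weights;
}

// The same ramp deployNewVersion uses:
for (const p of [10, 25, 50, 100]) {
  console.log(weightsForStep('green', p));
}
```

Keeping this logic pure makes it trivially unit-testable, which matters for code that only otherwise runs during a live deployment.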

Production Optimization Considerations

Running production infrastructure reveals important patterns about scaling and cost management:

1. Conservative provisioning with aggressive monitoring: Start with minimal capacity and rely on auto-scaling. Over-provisioning increases costs without improving reliability for most workloads.

2. Cold start impact on user experience: Even 2-3 seconds of cold start latency significantly degrades redirect performance. Provisioned concurrency for critical paths often justifies the additional cost.

3. DynamoDB auto-scaling timing: Auto-scaling takes 5-10 minutes to increase capacity but scales down quickly. Setting target utilization at 70% instead of 90% provides buffer for traffic spikes.

4. Business metrics over technical metrics: Tracking "redirects per campaign" and "conversion-generating links" provides more actionable insights than raw "Lambda invocations." Business context helps prioritize optimization efforts.

5. Staging load testing effectiveness: Comprehensive load testing catches most production issues, but real user patterns often differ from synthetic tests. Focus on simulating actual traffic patterns rather than theoretical peak loads.
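Points 2 and 3 translate directly into CDK. The sketch below is illustrative rather than our exact stack: the construct names, capacity numbers, and inline handler are assumptions, and provisioned concurrency is attached to an alias, which is where Lambda expects it.

```typescript
// Sketch: DynamoDB auto-scaling at 70% target utilization, plus
// provisioned concurrency on the redirect path. All names and numbers
// here are illustrative assumptions.
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';

export class CapacityTuningStack extends cdk.Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    const table = new dynamodb.Table(this, 'LinksTable', {
      partitionKey: { name: 'shortCode', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PROVISIONED,
      readCapacity: 25,
      writeCapacity: 5,
    });

    // Point 3: scale at 70% utilization so the 30% headroom absorbs
    // spikes during the 5-10 minutes auto-scaling needs to add capacity.
    table.autoScaleReadCapacity({ minCapacity: 25, maxCapacity: 1000 })
      .scaleOnUtilization({ targetUtilizationPercent: 70 });

    const redirectFn = new lambda.Function(this, 'RedirectFn', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline(
        'exports.handler = async () => ({ statusCode: 302 });'
      ),
    });

    // Point 2: provisioned concurrency eliminates cold starts on the
    // hot redirect path for a predictable hourly cost.
    new lambda.Alias(this, 'RedirectLive', {
      aliasName: 'live',
      version: redirectFn.currentVersion,
      provisionedConcurrentExecutions: 25,
    });
  }
}
```

Whether 25 provisioned executions is right depends entirely on your steady-state concurrency; check the `ConcurrentExecutions` metric before committing to a number.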

Production Metrics That Matter

Here's the dashboard that provides useful insight at a glance each day:

typescript
// lib/production-dashboard.ts
import * as cdk from 'aws-cdk-lib';
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import { Construct } from 'constructs';

export class ProductionDashboard extends Construct {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    new cloudwatch.Dashboard(this, 'LinkShortenerProduction', {
      dashboardName: 'LinkShortener-Production-Health',
      widgets: [
        // Row 1: Business metrics
        [
          new cloudwatch.SingleValueWidget({
            title: 'Redirects (24h)',
            metrics: [
              new cloudwatch.Metric({
                namespace: 'LinkShortener/Production',
                metricName: 'RedirectCount',
                statistic: 'Sum',
                period: cdk.Duration.hours(24),
              }),
            ],
            width: 6,
          }),

          new cloudwatch.SingleValueWidget({
            title: 'Success Rate (24h)',
            metrics: [
              new cloudwatch.MathExpression({
                expression: '(successful / total) * 100',
                usingMetrics: {
                  successful: new cloudwatch.Metric({
                    namespace: 'LinkShortener/Production',
                    metricName: 'RedirectCount',
                    dimensionsMap: { Status: 'Success' },
                    statistic: 'Sum',
                  }),
                  total: new cloudwatch.Metric({
                    namespace: 'LinkShortener/Production',
                    metricName: 'RedirectCount',
                    statistic: 'Sum',
                  }),
                },
              }),
            ],
            width: 6,
          }),
        ],

        // Row 2: Performance metrics
        [
          new cloudwatch.GraphWidget({
            title: 'Response Time Percentiles',
            left: [
              new cloudwatch.Metric({
                namespace: 'LinkShortener/Production',
                metricName: 'RedirectResponseTime',
                statistic: 'p50',
                period: cdk.Duration.minutes(5),
                label: 'P50',
              }),
              new cloudwatch.Metric({
                namespace: 'LinkShortener/Production',
                metricName: 'RedirectResponseTime',
                statistic: 'p95',
                period: cdk.Duration.minutes(5),
                label: 'P95',
              }),
              new cloudwatch.Metric({
                namespace: 'LinkShortener/Production',
                metricName: 'RedirectResponseTime',
                statistic: 'p99',
                period: cdk.Duration.minutes(5),
                label: 'P99',
              }),
            ],
            width: 12,
          }),
        ],

        // Row 3: Infrastructure health
        [
          new cloudwatch.GraphWidget({
            title: 'DynamoDB Throttling',
            left: [
              new cloudwatch.Metric({
                namespace: 'AWS/DynamoDB',
                metricName: 'ReadThrottledRequests',
                dimensionsMap: { TableName: 'LinksTable' },
                statistic: 'Sum',
              }),
              new cloudwatch.Metric({
                namespace: 'AWS/DynamoDB',
                metricName: 'WriteThrottledRequests',
                dimensionsMap: { TableName: 'LinksTable' },
                statistic: 'Sum',
              }),
            ],
            width: 6,
          }),

          new cloudwatch.GraphWidget({
            title: 'Lambda Cold Starts',
            left: [
              new cloudwatch.Metric({
                namespace: 'LinkShortener/Production',
                metricName: 'RedirectCount',
                dimensionsMap: { ColdStart: 'true' },
                statistic: 'Sum',
                period: cdk.Duration.minutes(5),
              }),
            ],
            width: 6,
          }),
        ],
      ],
    });
  }
}
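The widgets above assume custom metrics exist in the LinkShortener/Production namespace. One inexpensive way to publish them from the redirect handler is CloudWatch Embedded Metric Format (EMF): a structured JSON log line that CloudWatch Logs converts into metrics asynchronously, keeping PutMetricData calls off the hot path. A sketch under that assumption; the dimension names match the dashboard, but this isn't necessarily how the handler from earlier parts emits them:

```typescript
// Emit a RedirectCount metric via CloudWatch Embedded Metric Format:
// stdout from a Lambda goes to CloudWatch Logs, which parses the _aws
// envelope into metrics without any API call on the request path.
function emitRedirectMetric(status: 'Success' | 'NotFound', coldStart: boolean): string {
  const emf = {
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [{
        Namespace: 'LinkShortener/Production',
        Dimensions: [['Status'], ['ColdStart']],
        Metrics: [{ Name: 'RedirectCount', Unit: 'Count' }],
      }],
    },
    // Dimension values live at the root of the document
    Status: status,
    ColdStart: String(coldStart),
    RedirectCount: 1,
  };
  const line = JSON.stringify(emf);
  console.log(line); // In Lambda, this line lands in CloudWatch Logs
  return line;
}

emitRedirectMetric('Success', false);
```

Because EMF is just a log line, it also means a logging misconfiguration silently drops your metrics, so alert on metric absence as well as metric values.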

What's Next?

In Part 5, we'll tackle the final frontier: scaling to handle millions of redirects per day, cost optimization at scale, and the operational practices that let a small team manage a high-traffic service.

We'll cover advanced topics like multi-region deployments, database sharding strategies, and monitoring that alerts you before users notice problems.

The infrastructure we've built scales well, but there are specific patterns that help services handle increasing load efficiently.


Got stories from your own production optimizations? Performance tuning can be a deep rabbit hole, and it's always interesting to hear what worked (and what backfired) in different environments.

AWS CDK Link Shortener: From Zero to Production

A comprehensive 5-part series on building a production-grade link shortener service with AWS CDK, Node.js Lambda, and DynamoDB. Real war stories, performance optimization, and cost management included.

