Migrating from Serverless Framework to AWS CDK: Part 6 - Migration Strategies and Best Practices

Execute a smooth migration from Serverless Framework to AWS CDK with proven strategies, testing approaches, rollback procedures, and performance optimization techniques.

After extensive preparation, the CDK migration reached its final phase. The infrastructure included 47 Lambda functions, 3 DynamoDB tables, and comprehensive security configurations. All tests were passing, and performance benchmarks showed a 40% improvement.

The critical question emerged: "What's the rollback plan for production failures?"

This question shifted the focus from technical implementation to operational readiness. This post covers the final phase - developing a migration strategy that handles production traffic, manages failures, and meets enterprise requirements.

Three Migration Approaches and Their Outcomes

Three different migration strategies were tested in production environments, each providing valuable insights:

Approach #1: Big Bang Migration

Implementation: Deploy all CDK infrastructure during a 4-hour maintenance window.

Issues encountered: CloudFormation stack deployment exceeded time estimates (6 hours). API Gateway stage deployment failed. DynamoDB import had data integrity issues affecting 3,000 records. Rollback required additional 4 hours.

Operational impact: 10 hours total downtime, significant service disruption, increased support volume.

Lesson: "Big bang" works for demo apps, not production systems with interdependencies.

Approach #2: Strangler Pattern Implementation

Implementation: Gradual function migration using traffic splitting.

Issues encountered: Complex function dependencies created cross-service call patterns. Authentication synchronization between systems failed. Performance degradation from increased latency.

Operational impact: Extended migration timeline from 3 weeks to 2 months. API performance issues reported.

Lesson: Strangler pattern requires careful dependency mapping and shared authentication.
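
The dependency mapping this lesson calls for can be sketched as a topological sort over a call graph: a function is only migrated once everything it calls has already moved, so no migrated function has to call back into the legacy stack. This is a minimal sketch, not the tooling used in the migration; the function names and the shape of the dependency map are hypothetical.

```typescript
// Hypothetical dependency map: function name -> functions it calls.
type DependencyMap = Record<string, string[]>;

/**
 * Returns a migration order in which every function appears after
 * the functions it depends on. Throws if the graph contains a cycle,
 * since a cycle means those functions have to migrate together.
 */
function migrationOrder(deps: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (name: string, path: string[]): void => {
    const current = state.get(name);
    if (current === 'done') return;
    if (current === 'visiting') {
      throw new Error(`Dependency cycle: ${[...path, name].join(' -> ')}`);
    }
    state.set(name, 'visiting');
    for (const dep of deps[name] ?? []) {
      visit(dep, [...path, name]);
    }
    state.set(name, 'done');
    order.push(name);
  };

  // Sort the keys so the result is deterministic across runs.
  for (const name of Object.keys(deps).sort()) {
    visit(name, []);
  }
  return order;
}

// Illustrative input only; these function names are made up.
const exampleDeps: DependencyMap = {
  createOrder: ['getUser', 'reserveStock'],
  getUser: [],
  reserveStock: ['getUser'],
};
console.log(migrationOrder(exampleDeps));
```

A cycle error is useful output in its own right: it identifies the group of functions that cannot be split across the two systems.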

Approach #3: Blue-Green Deployment Success

Implementation: Full parallel deployment with instant traffic switching.

Results: Complete environment parity achieved. 30-second rollback capability. No data loss. Zero downtime.

Operational impact: Successful zero-downtime migration. Performance improved 40%. No service interruptions.

Lesson: Blue-green deployment, combined with comprehensive monitoring and automated rollback, was the approach that worked.

Production-Ready Migration Strategies

Blue-Green Deployment Strategy

Blue-green deployment proved most effective for production migrations:

typescript
// lib/stacks/production-blue-green-stack.ts
import { Stack, StackProps, Tags, CfnOutput, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { RestApi, MethodLoggingLevel, LambdaIntegration } from 'aws-cdk-lib/aws-apigateway';
// TreatMissingData lives in aws-cloudwatch, not in the aws-cdk-lib root
import { Alarm, ComparisonOperator, TreatMissingData } from 'aws-cdk-lib/aws-cloudwatch';
import { LambdaAction } from 'aws-cdk-lib/aws-cloudwatch-actions';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export interface BlueGreenStackProps extends StackProps {
  stage: string;
  environment: 'blue' | 'green';
  monitoringConfig: {
    errorThreshold: number;
    latencyThreshold: number;
    rollbackFunction: NodejsFunction;
  };
}

export class ProductionBlueGreenStack extends Stack {
  public readonly api: RestApi;
  public readonly healthCheckEndpoint: string;
  public readonly switchOverFunction: NodejsFunction;

  constructor(scope: Construct, id: string, props: BlueGreenStackProps) {
    super(scope, id, props);

    // Create the complete CDK infrastructure
    this.api = new RestApi(this, 'Api', {
      restApiName: `my-service-${props.stage}-${props.environment}`,
      description: `Production API - ${props.environment.toUpperCase()} environment`,
      deployOptions: {
        stageName: props.environment,
        // Aggressive throttling during migration for safety
        throttlingRateLimit: props.environment === 'green' ? 500 : 1000,
        throttlingBurstLimit: props.environment === 'green' ? 1000 : 2000,
        // Enhanced monitoring during migration
        metricsEnabled: true,
        loggingLevel: MethodLoggingLevel.INFO,
        dataTraceEnabled: true,
        tracingEnabled: true,
      },
    });

    // Deploy all Lambda functions and set up API routes
    // (createLambdaFunctions and setupApiRoutes are not shown in this listing)
    const functions = this.createLambdaFunctions(props);
    this.setupApiRoutes(functions);

    // Create health check endpoint for monitoring.
    // The handler also reads USERS_TABLE and calls DescribeTable, so the
    // table name and the matching IAM permission must be supplied as well.
    const healthCheckFn = new NodejsFunction(this, 'HealthCheckFunction', {
      entry: 'src/health/health-check.ts',
      handler: 'handler',
      environment: {
        ENVIRONMENT: props.environment,
        API_VERSION: process.env.API_VERSION || 'v1',
        DEPLOYMENT_TIME: new Date().toISOString(),
      },
    });

    const healthResource = this.api.root.addResource('health');
    healthResource.addMethod('GET', new LambdaIntegration(healthCheckFn));

    // api.url already ends with a trailing slash
    this.healthCheckEndpoint = `${this.api.url}health`;

    // Create production monitoring alarms
    this.createProductionAlarms(props);

    // Traffic switching function
    this.switchOverFunction = this.createSwitchOverFunction(props);

    // Tag all resources for identification
    Tags.of(this).add('Environment', props.environment);
    Tags.of(this).add('MigrationPhase', 'cdk-migration');
    Tags.of(this).add('DeploymentTime', new Date().toISOString());
    Tags.of(this).add('Version', process.env.COMMIT_SHA || 'latest');

    // Export critical information
    new CfnOutput(this, 'ApiEndpoint', {
      value: this.api.url,
      exportName: `${this.stackName}-api-endpoint`,
      description: `API endpoint for ${props.environment} environment`,
    });

    new CfnOutput(this, 'HealthCheckUrl', {
      value: this.healthCheckEndpoint,
      exportName: `${this.stackName}-health-check`,
      description: 'Health check endpoint for monitoring',
    });
  }

  private createProductionAlarms(props: BlueGreenStackProps) {
    // Error rate alarm - triggers rollback
    const errorAlarm = new Alarm(this, 'HighErrorRateAlarm', {
      metric: this.api.metricServerError({
        period: Duration.minutes(2),
        statistic: 'Sum',
      }),
      threshold: props.monitoringConfig.errorThreshold,
      evaluationPeriods: 2,
      comparisonOperator: ComparisonOperator.GREATER_THAN_THRESHOLD,
      alarmDescription: `High error rate detected in ${props.environment} environment`,
      treatMissingData: TreatMissingData.NOT_BREACHING,
    });

    // Latency alarm - triggers investigation (no automated action attached)
    new Alarm(this, 'HighLatencyAlarm', {
      metric: this.api.metricLatency({
        period: Duration.minutes(5),
        statistic: 'Average',
      }),
      threshold: props.monitoringConfig.latencyThreshold,
      evaluationPeriods: 3,
      alarmDescription: `High latency detected in ${props.environment} environment`,
    });

    // Connect the error alarm to automated rollback
    errorAlarm.addAlarmAction(
      new LambdaAction(props.monitoringConfig.rollbackFunction)
    );

    // Export alarm ARNs for external monitoring
    new CfnOutput(this, 'ErrorAlarmArn', {
      value: errorAlarm.alarmArn,
      exportName: `${this.stackName}-error-alarm`,
    });
  }

  private createSwitchOverFunction(props: BlueGreenStackProps) {
    return new NodejsFunction(this, 'TrafficSwitchFunction', {
      entry: 'src/deployment/traffic-switch.ts',
      handler: 'handler',
      timeout: Duration.minutes(5),
      environment: {
        CURRENT_ENVIRONMENT: props.environment,
        TARGET_ENVIRONMENT: props.environment === 'blue' ? 'green' : 'blue',
        HOSTED_ZONE_ID: process.env.HOSTED_ZONE_ID!,
        DOMAIN_NAME: process.env.API_DOMAIN!,
        SLACK_WEBHOOK_URL: process.env.SLACK_WEBHOOK_URL!,
      },
      initialPolicy: [
        new PolicyStatement({
          actions: ['route53:ChangeResourceRecordSets', 'route53:GetChange'],
          resources: ['*'],
        }),
      ],
    });
  }
}

// src/health/health-check.ts - Comprehensive health validation
import { APIGatewayProxyHandler } from 'aws-lambda';
import { DynamoDBClient, DescribeTableCommand } from '@aws-sdk/client-dynamodb';

const dynamoDB = new DynamoDBClient({});

interface HealthCheckResult {
  name: string;
  status: 'healthy' | 'unhealthy' | 'warning';
  responseTime?: number;
  details?: Record<string, string>;
}

export const handler: APIGatewayProxyHandler = async () => {
  const startTime = Date.now();
  const checks: HealthCheckResult[] = [];

  try {
    // Database connectivity check
    const tableCheck = await dynamoDB.send(new DescribeTableCommand({
      TableName: process.env.USERS_TABLE!,
    }));
    checks.push({
      name: 'database',
      status: tableCheck.Table?.TableStatus === 'ACTIVE' ? 'healthy' : 'unhealthy',
      responseTime: Date.now() - startTime,
    });

    // Memory usage check
    const memoryUsed = process.memoryUsage();
    checks.push({
      name: 'memory',
      status: memoryUsed.heapUsed < 100 * 1024 * 1024 ? 'healthy' : 'warning', // 100MB threshold
      details: {
        heapUsed: Math.round(memoryUsed.heapUsed / 1024 / 1024) + 'MB',
        heapTotal: Math.round(memoryUsed.heapTotal / 1024 / 1024) + 'MB',
      },
    });

    // Any check that is not 'healthy' (including 'warning') degrades the result
    const overallStatus = checks.every(check => check.status === 'healthy') ? 'healthy' : 'degraded';

    return {
      statusCode: overallStatus === 'healthy' ? 200 : 503,
      headers: {
        'Content-Type': 'application/json',
        'Cache-Control': 'no-cache',
      },
      body: JSON.stringify({
        status: overallStatus,
        environment: process.env.ENVIRONMENT,
        version: process.env.API_VERSION,
        deploymentTime: process.env.DEPLOYMENT_TIME,
        timestamp: new Date().toISOString(),
        responseTime: Date.now() - startTime,
        checks,
      }),
    };
  } catch (error) {
    // `error` is `unknown` under strict TypeScript settings
    const message = error instanceof Error ? error.message : String(error);
    return {
      statusCode: 503,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        status: 'unhealthy',
        error: message,
        timestamp: new Date().toISOString(),
      }),
    };
  }
};
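
A single 200 response from the health endpoint is weak evidence that the green environment is ready for traffic. One way to use the handler's output is to gate the switch on several consecutive healthy polls. The sketch below is a hypothetical gate, not part of the original deployment code: the `GateOptions` names and thresholds are assumptions, and polling the endpoint is left to the caller.

```typescript
// Subset of the JSON body returned by the health check handler above.
interface HealthReport {
  status: 'healthy' | 'degraded' | 'unhealthy';
  responseTime: number; // milliseconds
}

// Hypothetical gate policy; tune both values per service.
interface GateOptions {
  requiredConsecutive: number; // healthy polls required in a row
  maxResponseTimeMs: number;   // slowest acceptable health response
}

/**
 * Returns true only if the most recent `requiredConsecutive` reports
 * are all healthy and within the response-time budget.
 */
function isReadyForTraffic(history: HealthReport[], options: GateOptions): boolean {
  if (options.requiredConsecutive < 1) {
    throw new RangeError('requiredConsecutive must be at least 1');
  }
  if (history.length < options.requiredConsecutive) {
    return false; // not enough evidence yet
  }
  const recent = history.slice(-options.requiredConsecutive);
  return recent.every(
    report =>
      report.status === 'healthy' &&
      report.responseTime <= options.maxResponseTimeMs,
  );
}
```

A deployment script could append each polled report to `history` and call `isReadyForTraffic` before invoking the traffic switch function.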

2. Strangler Fig Pattern

When to use: Large applications requiring zero-downtime migration.

typescript
// lib/constructs/migration/traffic-splitter.ts
import { Construct } from 'constructs';
import { RestApi, Deployment, Stage, CfnStage } from 'aws-cdk-lib/aws-apigateway';
import { Alarm } from 'aws-cdk-lib/aws-cloudwatch';

export class TrafficSplitter extends Construct {
  constructor(scope: Construct, id: string, props: {
    // Kept for reference only: API Gateway canary settings split traffic
    // between deployments of newApi, not between two different APIs.
    legacyApiId: string;
    newApi: RestApi;
    trafficPercentageToNew: number;
  }) {
    super(scope, id);

    // Create canary deployment
    const deployment = new Deployment(this, 'CanaryDeployment', {
      api: props.newApi,
      description: `Canary deployment ${new Date().toISOString()}`,
    });

    const stage = new Stage(this, 'CanaryStage', {
      deployment,
      stageName: 'canary',
    });

    // The L2 Stage props do not expose canary settings, so they are set
    // on the underlying CfnStage resource through an escape hatch.
    const cfnStage = stage.node.defaultChild as CfnStage;
    cfnStage.canarySetting = {
      percentTraffic: props.trafficPercentageToNew,
      useStageCache: false,
    };

    // CloudWatch alarm scoped to the canary stage
    new Alarm(this, 'CanaryErrorAlarm', {
      metric: stage.metricServerError(),
      threshold: 5,
      evaluationPeriods: 2,
    });
  }
}

3. Blue-Green Deployment

When to use: When you need instant rollback capabilities.

typescript
// lib/stacks/blue-green-stack.ts
import { Stack, StackProps, Tags, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';

export class BlueGreenStack extends Stack {
  constructor(scope: Construct, id: string, props: StackProps & {
    stage: string;
    version: 'blue' | 'green';
  }) {
    super(scope, id, props);

    // Routes and integrations are omitted; a RestApi needs at least one method to synthesize
    const api = new RestApi(this, 'Api', {
      restApiName: `my-service-${props.stage}-${props.version}`,
      deployOptions: {
        stageName: props.version,
      },
    });

    // Tag resources for easy identification
    Tags.of(this).add('Deployment', props.version);
    Tags.of(this).add('Version', process.env.COMMIT_SHA || 'latest');

    // Export API endpoint
    new CfnOutput(this, 'ApiEndpoint', {
      value: api.url,
      exportName: `${this.stackName}-endpoint`,
    });
  }
}

// deployment-scripts/blue-green-switch.ts
import { Route53Client, ChangeResourceRecordSetsCommand } from '@aws-sdk/client-route53';

export async function switchTraffic(targetVersion: 'blue' | 'green') {
  const route53 = new Route53Client({});

  await route53.send(new ChangeResourceRecordSetsCommand({
    HostedZoneId: process.env.HOSTED_ZONE_ID,
    ChangeBatch: {
      Changes: [{
        Action: 'UPSERT',
        ResourceRecordSet: {
          Name: 'api.example.com',
          Type: 'CNAME',
          TTL: 60, // rollback speed is bounded by this TTL
          ResourceRecords: [{
            // Placeholder host name. In practice the target should be the
            // domain name of an API Gateway custom domain for this environment.
            Value: `api-${targetVersion}.execute-api.region.amazonaws.com`,
          }],
        },
      }],
    },
  }));
}
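
The switch above moves all traffic at once. If a gradual cutover is preferred, Route 53 weighted records can shift a percentage at a time. The following sketch only builds the `ChangeBatch` payload as a plain object; it is an illustration rather than the script used in the migration. The target host names are hypothetical, and it assumes each environment sits behind an API Gateway custom domain, because a CNAME pointed straight at an `execute-api` host name is not served under a different host name.

```typescript
// Minimal structural types mirroring the Route 53 ChangeBatch shape,
// so the sketch has no SDK dependency.
interface WeightedRecordChange {
  Action: 'UPSERT';
  ResourceRecordSet: {
    Name: string;
    Type: 'CNAME';
    SetIdentifier: string; // distinguishes weighted records with the same name
    Weight: number;        // Route 53 accepts integers from 0 to 255
    TTL: number;
    ResourceRecords: { Value: string }[];
  };
}

interface EnvironmentTargets {
  blue: string;  // hypothetical custom-domain target for the blue API
  green: string; // hypothetical custom-domain target for the green API
}

/** Builds a ChangeBatch that sends `greenPercent` of traffic to green. */
function buildWeightedChangeBatch(
  recordName: string,
  greenPercent: number,
  targets: EnvironmentTargets,
): { Changes: WeightedRecordChange[] } {
  if (!Number.isInteger(greenPercent) || greenPercent < 0 || greenPercent > 100) {
    throw new RangeError('greenPercent must be an integer between 0 and 100');
  }
  const weights = { blue: 100 - greenPercent, green: greenPercent };
  const environments: Array<'blue' | 'green'> = ['blue', 'green'];

  const changes = environments.map((env): WeightedRecordChange => ({
    Action: 'UPSERT',
    ResourceRecordSet: {
      Name: recordName,
      Type: 'CNAME',
      SetIdentifier: env,
      Weight: weights[env],
      TTL: 60,
      ResourceRecords: [{ Value: targets[env] }],
    },
  }));
  return { Changes: changes };
}
```

The returned object can be passed as `ChangeBatch` to `ChangeResourceRecordSetsCommand`. Calling it with `greenPercent` set to 0 restores the pre-migration routing, which doubles as the rollback path.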

Effective Testing Strategy for Production

Initial testing approaches were comprehensive but missed critical production scenarios. The revised strategy focused on actual production failure modes.

Testing Reality

Standard testing approach: Unit tests, integration tests, load tests - all passing.

Production failure modes discovered:

  • CloudFormation template size exceeding the service quota (51,200 bytes for an inline template body, 1 MB for a template uploaded to S3)
  • API Gateway timeout conflicts with Lambda timeout settings
  • DynamoDB throttling under peak traffic
  • JWT validation performance degradation at scale
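
The first two failure modes in this list are static properties of the synthesized stack, so they can be checked before deployment. The sketch below is one hypothetical way to do that in CI. The quota constants reflect AWS documentation as understood at the time of writing and should be verified, and the input shapes (`TemplateSummary`, `FunctionTimeout`) are assumptions rather than CDK types.

```typescript
// Quotas as documented by AWS at the time of writing; verify before relying on them.
const MAX_TEMPLATE_BYTES_S3 = 1000000;          // 1 MB S3 template quota, rounded down conservatively
const MAX_RESOURCES_PER_STACK = 500;            // CloudFormation resources per stack
const API_GATEWAY_DEFAULT_TIMEOUT_SECONDS = 29; // default REST API integration timeout

// Hypothetical summary extracted from a synthesized template (for example cdk.out/*.template.json).
interface TemplateSummary {
  sizeBytes: number;
  resourceCount: number;
}

// Hypothetical description of a Lambda function's timeout configuration.
interface FunctionTimeout {
  name: string;
  timeoutSeconds: number;
  behindApiGateway: boolean;
}

/** Returns a human-readable message for every limit that would be exceeded. */
function findLimitViolations(
  template: TemplateSummary,
  functions: FunctionTimeout[],
): string[] {
  const violations: string[] = [];

  if (template.sizeBytes > MAX_TEMPLATE_BYTES_S3) {
    violations.push(
      `Template is ${template.sizeBytes} bytes (limit ${MAX_TEMPLATE_BYTES_S3}); split the stack`,
    );
  }
  if (template.resourceCount > MAX_RESOURCES_PER_STACK) {
    violations.push(
      `Template has ${template.resourceCount} resources (limit ${MAX_RESOURCES_PER_STACK})`,
    );
  }
  for (const fn of functions) {
    // A Lambda timeout longer than the gateway timeout means the client
    // receives a gateway error while the function is still running.
    if (fn.behindApiGateway && fn.timeoutSeconds > API_GATEWAY_DEFAULT_TIMEOUT_SECONDS) {
      violations.push(
        `${fn.name}: Lambda timeout ${fn.timeoutSeconds}s exceeds the API Gateway timeout of ${API_GATEWAY_DEFAULT_TIMEOUT_SECONDS}s`,
      );
    }
  }
  return violations;
}
```

Failing the build when the returned array is non-empty surfaces these problems at review time instead of during a deployment window.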

Production-Oriented Testing Strategy

This testing approach identifies critical issues before production deployment:

typescript
// test/infrastructure/api-stack.test.ts
import { Template, Match } from 'aws-cdk-lib/assertions';
import { App } from 'aws-cdk-lib';
import { ApiStack } from '../../lib/stacks/api-stack';

// testConfig is a project-specific fixture, assumed to be defined elsewhere
describe('ApiStack', () => {
  let template: Template;

  beforeAll(() => {
    const app = new App();
    const stack = new ApiStack(app, 'TestStack', {
      config: testConfig,
    });
    template = Template.fromStack(stack);
  });

  test('Lambda functions have correct runtime', () => {
    template.allResourcesProperties('AWS::Lambda::Function', {
      Runtime: 'nodejs20.x',
    });
  });

  test('API Gateway has throttling enabled', () => {
    // Stage-level throttling is rendered under MethodSettings in CloudFormation
    template.hasResourceProperties('AWS::ApiGateway::Stage', {
      MethodSettings: Match.arrayWith([
        Match.objectLike({
          ThrottlingRateLimit: Match.anyValue(),
          ThrottlingBurstLimit: Match.anyValue(),
        }),
      ]),
    });
  });

  test('DynamoDB tables have point-in-time recovery', () => {
    template.allResourcesProperties('AWS::DynamoDB::Table', {
      PointInTimeRecoverySpecification: {
        PointInTimeRecoveryEnabled: true,
      },
    });
  });
});

Integration Testing

typescript
// test/integration/api.test.ts
import { CloudFormationClient, ListExportsCommand } from '@aws-sdk/client-cloudformation';
import axios from 'axios';

// Project-specific helper, assumed to be provided by the test setup
declare function getTestAuthToken(): Promise<string>;

describe('API Integration Tests', () => {
  let apiEndpoint: string;
  let authToken: string;

  beforeAll(async () => {
    // Get deployed API endpoint
    const cf = new CloudFormationClient({});
    const exports = await cf.send(new ListExportsCommand({}));
    const endpoint = exports.Exports?.find(
      e => e.Name === 'ApiStack-endpoint'
    )?.Value;
    if (!endpoint) {
      throw new Error('CloudFormation export ApiStack-endpoint not found');
    }
    // The exported URL ends with a slash; strip it to avoid "//" in request paths
    apiEndpoint = endpoint.replace(/\/$/, '');

    // Get auth token
    authToken = await getTestAuthToken();
  });

  test('Health check endpoint', async () => {
    const response = await axios.get(`${apiEndpoint}/health`);
    expect(response.status).toBe(200);
    // The handler returns additional fields, so match only on status
    expect(response.data).toMatchObject({ status: 'healthy' });
  });

  test('Create and retrieve user', async () => {
    // Create user (placeholder email address)
    const createResponse = await axios.post(
      `${apiEndpoint}/users`,
      { name: 'Test User', email: 'test.user@example.com' },
      { headers: { Authorization: `Bearer ${authToken}` } }
    );
    expect(createResponse.status).toBe(201);

    // Retrieve user
    const userId = createResponse.data.userId;
    const getResponse = await axios.get(
      `${apiEndpoint}/users/${userId}`,
      { headers: { Authorization: `Bearer ${authToken}` } }
    );
    expect(getResponse.data.name).toBe('Test User');
  });
});

Load Testing

javascript
// test/load/k6-script.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up
    { duration: '5m', target: 100 },  // Sustain
    { duration: '2m', target: 200 },  // Spike
    { duration: '5m', target: 200 },  // Sustain spike
    { duration: '2m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms
    errors: ['rate<0.01'],            // Error rate under 1%
  },
};

export default function() {
  const response = http.get(`${__ENV.API_URL}/users`);

  const success = check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  errorRate.add(!success);
  sleep(1);
}
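
The `p(95)<500` threshold is an absolute budget. During a blue-green migration it can also help to compare the green environment against the blue baseline, so that a regression is caught even while both stay under 500 ms. The sketch below is an assumed post-processing step over exported latency samples. It uses the nearest-rank percentile method, so its results may differ slightly from the figures k6 reports.

```typescript
/** Nearest-rank percentile of a list of latency samples (milliseconds). */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('percentile requires at least one sample');
  }
  if (p <= 0 || p > 100) {
    throw new RangeError('p must be in the range (0, 100]');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

/**
 * Returns true when green's p95 latency is no more than
 * `maxRegressionPercent` slower than blue's p95 latency.
 */
function withinLatencyBudget(
  blueSamples: number[],
  greenSamples: number[],
  maxRegressionPercent: number,
): boolean {
  const blueP95 = percentile(blueSamples, 95);
  const greenP95 = percentile(greenSamples, 95);
  return greenP95 <= blueP95 * (1 + maxRegressionPercent / 100);
}
```

For example, `withinLatencyBudget(blue, green, 10)` tolerates a green p95 up to 10% above the blue p95; the 10% figure is an arbitrary illustration, not a value from the migration.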

Rollback Procedures

Automated Rollback

typescript
// lib/constructs/deployment/safe-deployment.ts
import { Construct } from 'constructs';
import { Alarm, TreatMissingData } from 'aws-cdk-lib/aws-cloudwatch';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';
import { IFunction } from 'aws-cdk-lib/aws-lambda';
import { Topic } from 'aws-cdk-lib/aws-sns';
import { SnsAction } from 'aws-cdk-lib/aws-cloudwatch-actions';
import { LambdaSubscription } from 'aws-cdk-lib/aws-sns-subscriptions';
import { CfnOutput } from 'aws-cdk-lib';

export class SafeDeployment extends Construct {
  constructor(scope: Construct, id: string, props: {
    api: RestApi;
    alarmThreshold: number;
    rollbackFunction: IFunction;
  }) {
    super(scope, id);

    // Create CloudWatch alarm
    const alarm = new Alarm(this, 'DeploymentAlarm', {
      metric: props.api.metricServerError(),
      threshold: props.alarmThreshold,
      evaluationPeriods: 2,
      treatMissingData: TreatMissingData.NOT_BREACHING,
    });

    // SNS topic for notifications
    const topic = new Topic(this, 'RollbackTopic');
    alarm.addAlarmAction(new SnsAction(topic));

    // Lambda for automated rollback
    topic.addSubscription(
      new LambdaSubscription(props.rollbackFunction)
    );

    // Manual rollback command
    new CfnOutput(this, 'RollbackCommand', {
      value: `aws lambda invoke --function-name ${props.rollbackFunction.functionName} --payload '{"action":"rollback"}' response.json`,
    });
  }
}

// src/deployment/rollback-handler.ts
import { SNSEvent } from 'aws-lambda';
import { CodeDeployClient, StopDeploymentCommand } from '@aws-sdk/client-codedeploy';

// Create the client once so warm invocations reuse it
const codedeploy = new CodeDeployClient({});

// Helper functions
async function switchTraffic(version: string): Promise<void> {
  // Implementation for traffic switching
}

async function notifySlack(message: { channel: string; message: string }): Promise<void> {
  // Implementation for Slack notification
}

export const handler = async (event: SNSEvent) => {
  console.log('Initiating rollback:', JSON.stringify(event, null, 2));

  // Stop the current deployment, if one is in flight
  const deploymentId = process.env.CURRENT_DEPLOYMENT_ID;
  if (deploymentId) {
    await codedeploy.send(new StopDeploymentCommand({
      deploymentId,
      autoRollbackEnabled: true,
    }));
  }

  // Revert traffic to previous version
  await switchTraffic('blue'); // Assuming green was failing

  // Notify team
  await notifySlack({
    channel: '#alerts',
    message: 'Automatic rollback initiated due to high error rate',
  });
};
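
The handler above rolls back on any SNS message it receives, including the notification sent when an alarm returns to `OK` if such actions are configured on the same topic. A guard that inspects the alarm payload avoids that. The sketch below parses the JSON that CloudWatch publishes to SNS for alarm state changes, using only the `AlarmName`, `NewStateValue`, and `NewStateReason` fields. The prefix-matching convention is an assumption for illustration, not something the original handler does.

```typescript
// Subset of the CloudWatch alarm notification delivered in the SNS message body.
interface AlarmNotification {
  AlarmName: string;
  NewStateValue: 'OK' | 'ALARM' | 'INSUFFICIENT_DATA';
  NewStateReason: string;
}

/** Parses the SNS message body; returns undefined if it is not an alarm notification. */
function parseAlarmNotification(message: string): AlarmNotification | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(message);
  } catch {
    return undefined; // not JSON, so not an alarm notification
  }
  if (typeof parsed !== 'object' || parsed === null) return undefined;

  const candidate = parsed as Record<string, unknown>;
  const state = candidate.NewStateValue;
  if (
    typeof candidate.AlarmName !== 'string' ||
    (state !== 'OK' && state !== 'ALARM' && state !== 'INSUFFICIENT_DATA')
  ) {
    return undefined;
  }
  return {
    AlarmName: candidate.AlarmName,
    NewStateValue: state,
    NewStateReason: typeof candidate.NewStateReason === 'string' ? candidate.NewStateReason : '',
  };
}

/**
 * Rolls back only when an alarm whose name starts with `alarmNamePrefix`
 * (a hypothetical naming convention) has entered the ALARM state.
 */
function shouldRollback(snsMessage: string, alarmNamePrefix: string): boolean {
  const alarm = parseAlarmNotification(snsMessage);
  return (
    alarm !== undefined &&
    alarm.NewStateValue === 'ALARM' &&
    alarm.AlarmName.startsWith(alarmNamePrefix)
  );
}
```

Inside the handler, this would be called with `event.Records[0].Sns.Message` before stopping the deployment or switching traffic.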

Performance Optimization

Lambda Performance Tuning

typescript
// lib/constructs/performance/optimized-function.ts
import { Construct } from 'constructs';
import { NodejsFunction, NodejsFunctionProps } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Architecture, CfnFunction, CfnAlias } from 'aws-cdk-lib/aws-lambda';

// Base function interface
interface ServerlessFunctionProps extends NodejsFunctionProps {
  config: {
    stage: string;
  };
}

export class OptimizedFunction extends NodejsFunction {
  constructor(scope: Construct, id: string, props: ServerlessFunctionProps & {
    enableProvisioning?: boolean;
    enableSnapStart?: boolean;
  }) {
    super(scope, id, {
      ...props,
      memorySize: props.memorySize || 1024,
      architecture: Architecture.ARM_64, // Better price/performance
      environment: {
        ...props.environment,
        // 896 MB heap assumes the default 1024 MB memorySize
        NODE_OPTIONS: '--enable-source-maps --max-old-space-size=896',
        AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1',
      },
    });

    // Provisioned concurrency for critical functions
    if (props.enableProvisioning && props.config.stage === 'prod') {
      const version = this.currentVersion;

      new CfnAlias(this, 'ProvisionedAlias', {
        functionName: this.functionName,
        functionVersion: version.version,
        name: 'provisioned',
        provisionedConcurrencyConfig: {
          provisionedConcurrentExecutions: 5,
        },
      });
    }

    // SnapStart was introduced for Java functions and is not offered for
    // Node.js runtimes at the time of writing, so leave this flag off for
    // NodejsFunction. It is kept for variants of this class on a supported runtime.
    if (props.enableSnapStart) {
      const cfnFunction = this.node.defaultChild as CfnFunction;
      cfnFunction.snapStart = {
        applyOn: 'PublishedVersions',
      };
    }
  }
}
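
The hard-coded `--max-old-space-size=896` is 87.5% of the default 1024 MB, so it becomes wrong as soon as a caller overrides `memorySize`. A small helper that derives the heap limit from the configured memory keeps the two in step. This is a sketch: the 0.875 fraction is inferred from the values in the listing above, not a documented recommendation.

```typescript
/**
 * Builds a NODE_OPTIONS value whose V8 heap limit scales with the
 * Lambda memory size. `heapFraction` leaves headroom for non-heap memory.
 */
function nodeOptionsFor(memorySizeMb: number, heapFraction = 0.875): string {
  if (memorySizeMb <= 0) {
    throw new RangeError('memorySizeMb must be positive');
  }
  if (heapFraction <= 0 || heapFraction > 1) {
    throw new RangeError('heapFraction must be in the range (0, 1]');
  }
  const heapMb = Math.floor(memorySizeMb * heapFraction);
  return `--enable-source-maps --max-old-space-size=${heapMb}`;
}

console.log(nodeOptionsFor(1024)); // --enable-source-maps --max-old-space-size=896
```

In `OptimizedFunction`, the environment entry could then read `NODE_OPTIONS: nodeOptionsFor(props.memorySize || 1024)`.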

API Gateway Optimization

typescript
// lib/constructs/performance/cached-api.ts
import { Construct } from 'constructs';
import { Duration } from 'aws-cdk-lib';
import { RestApi, RestApiProps } from 'aws-cdk-lib/aws-apigateway';

export class CachedApi extends RestApi {
  constructor(scope: Construct, id: string, props: RestApiProps & {
    cacheConfig?: {
      ttlMinutes: number;
      encrypted: boolean;
      clusterSize: string;
    };
  }) {
    super(scope, id, {
      ...props,
      deployOptions: {
        ...props.deployOptions,
        cachingEnabled: true,
        cacheClusterEnabled: true,
        cacheClusterSize: props.cacheConfig?.clusterSize || '0.5',
        cacheDataEncrypted: props.cacheConfig?.encrypted ?? true,
        cacheTtl: Duration.minutes(props.cacheConfig?.ttlMinutes || 5),
        methodOptions: {
          '/*/*': {
            cachingEnabled: true,
            // Cache key parameters (for example method.request.path.proxy and
            // method.request.querystring.page) are not a stage method option;
            // set them with cacheKeyParameters on each integration instead.
          },
        },
      },
    });
  }
}

Monitoring and Observability

Comprehensive Monitoring Stack

typescript
// lib/stacks/monitoring-stack.ts
import { Stack, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';
import { Dashboard, GraphWidget, Alarm } from 'aws-cdk-lib/aws-cloudwatch';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

// Interface for ApiStack
interface ApiStack extends Stack {
  api: RestApi;
  functions: NodejsFunction[];
}

export class MonitoringStack extends Stack {
  constructor(scope: Construct, id: string, props: {
    apiStack: ApiStack;
    stage: string;
  }) {
    super(scope, id);

    // Create dashboard
    const dashboard = new Dashboard(this, 'ServiceDashboard', {
      dashboardName: `my-service-${props.stage}`,
    });

    // API metrics (RestApi exposes metricClientError / metricServerError
    // for 4XX and 5XX responses)
    dashboard.addWidgets(
      new GraphWidget({
        title: 'API Requests',
        left: [props.apiStack.api.metricCount()],
        right: [props.apiStack.api.metricLatency()],
      }),
      new GraphWidget({
        title: 'API Errors',
        left: [
          props.apiStack.api.metricClientError(),
          props.apiStack.api.metricServerError(),
        ],
      })
    );

    // Lambda metrics. fn.functionName is an unresolved token at synth time,
    // so the construct id is used for readable widget titles.
    const lambdaWidgets = props.apiStack.functions.map(fn =>
      new GraphWidget({
        title: `${fn.node.id} Performance`,
        left: [fn.metricInvocations()],
        right: [fn.metricDuration()],
      })
    );
    dashboard.addWidgets(...lambdaWidgets);

    // Alarms
    this.createAlarms(props.apiStack);
  }

  private createAlarms(apiStack: ApiStack) {
    // API Gateway alarms
    new Alarm(this, 'HighErrorRate', {
      metric: apiStack.api.metricServerError({
        period: Duration.minutes(5),
        statistic: 'Sum',
      }),
      threshold: 10,
      evaluationPeriods: 2,
    });

    // Lambda alarms
    apiStack.functions.forEach(fn => {
      new Alarm(this, `${fn.node.id}Throttles`, {
        metric: fn.metricThrottles(),
        threshold: 5,
        evaluationPeriods: 2,
      });

      new Alarm(this, `${fn.node.id}Errors`, {
        metric: fn.metricErrors(),
        threshold: 10,
        evaluationPeriods: 2,
      });
    });
  }
}

Distributed Tracing

typescript
// lib/constructs/observability/tracing.ts
import { Construct } from 'constructs';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { Tracing } from 'aws-cdk-lib/aws-lambda';
import { OptimizedFunction } from '../performance/optimized-function';

// Reuse the props type accepted by OptimizedFunction's constructor
type OptimizedFunctionProps = ConstructorParameters<typeof OptimizedFunction>[2];

export class TracedFunction extends OptimizedFunction {
  constructor(scope: Construct, id: string, props: OptimizedFunctionProps) {
    super(scope, id, {
      ...props,
      tracing: Tracing.ACTIVE,
      environment: {
        ...props.environment,
        // _X_AMZN_TRACE_ID is a reserved variable set by the Lambda runtime,
        // so it must not be configured here.
        AWS_XRAY_CONTEXT_MISSING: 'LOG_ERROR',
        AWS_XRAY_LOG_LEVEL: 'error',
      },
    });

    // Tracing.ACTIVE already grants X-Ray write access to the role;
    // the statement is kept explicit for clarity.
    this.addToRolePolicy(new PolicyStatement({
      actions: [
        'xray:PutTraceSegments',
        'xray:PutTelemetryRecords',
      ],
      resources: ['*'],
    }));
  }
}

// src/libs/tracing.ts
import { Tracer } from '@aws-lambda-powertools/tracer';

const tracer = new Tracer({
  serviceName: process.env.SERVICE_NAME || 'my-service',
});

// Method decorator that wraps an async method in an X-Ray subsegment
export function traceMethod(
  target: any,
  propertyKey: string,
  descriptor: PropertyDescriptor
) {
  const originalMethod = descriptor.value;

  descriptor.value = async function(...args: any[]) {
    const segment = tracer.getSegment();
    const subsegment = segment?.addNewSubsegment(propertyKey);

    try {
      const result = await originalMethod.apply(this, args);
      subsegment?.close();
      return result;
    } catch (error) {
      subsegment?.addError(error as Error);
      subsegment?.close();
      throw error;
    }
  };

  return descriptor;
}

Migration Checklist

Pre-Migration

  • Inventory current resources

    • Document all Lambda functions
    • List API Gateway endpoints
    • Map DynamoDB tables and indexes
    • Identify custom resources
    • Note all environment variables and secrets
  • Assess dependencies

    • Review Serverless plugins in use
    • Check for custom CloudFormation resources
    • Identify external service integrations
    • Document IAM roles and policies
  • Plan migration strategy

    • Choose migration pattern (big bang, strangler fig, blue-green)
    • Define rollback procedures
    • Set success criteria
    • Schedule maintenance windows if needed

During Migration

  • Set up CDK project

    • Initialize repository with CDK
    • Configure environments
    • Set up CI/CD pipelines
    • Implement infrastructure tests
  • Migrate components

    • Start with stateless resources
    • Import existing stateful resources
    • Migrate Lambda functions
    • Set up API Gateway
    • Configure authentication
  • Testing

    • Run unit tests
    • Execute integration tests
    • Perform load testing
    • Validate security configurations

Post-Migration

  • Monitor and optimize

    • Set up comprehensive monitoring
    • Configure alerts
    • Review performance metrics
    • Optimize cold starts
  • Documentation

    • Update runbooks
    • Document new deployment procedures
    • Create architecture diagrams
    • Train team on CDK
  • Cleanup

    • Remove old Serverless Framework resources
    • Delete unused IAM roles
    • Clean up S3 deployment buckets
    • Update DNS records

Common Pitfalls and Solutions

1. Resource Naming Conflicts

typescript
// Avoid hardcoded names
// Bad
const table = new Table(this, 'Table', {
  tableName: 'users-table', // Will conflict if exists
  // key schema omitted
});

// Good
const table = new Table(this, 'Table', {
  tableName: `${props.serviceName}-${props.stage}-users`,
  // key schema omitted
});
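
Building names from the service and stage also makes it easy to validate them in one place before CloudFormation rejects them. The helper below is a hypothetical sketch of that idea. It checks the DynamoDB table naming rules (3 to 255 characters drawn from letters, digits, underscore, hyphen, and dot); other resource types have different rules and would need their own checks.

```typescript
// DynamoDB table names: 3-255 characters from a-z, A-Z, 0-9, '_', '-', '.'
const DYNAMODB_TABLE_NAME = /^[A-Za-z0-9_.-]{3,255}$/;

/**
 * Builds a stage-scoped table name such as "orders-prod-users" and
 * fails fast if the result is not a valid DynamoDB table name.
 */
function tableNameFor(serviceName: string, stage: string, resource: string): string {
  const name = [serviceName, stage, resource].join('-');
  if (!DYNAMODB_TABLE_NAME.test(name)) {
    throw new Error(`Invalid DynamoDB table name: "${name}"`);
  }
  return name;
}

console.log(tableNameFor('orders', 'prod', 'users')); // orders-prod-users
```

Using one helper for every table keeps blue and green stacks from colliding as long as the stage or environment is part of the name.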

2. State Management

typescript
// Separate stateful and stateless resources
import { App } from 'aws-cdk-lib';

const app = new App();

// Stateful resources in separate stack
const dataStack = new DataStack(app, 'DataStack', {
  terminationProtection: true,
});

// Stateless resources can be updated freely
const apiStack = new ApiStack(app, 'ApiStack', {
  tables: dataStack.tables,
});

3. Environment Variable Migration

typescript
// Map Serverless variables to CDK
// (written inside a Stack or construct, where `this` and `props` are available)
import { Stack, Fn } from 'aws-cdk-lib';

const legacyMappings: Record<string, string> = {
  '${self:service}': props.serviceName,
  '${opt:stage}': props.stage,
  '${opt:region}': Stack.of(this).region,
  '${cf:OtherStack.Output}': Fn.importValue('OtherStack-Output'),
};
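
A mapping table like this is most useful when it is applied mechanically to the strings copied out of `serverless.yml`. The sketch below substitutes each `${...}` reference using such a table and throws on anything it cannot resolve, so an unmapped variable never reaches a deployed environment. It is a simplified illustration: it does not handle Serverless default values (`${opt:stage, 'dev'}`) or nested references, and the sample values are made up.

```typescript
/**
 * Replaces every ${...} reference in `input` using `mappings`, whose keys
 * are the full Serverless expressions (for example '${self:service}').
 * Throws if any reference has no mapping.
 */
function resolveLegacyVariables(
  input: string,
  mappings: Record<string, string>,
): string {
  const unresolved: string[] = [];

  const output = input.replace(/\$\{[^}]+\}/g, match => {
    const value = mappings[match];
    if (value === undefined) {
      unresolved.push(match);
      return match; // leave the reference in place for the error message
    }
    return value;
  });

  if (unresolved.length > 0) {
    throw new Error(`Unmapped Serverless variables: ${unresolved.join(', ')}`);
  }
  return output;
}

// Hypothetical values standing in for props.serviceName and props.stage.
const sampleMappings: Record<string, string> = {
  '${self:service}': 'orders',
  '${opt:stage}': 'prod',
};
console.log(resolveLegacyVariables('${self:service}-${opt:stage}-users', sampleMappings));
```

Values that resolve to CDK tokens, such as `Fn.importValue(...)`, pass through unchanged because the function only performs string substitution.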

Migration Results After 4 Months

The CDK migration has been running in production for four months. Here are the measured outcomes:

Performance Improvements

  • API response time: 1.4s → 0.8s average (43% improvement)
  • Cold start reduction: 850ms → 320ms (62% improvement)
  • Authorization latency: 400ms → 12ms (97% improvement)
  • Database query time: 120ms → 45ms (optimized connection pooling)

Cost Optimization

  • Monthly AWS costs: 32% reduction achieved
  • Lambda costs: Reduced through better memory optimization
  • DynamoDB costs: Optimized through improved query patterns
  • CloudWatch costs: Reduced via structured logging

Operational Excellence

  • Deployment time: 45 minutes → 12 minutes
  • Rollback time: 4 hours → 30 seconds (blue-green deployment)
  • Security incidents: 2-3/month → 0/month (since the migration)
  • Infrastructure bugs: 8/month → 0.5/month (95% reduction)

Developer Experience

  • Onboarding time: 2 weeks → 2 hours (documentation + type safety)
  • Feature delivery: 2 weeks → 1 week (faster development cycle)
  • Bug investigation: 3 hours → 20 minutes (better observability)
  • Cross-team dependencies: 5 teams → 1 team (self-service infrastructure)

Operational Impact

  • Service continuity: Zero-downtime migration achieved
  • Security compliance: Met all enterprise requirements
  • Service quality: No migration-related issues
  • Team efficiency: Improved deployment confidence and speed

Key Migration Insights

The production migration revealed several important patterns:

1. Blue-Green Deployment for Production Safety

Insight: Blue-green deployment provides the most reliable production migration path. Result: Zero-downtime migration with instant rollback capability.

2. Comprehensive Health Check Requirements

Insight: Basic health checks miss critical failure modes. Result: Thorough validation systems prevent production issues.

3. Production-Oriented Testing Approach

Insight: Unit tests alone don't catch infrastructure limits or edge cases. Result: Production-focused testing identifies critical issues before deployment.

4. Performance Optimization Compounds

Insight: CDK enables optimizations across all stack layers. Result: 43% overall performance improvement achieved.

5. Type Safety in Infrastructure Code

Insight: TypeScript catches configuration errors at compile time. Result: 95% reduction in infrastructure-related bugs.

6. Monitoring as Risk Mitigation

Insight: Comprehensive monitoring enables safe migrations. Result: Automated rollback systems prevent incidents.

7. Team Training Requirements

Insight: CDK requires different conceptual models than Serverless Framework. Result: Proper training enables significantly faster development.

Complete Migration Checklist

Week 1-2: Foundation

  • Set up CDK development environment
  • Create production-grade project structure
  • Implement comprehensive testing strategy
  • Train team on CDK patterns and TypeScript

Week 3-4: Infrastructure Migration

  • Import existing stateful resources (DynamoDB, etc.)
  • Migrate Lambda functions with performance optimization
  • Set up API Gateway with proper monitoring
  • Implement authentication and authorization

Week 5-6: Security and Compliance

  • Audit and fix IAM permissions (least privilege)
  • Implement secrets management
  • Set up comprehensive logging and monitoring
  • Pass security audit (if required)

Week 7-8: Testing and Preparation

  • Create blue-green deployment infrastructure
  • Implement automated rollback procedures
  • Run production-mirror load testing
  • Validate health check comprehensiveness

Week 9-12: Migration Execution

  • Deploy green environment (CDK)
  • Run parallel traffic validation
  • Execute traffic switch with monitoring
  • Clean up legacy Serverless Framework resources

Post-Migration: Optimization

  • Performance tuning based on production metrics
  • Cost optimization (memory, provisioning, caching)
  • Documentation and runbook updates
  • Team retrospective and lessons learned

When to Stay with Serverless Framework

Certain scenarios benefit more from Serverless Framework than CDK:

  1. Simple CRUD applications with minimal customization needs
  2. Proof-of-concept projects that need rapid prototyping
  3. Teams without TypeScript experience and no bandwidth for training
  4. Applications with heavy plugin dependencies that don't exist in CDK
  5. Organizations with YAML-only infrastructure policies

Conclusion: Infrastructure as Actual Code

This migration fundamentally changed infrastructure management approaches. The shift from YAML configuration to TypeScript code brings compilation, testing, and validation to infrastructure.

The migration process involved multiple iterations and significant effort. The measurable results include: 43% performance improvement, 32% cost reduction, and 95% fewer infrastructure bugs.

The key benefit is increased deployment confidence through better tooling, testing, and rollback capabilities.

CDK isn't just Infrastructure as Code; it's Infrastructure as Actual Code, with real programming languages, real testing frameworks, and real software engineering practices.

If you're managing production serverless applications, consider this migration path. The learning curve is steep, but the productivity gains are transformational.

Welcome to the future of serverless infrastructure. It's written in TypeScript, tested in CI/CD, and deployed with confidence.
