Event-Driven Architecture Tools: A Comprehensive Guide to Kafka, SQS, EventBridge and Cloud Alternatives

Working with event-driven systems shows that choosing the right tool is less about hype and more about understanding trade-offs. Whether dealing with a simple queue or a complex event mesh, each tool has its sweet spot.

Let's dive into a comprehensive comparison of event-driven tools, message patterns, and their cloud equivalents.

Message Patterns: The Foundation

Before comparing tools, let's understand the fundamental patterns:

1-to-1 (Queue Pattern)

Message consumed by single consumer
Use cases: Task processing, work distribution
Tools: SQS, Azure Service Bus Queues, Cloud Tasks

1-to-Many (Topic/Fan-out Pattern)

Message delivered to multiple subscribers
Use cases: Event broadcasting, notifications
Tools: SNS, Azure Service Bus Topics, Cloud Pub/Sub

Many-to-Many (Event Mesh)

Complex routing between multiple producers/consumers
Use cases: Microservices communication
Tools: EventBridge, Azure Event Grid, Eventarc

The Complete Tool Landscape

Simple Queue Services

AWS SQS (Simple Queue Service)

What it excels at: Dead-simple queue operations, serverless integration, automatic scaling

Real production config that works:

typescript

// SQS with proper error handling and DLQconst params = {  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123/my-queue',  ReceiveMessageWaitTimeSeconds: 20,  // Long polling  MaxNumberOfMessages: 10,  VisibilityTimeout: 30,  // Processing window  MessageAttributeNames: ['All']};
// DLQ configurationconst dlqParams = {  QueueName: 'my-queue-dlq',  Attributes: {    MessageRetentionPeriod: '1209600',  // 14 days    RedrivePolicy: JSON.stringify({      deadLetterTargetArn: dlqArn,      maxReceiveCount: 3  // Retry 3 times before DLQ    })  }};

// SQS with proper error handling and DLQconst params = {  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123/my-queue',  ReceiveMessageWaitTimeSeconds: 20,  // Long polling  MaxNumberOfMessages: 10,  VisibilityTimeout: 30,  // Processing window  MessageAttributeNames: ['All']};
// DLQ configurationconst dlqParams = {  QueueName: 'my-queue-dlq',  Attributes: {    MessageRetentionPeriod: '1209600',  // 14 days    RedrivePolicy: JSON.stringify({      deadLetterTargetArn: dlqArn,      maxReceiveCount: 3  // Retry 3 times before DLQ    })  }};

Delivery guarantees:

Standard Queue: At-least-once (possible duplicates)
FIFO Queue: Exactly-once processing
Message ordering: FIFO only
Max message size: 1MB (upgraded from 256KB in Aug 2025)

Note: This 4x increase in message size limit benefits AI, IoT, and complex application integration workloads that require larger data exchanges. AWS Lambda's event source mapping has also been updated to support the new 1MB payloads.

When SQS shines:

Decoupling microservices
Batch job processing
Serverless architectures (Lambda triggers)
Simple task queues

Azure Service Bus Queues

Azure's equivalent to SQS with enterprise features:

csharp

// Service Bus with sessions and DLQ handlingvar client = new ServiceBusClient(connectionString);var processor = client.CreateProcessor(queueName, new ServiceBusProcessorOptions{    MaxConcurrentCalls = 10,    AutoCompleteMessages = false,    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(5),    SubQueue = SubQueue.DeadLetter  // Access DLQ});
// Message with duplicate detectionvar message = new ServiceBusMessage(body){    MessageId = Guid.NewGuid().ToString(),  // For deduplication    SessionId = sessionId,  // For ordered processing    TimeToLive = TimeSpan.FromMinutes(5)};

// Service Bus with sessions and DLQ handlingvar client = new ServiceBusClient(connectionString);var processor = client.CreateProcessor(queueName, new ServiceBusProcessorOptions{    MaxConcurrentCalls = 10,    AutoCompleteMessages = false,    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(5),    SubQueue = SubQueue.DeadLetter  // Access DLQ});
// Message with duplicate detectionvar message = new ServiceBusMessage(body){    MessageId = Guid.NewGuid().ToString(),  // For deduplication    SessionId = sessionId,  // For ordered processing    TimeToLive = TimeSpan.FromMinutes(5)};

Key differences from SQS:

Built-in sessions for ordered processing
Duplicate detection (configurable window)
Scheduled messages
Message size: 256KB (standard), 100MB (premium)

Google Cloud Tasks

GCP's task queue with HTTP target integration:

typescript

import { CloudTasksClient } from '@google-cloud/tasks';
const client = new CloudTasksClient();const parent = client.queuePath(project, location, queue);
const task = {    httpRequest: {        httpMethod: 'POST',        url: 'https://example.com/process',        headers: { 'Content-Type': 'application/json' },        body: Buffer.from(JSON.stringify(payload))    },    scheduleTime: { seconds: Math.floor(timestamp / 1000) } // Delayed execution};
const response = await client.createTask({ parent, task });

import { CloudTasksClient } from '@google-cloud/tasks';
const client = new CloudTasksClient();const parent = client.queuePath(project, location, queue);
const task = {    httpRequest: {        httpMethod: 'POST',        url: 'https://example.com/process',        headers: { 'Content-Type': 'application/json' },        body: Buffer.from(JSON.stringify(payload))    },    scheduleTime: { seconds: Math.floor(timestamp / 1000) } // Delayed execution};
const response = await client.createTask({ parent, task });

Pub/Sub Systems

1-to-many message distribution:

typescript

// SNS with filter policies for smart routingconst publishParams = {  TopicArn: 'arn:aws:sns:us-east-1:123:my-topic',  Message: JSON.stringify(event),  MessageAttributes: {    eventType: { DataType: 'String', StringValue: 'ORDER_CREATED' },    priority: { DataType: 'Number', StringValue: '1' }  }};
// Subscription with filterconst subscriptionPolicy = {  eventType: ['ORDER_CREATED', 'ORDER_UPDATED'],  priority: [{ numeric: ['>', 0] }]};

// SNS with filter policies for smart routingconst publishParams = {  TopicArn: 'arn:aws:sns:us-east-1:123:my-topic',  Message: JSON.stringify(event),  MessageAttributes: {    eventType: { DataType: 'String', StringValue: 'ORDER_CREATED' },    priority: { DataType: 'Number', StringValue: '1' }  }};
// Subscription with filterconst subscriptionPolicy = {  eventType: ['ORDER_CREATED', 'ORDER_UPDATED'],  priority: [{ numeric: ['>', 0] }]};

SNS + SQS Pattern (Fanout):

Delivery guarantees:

At-least-once delivery
No message ordering
Retry with exponential backoff
DLQ support for failed deliveries

Azure Service Bus Topics

More sophisticated than SNS:

csharp

// Topic with multiple subscriptions and filtersvar adminClient = new ServiceBusAdministrationClient(connectionString);
// Create subscription with SQL filterawait adminClient.CreateSubscriptionAsync(    new CreateSubscriptionOptions(topicName, subscriptionName),    new CreateRuleOptions("OrderFilter",        new SqlRuleFilter("EventType = 'OrderCreated' AND Priority > 5")));

// Topic with multiple subscriptions and filtersvar adminClient = new ServiceBusAdministrationClient(connectionString);
// Create subscription with SQL filterawait adminClient.CreateSubscriptionAsync(    new CreateSubscriptionOptions(topicName, subscriptionName),    new CreateRuleOptions("OrderFilter",        new SqlRuleFilter("EventType = 'OrderCreated' AND Priority > 5")));

Advanced features:

SQL-like filtering rules
Message sessions for ordering
Duplicate detection
Dead-lettering with reason tracking

Google Cloud Pub/Sub

Global message distribution:

typescript

import { PubSub } from '@google-cloud/pubsub';
const pubsub = new PubSub();const topic = pubsub.topic(topicId);
// Publishing with ordering keyconst messageId = await topic.publishMessage({    data: Buffer.from(data),    orderingKey: 'user-123', // Ensures order per key    attributes: {        event_type: 'user_updated',        version: '2'    }});

import { PubSub } from '@google-cloud/pubsub';
const pubsub = new PubSub();const topic = pubsub.topic(topicId);
// Publishing with ordering keyconst messageId = await topic.publishMessage({    data: Buffer.from(data),    orderingKey: 'user-123', // Ensures order per key    attributes: {        event_type: 'user_updated',        version: '2'    }});

Event Routing Services

AWS EventBridge

Rule-based event routing:

typescript

// EventBridge with content-based routingconst rule = {  Name: 'OrderProcessingRule',  EventPattern: JSON.stringify({    source: ['order.service'],    'detail-type': ['Order Created'],    detail: {      amount: [{ numeric: ['>', 100] }],      country: ['US', 'UK', 'DE']    }  }),  Targets: [    {      Arn: lambdaArn,      RetryPolicy: {        MaximumRetryAttempts: 2,        MaximumEventAge: 3600      },      DeadLetterConfig: {        Arn: dlqArn      }    }  ]};

// EventBridge with content-based routingconst rule = {  Name: 'OrderProcessingRule',  EventPattern: JSON.stringify({    source: ['order.service'],    'detail-type': ['Order Created'],    detail: {      amount: [{ numeric: ['>', 100] }],      country: ['US', 'UK', 'DE']    }  }),  Targets: [    {      Arn: lambdaArn,      RetryPolicy: {        MaximumRetryAttempts: 2,        MaximumEventAge: 3600      },      DeadLetterConfig: {        Arn: dlqArn      }    }  ]};

Cross-account event sharing:

typescript

// EventBridge event bus sharingconst eventBusPolicy = {  StatementId: 'AllowAccountAccess',  Action: 'events:PutEvents',  Principal: '987654321098',  // Target account  Condition: {    StringEquals: {      'events:detail-type': 'Order Created'    }  }};

// EventBridge event bus sharingconst eventBusPolicy = {  StatementId: 'AllowAccountAccess',  Action: 'events:PutEvents',  Principal: '987654321098',  // Target account  Condition: {    StringEquals: {      'events:detail-type': 'Order Created'    }  }};

Azure Event Grid

Azure's equivalent with powerful filtering:

json

{  "filter": {    "includedEventTypes": ["Microsoft.Storage.BlobCreated"],    "subjectBeginsWith": "/blobServices/default/containers/images/",    "advancedFilters": [      {        "operatorType": "NumberGreaterThan",        "key": "data.contentLength",        "value": 1048576      }    ]  }}

{  "filter": {    "includedEventTypes": ["Microsoft.Storage.BlobCreated"],    "subjectBeginsWith": "/blobServices/default/containers/images/",    "advancedFilters": [      {        "operatorType": "NumberGreaterThan",        "key": "data.contentLength",        "value": 1048576      }    ]  }}

Google Cloud Eventarc

GCP's unified eventing:

yaml

# Eventarc trigger configurationapiVersion: eventarc.cnrm.cloud.google.com/v1beta1kind: EventarcTriggermetadata:  name: storage-triggerspec:  location: us-central1  matchingCriteria:  - attribute: type    value: google.cloud.storage.object.v1.finalized  - attribute: bucket    value: my-bucket  destination:    cloudRunService:      name: process-image      region: us-central1

# Eventarc trigger configurationapiVersion: eventarc.cnrm.cloud.google.com/v1beta1kind: EventarcTriggermetadata:  name: storage-triggerspec:  location: us-central1  matchingCriteria:  - attribute: type    value: google.cloud.storage.object.v1.finalized  - attribute: bucket    value: my-bucket  destination:    cloudRunService:      name: process-image      region: us-central1

Stream Processing Platforms

Apache Kafka

The heavyweight champion of event streaming:

java

// Kafka Streams for real-time processingProperties props = new Properties();props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-processor");props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, "exactly_once_v2");props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
KStream<String, Order> orders = builder.stream("orders");KTable<String, Long> orderCounts = orders    .filter((k, v) -> v.getAmount() > 100)    .groupByKey()    .count(Materialized.as("order-counts-store"));
// DLQ handling with Kafka Streamsorders.foreach((key, value) -> {    try {        processOrder(value);    } catch (Exception e) {        producer.send(new ProducerRecord<>("orders-dlq", key, value));    }});

// Kafka Streams for real-time processingProperties props = new Properties();props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-processor");props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, "exactly_once_v2");props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
KStream<String, Order> orders = builder.stream("orders");KTable<String, Long> orderCounts = orders    .filter((k, v) -> v.getAmount() > 100)    .groupByKey()    .count(Materialized.as("order-counts-store"));
// DLQ handling with Kafka Streamsorders.foreach((key, value) -> {    try {        processOrder(value);    } catch (Exception e) {        producer.send(new ProducerRecord<>("orders-dlq", key, value));    }});

Kafka delivery semantics:

At-most-once: Fire and forget (acks=0)
At-least-once: Default (acks=1 or all)
Exactly-once: With transactions (enable.idempotence=true)

Cloud Streaming Equivalents

AWS Kinesis Data Streams

typescript

// Kinesis with KCL for processingconst kinesisClient = new AWS.Kinesis();
// Enhanced fan-out for low latencyconst consumer = await kinesisClient.registerStreamConsumer({  StreamARN: streamArn,  ConsumerName: 'low-latency-consumer'}).promise();
// Kinesis Data Analytics for SQL processingconst sqlQuery = `CREATE STREAM orders_above_100 ASSELECT * FROM SOURCE_SQL_STREAM_001WHERE orderAmount > 100;`;

// Kinesis with KCL for processingconst kinesisClient = new AWS.Kinesis();
// Enhanced fan-out for low latencyconst consumer = await kinesisClient.registerStreamConsumer({  StreamARN: streamArn,  ConsumerName: 'low-latency-consumer'}).promise();
// Kinesis Data Analytics for SQL processingconst sqlQuery = `CREATE STREAM orders_above_100 ASSELECT * FROM SOURCE_SQL_STREAM_001WHERE orderAmount > 100;`;

Azure Event Hubs

csharp

// Event Hubs with Kafka protocolvar config = new ConsumerConfig{    BootstrapServers = "namespace.servicebus.windows.net:9093",    SecurityProtocol = SecurityProtocol.SaslSsl,    SaslMechanism = SaslMechanism.Plain,    GroupId = "consumer-group"};
// Capture to Data Lake for long-term storagevar captureDescription = new CaptureDescription{    Enabled = true,    IntervalInSeconds = 300,    SizeLimitInBytes = 314572800,    Destination = new Destination    {        StorageAccountResourceId = "/subscriptions/.../storageAccounts/...",        BlobContainer = "capture"    }};

// Event Hubs with Kafka protocolvar config = new ConsumerConfig{    BootstrapServers = "namespace.servicebus.windows.net:9093",    SecurityProtocol = SecurityProtocol.SaslSsl,    SaslMechanism = SaslMechanism.Plain,    GroupId = "consumer-group"};
// Capture to Data Lake for long-term storagevar captureDescription = new CaptureDescription{    Enabled = true,    IntervalInSeconds = 300,    SizeLimitInBytes = 314572800,    Destination = new Destination    {        StorageAccountResourceId = "/subscriptions/.../storageAccounts/...",        BlobContainer = "capture"    }};

Google Cloud Dataflow

typescript

// Dataflow for stream processingimport { Pipeline } from '@google-cloud/dataflow';
const pipeline = new Pipeline(options);
pipeline    .readFromPubSub({ topic })    .window({ type: 'fixed', duration: 60 })    .map((data: string) => JSON.parse(data))    .filter((item: any) => item.amount > 100)    .writeToBigQuery({ tableSpec });

// Dataflow for stream processingimport { Pipeline } from '@google-cloud/dataflow';
const pipeline = new Pipeline(options);
pipeline    .readFromPubSub({ topic })    .window({ type: 'fixed', duration: 60 })    .map((data: string) => JSON.parse(data))    .filter((item: any) => item.amount > 100)    .writeToBigQuery({ tableSpec });

Dead Letter Queue (DLQ) Essentials

Dead Letter Queues are critical for production resilience. They handle messages that can't be processed successfully after retries.

Key DLQ concepts:

Safety net for failed messages
Prevents poison pill scenarios
Enables error analysis and recovery
Essential monitoring beyond queue depth

Basic DLQ pattern:

typescript

// Simple DLQ implementationconst dlqParams = {  QueueName: 'my-queue-dlq',  Attributes: {    MessageRetentionPeriod: '1209600',  // 14 days    RedrivePolicy: JSON.stringify({      deadLetterTargetArn: dlqArn,      maxReceiveCount: 3  // Retry 3 times before DLQ    })  }};

// Simple DLQ implementationconst dlqParams = {  QueueName: 'my-queue-dlq',  Attributes: {    MessageRetentionPeriod: '1209600',  // 14 days    RedrivePolicy: JSON.stringify({      deadLetterTargetArn: dlqArn,      maxReceiveCount: 3  // Retry 3 times before DLQ    })  }};

Deep Dive: For comprehensive DLQ strategies, monitoring patterns, circuit breakers, ML-based recovery, and production lessons, see our detailed guide: Dead Letter Queue Production Strategies

Edge and Hybrid Deployments

Edge Computing Considerations

Event-driven systems at the edge have unique constraints:

typescript

// Edge-optimized event processingclass EdgeEventProcessor {  private localQueue: Queue[] = [];  private cloudBuffer: Message[] = [];
  async processEvent(event: Event) {    // Process locally first    const processed = await this.localProcess(event);
    // Batch for cloud sync    if (this.shouldSyncToCloud(processed)) {      this.cloudBuffer.push(processed);
      if (this.cloudBuffer.length >= 100 ||          Date.now() - this.lastSync > 60000) {        await this.syncToCloud();      }    }  }
  private async syncToCloud() {    try {      // Compress and batch send      const compressed = this.compress(this.cloudBuffer);      await this.cloudClient.sendBatch(compressed);      this.cloudBuffer = [];      this.lastSync = Date.now();    } catch (error) {      // Store locally if cloud unreachable      await this.localStorage.store(this.cloudBuffer);    }  }}

// Edge-optimized event processingclass EdgeEventProcessor {  private localQueue: Queue[] = [];  private cloudBuffer: Message[] = [];
  async processEvent(event: Event) {    // Process locally first    const processed = await this.localProcess(event);
    // Batch for cloud sync    if (this.shouldSyncToCloud(processed)) {      this.cloudBuffer.push(processed);
      if (this.cloudBuffer.length >= 100 ||          Date.now() - this.lastSync > 60000) {        await this.syncToCloud();      }    }  }
  private async syncToCloud() {    try {      // Compress and batch send      const compressed = this.compress(this.cloudBuffer);      await this.cloudClient.sendBatch(compressed);      this.cloudBuffer = [];      this.lastSync = Date.now();    } catch (error) {      // Store locally if cloud unreachable      await this.localStorage.store(this.cloudBuffer);    }  }}

Cloudflare Workers with Queues

typescript

// Cloudflare Workers Queue Handlerexport default {  async queue(batch: MessageBatch, env: Env): Promise<void> {    for (const message of batch.messages) {      try {        // Process at edge        const result = await processMessage(message.body);
        // Store in Durable Objects or KV        await env.KV.put(          `processed:${message.id}`,          JSON.stringify(result),          { expirationTtl: 3600 }        );
        message.ack();      } catch (error) {        // Retry with backoff        message.retry({ delaySeconds: 30 });      }    }  }};

// Cloudflare Workers Queue Handlerexport default {  async queue(batch: MessageBatch, env: Env): Promise<void> {    for (const message of batch.messages) {      try {        // Process at edge        const result = await processMessage(message.body);
        // Store in Durable Objects or KV        await env.KV.put(          `processed:${message.id}`,          JSON.stringify(result),          { expirationTtl: 3600 }        );
        message.ack();      } catch (error) {        // Retry with backoff        message.retry({ delaySeconds: 30 });      }    }  }};

AWS IoT Core for Edge Events

typescript

// AWS IoT Core with Greengrassimport { GreengrassCoreIPCClient } from 'aws-iot-device-sdk-v2';import { PublishToTopicRequest, PublishMessage, BinaryMessage } from 'aws-iot-device-sdk-v2';
class EdgeIoTProcessor {    private ipcClient: GreengrassCoreIPCClient;
    constructor() {        this.ipcClient = new GreengrassCoreIPCClient();    }
    async publishEdgeEvent(event: any): Promise<void> {        // Local processing        const processed = this.processLocally(event);
        // Publish to IoT Core        const request: PublishToTopicRequest = {            topic: `edge/${this.deviceId}/events`,            publishMessage: {                binaryMessage: {                    message: Buffer.from(JSON.stringify(processed))                }            }        };
        await this.ipcClient.publishToTopic(request);    }}

// AWS IoT Core with Greengrassimport { GreengrassCoreIPCClient } from 'aws-iot-device-sdk-v2';import { PublishToTopicRequest, PublishMessage, BinaryMessage } from 'aws-iot-device-sdk-v2';
class EdgeIoTProcessor {    private ipcClient: GreengrassCoreIPCClient;
    constructor() {        this.ipcClient = new GreengrassCoreIPCClient();    }
    async publishEdgeEvent(event: any): Promise<void> {        // Local processing        const processed = this.processLocally(event);
        // Publish to IoT Core        const request: PublishToTopicRequest = {            topic: `edge/${this.deviceId}/events`,            publishMessage: {                binaryMessage: {                    message: Buffer.from(JSON.stringify(processed))                }            }        };
        await this.ipcClient.publishToTopic(request);    }}

Cross-Cloud Equivalents

Service Mapping Table

AWS	Azure	GCP	Use Case
SQS	Service Bus Queues	Cloud Tasks	Simple queuing
SNS	Service Bus Topics	Cloud Pub/Sub	Pub/Sub messaging
EventBridge	Event Grid	Eventarc	Event routing
Kinesis	Event Hubs	Pub/Sub + Dataflow	Stream processing
Lambda + SQS	Functions + Service Bus	Cloud Run + Pub/Sub	Serverless events
DynamoDB Streams	Cosmos DB Change Feed	Firestore Triggers	Database events
Step Functions	Logic Apps	Workflows	Event orchestration
MSK (Kafka)	Event Hubs (Kafka mode)	Confluent Cloud	Kafka-compatible

Multi-Cloud Event Bridge Pattern

typescript

// Abstract multi-cloud event interfaceinterface CloudEventAdapter {  publish(event: CloudEvent): Promise<void>;  subscribe(handler: EventHandler): Promise<void>;}
class MultiCloudEventBridge {  private adapters: Map<string, CloudEventAdapter> = new Map();
  constructor() {    this.adapters.set('aws', new AWSEventBridgeAdapter());    this.adapters.set('azure', new AzureEventGridAdapter());    this.adapters.set('gcp', new GCPEventarcAdapter());  }
  async publishToAll(event: CloudEvent) {    const promises = Array.from(this.adapters.values())      .map(adapter => adapter.publish(event));
    const results = await Promise.allSettled(promises);
    // Handle partial failures    const failures = results.filter(r => r.status === 'rejected');    if (failures.length > 0) {      await this.handleFailures(failures, event);    }  }}

// Abstract multi-cloud event interfaceinterface CloudEventAdapter {  publish(event: CloudEvent): Promise<void>;  subscribe(handler: EventHandler): Promise<void>;}
class MultiCloudEventBridge {  private adapters: Map<string, CloudEventAdapter> = new Map();
  constructor() {    this.adapters.set('aws', new AWSEventBridgeAdapter());    this.adapters.set('azure', new AzureEventGridAdapter());    this.adapters.set('gcp', new GCPEventarcAdapter());  }
  async publishToAll(event: CloudEvent) {    const promises = Array.from(this.adapters.values())      .map(adapter => adapter.publish(event));
    const results = await Promise.allSettled(promises);
    // Handle partial failures    const failures = results.filter(r => r.status === 'rejected');    if (failures.length > 0) {      await this.handleFailures(failures, event);    }  }}

Performance Comparison Matrix

Tool	Throughput	Latency	Message Size	Ordering	Delivery Guarantee	DLQ Support
SQS Standard	3K/sec batch	10-100ms	1MB	No	At-least-once	Yes
SQS FIFO	300/sec	10-100ms	1MB	Yes	Exactly-once	Yes
SNS	100K/sec	100-500ms	256KB	No	At-least-once	Yes
Kafka	1M+/sec	<10ms	1MB default	Per partition	Configurable	Manual
RabbitMQ	50K/sec	1-5ms	128MB	Optional	At-least-once	Yes
EventBridge	10K/sec	500ms-2s	256KB	No	At-least-once	Yes
Kinesis	1MB/sec/shard	200ms	1MB	Per shard	At-least-once	Manual
Azure Service Bus	2K/sec	10-50ms	256KB/100MB	Yes	At-least-once	Yes
Cloud Pub/Sub	100MB/sec	100ms	10MB	Per key	At-least-once	Yes
Redis Streams	100K/sec	<1ms	512MB	Yes	At-least-once	Manual

Decision Framework

Quick Decision Tree

When to Use What

Use Simple Queues (SQS/Service Bus) when:

Decoupling services
Work distribution
Simple retry requirements
Serverless processing

Use Pub/Sub (SNS/Topics) when:

Broadcasting events
Fan-out patterns
Multiple consumers
Notification systems

Use Event Routers (EventBridge/EventGrid) when:

Complex routing rules
Multi-service orchestration
SaaS integrations
Event-driven automation

Use Streaming (Kafka/Kinesis) when:

Real-time analytics
Event sourcing
High throughput (>100K/sec)
Event replay needed

Common Pitfalls and Solutions

Pitfall 1: Message Size Limits

typescript

// Solution: Claim check patternclass LargeMessageHandler {  async send(largePayload: any) {    if (JSON.stringify(largePayload).length > 256000) {      // Store in S3      const s3Key = await this.uploadToS3(largePayload);
      // Send reference      return this.queue.send({        type: 'large_message',        s3Key,        size: largePayload.length      });    }
    return this.queue.send(largePayload);  }}

// Solution: Claim check patternclass LargeMessageHandler {  async send(largePayload: any) {    if (JSON.stringify(largePayload).length > 256000) {      // Store in S3      const s3Key = await this.uploadToS3(largePayload);
      // Send reference      return this.queue.send({        type: 'large_message',        s3Key,        size: largePayload.length      });    }
    return this.queue.send(largePayload);  }}

Pitfall 2: Poison Messages

typescript

// Solution: Poison message detectionclass PoisonMessageDetector {  private messageAttempts = new Map<string, number>();
  async process(message: Message) {    const messageId = message.id;    const attempts = this.messageAttempts.get(messageId) || 0;
    if (attempts >= 3) {      // Identified as poison message      await this.quarantine(message);      return;    }
    try {      await this.processMessage(message);      this.messageAttempts.delete(messageId);    } catch (error) {      this.messageAttempts.set(messageId, attempts + 1);
      // Check if specific error pattern      if (this.isPoisonPattern(error)) {        await this.quarantine(message);      } else {        throw error; // Retry      }    }  }}

// Solution: Poison message detectionclass PoisonMessageDetector {  private messageAttempts = new Map<string, number>();
  async process(message: Message) {    const messageId = message.id;    const attempts = this.messageAttempts.get(messageId) || 0;
    if (attempts >= 3) {      // Identified as poison message      await this.quarantine(message);      return;    }
    try {      await this.processMessage(message);      this.messageAttempts.delete(messageId);    } catch (error) {      this.messageAttempts.set(messageId, attempts + 1);
      // Check if specific error pattern      if (this.isPoisonPattern(error)) {        await this.quarantine(message);      } else {        throw error; // Retry      }    }  }}

Pitfall 3: Ordering Guarantees

typescript

// Solution: Partition key strategyclass OrderedEventProcessor {  async publishOrdered(events: Event[]) {    // Group by entity ID for ordering    const grouped = this.groupBy(events, e => e.entityId);
    for (const [entityId, entityEvents] of grouped) {      // Sort by timestamp      entityEvents.sort((a, b) => a.timestamp - b.timestamp);
      // Send with same partition key      for (const event of entityEvents) {        await this.kafka.send({          topic: 'events',          key: entityId,  // Ensures ordering          value: event        });      }    }  }}

// Solution: Partition key strategyclass OrderedEventProcessor {  async publishOrdered(events: Event[]) {    // Group by entity ID for ordering    const grouped = this.groupBy(events, e => e.entityId);
    for (const [entityId, entityEvents] of grouped) {      // Sort by timestamp      entityEvents.sort((a, b) => a.timestamp - b.timestamp);
      // Send with same partition key      for (const event of entityEvents) {        await this.kafka.send({          topic: 'events',          key: entityId,  // Ensures ordering          value: event        });      }    }  }}

Monitoring and Observability

Key Metrics to Track

typescript

// Comprehensive metrics collectionclass EventMetrics {  private metrics = {    messagesPublished: new Counter('messages_published_total'),    messagesConsumed: new Counter('messages_consumed_total'),    messagesFailed: new Counter('messages_failed_total'),    processingDuration: new Histogram('message_processing_duration_seconds'),    queueDepth: new Gauge('queue_depth'),    consumerLag: new Gauge('consumer_lag'),    dlqDepth: new Gauge('dlq_depth')  };
  async recordProcessing(message: Message, processor: Function) {    const timer = this.metrics.processingDuration.startTimer();
    try {      const result = await processor(message);      this.metrics.messagesConsumed.inc();      return result;    } catch (error) {      this.metrics.messagesFailed.inc({        error_type: error.constructor.name,        queue: message.source      });      throw error;    } finally {      timer();    }  }}

// Comprehensive metrics collectionclass EventMetrics {  private metrics = {    messagesPublished: new Counter('messages_published_total'),    messagesConsumed: new Counter('messages_consumed_total'),    messagesFailed: new Counter('messages_failed_total'),    processingDuration: new Histogram('message_processing_duration_seconds'),    queueDepth: new Gauge('queue_depth'),    consumerLag: new Gauge('consumer_lag'),    dlqDepth: new Gauge('dlq_depth')  };
  async recordProcessing(message: Message, processor: Function) {    const timer = this.metrics.processingDuration.startTimer();
    try {      const result = await processor(message);      this.metrics.messagesConsumed.inc();      return result;    } catch (error) {      this.metrics.messagesFailed.inc({        error_type: error.constructor.name,        queue: message.source      });      throw error;    } finally {      timer();    }  }}

Conclusion

The event-driven landscape is vast, but the key is understanding:

Message patterns determine tool choice
Delivery guarantees affect architecture
DLQ strategies separate production systems from toys
Cloud equivalents exist for most patterns
Edge requirements need special consideration

Start simple, measure everything, and evolve based on actual requirements rather than anticipated ones. Most importantly, design for failure - because messages will fail, services will go down, and poison messages will appear.

The best architecture is one that can evolve with your needs while maintaining reliability and observability.

Related Deep Dives:

Dead Letter Queue Production Strategies - Comprehensive DLQ patterns and monitoring

Message Patterns: The Foundation#

1-to-1 (Queue Pattern)#

1-to-Many (Topic/Fan-out Pattern)#

Many-to-Many (Event Mesh)#

The Complete Tool Landscape#

Simple Queue Services#

AWS SQS (Simple Queue Service)#

Azure Service Bus Queues#

Google Cloud Tasks#

Pub/Sub Systems#

AWS SNS (Simple Notification Service)#

Azure Service Bus Topics#

Google Cloud Pub/Sub#

Event Routing Services#

AWS EventBridge#

Azure Event Grid#

Google Cloud Eventarc#

Stream Processing Platforms#

Apache Kafka#

Cloud Streaming Equivalents#

AWS Kinesis Data Streams#

Azure Event Hubs#

Google Cloud Dataflow#

Dead Letter Queue (DLQ) Essentials#

Edge and Hybrid Deployments#

Edge Computing Considerations#

Cloudflare Workers with Queues#

AWS IoT Core for Edge Events#

Cross-Cloud Equivalents#

Service Mapping Table#

Multi-Cloud Event Bridge Pattern#

Performance Comparison Matrix#

Decision Framework#

Quick Decision Tree#

When to Use What#

Common Pitfalls and Solutions#

Pitfall 1: Message Size Limits#

Pitfall 2: Poison Messages#

Pitfall 3: Ordering Guarantees#

Monitoring and Observability#

Key Metrics to Track#

Conclusion#

Related Posts

Message Patterns: The Foundation

1-to-1 (Queue Pattern)

1-to-Many (Topic/Fan-out Pattern)

Many-to-Many (Event Mesh)

The Complete Tool Landscape

Simple Queue Services

AWS SQS (Simple Queue Service)

Azure Service Bus Queues

Google Cloud Tasks

Pub/Sub Systems

AWS SNS (Simple Notification Service)

Azure Service Bus Topics

Google Cloud Pub/Sub

Event Routing Services

AWS EventBridge

Azure Event Grid

Google Cloud Eventarc

Stream Processing Platforms

Apache Kafka

Cloud Streaming Equivalents

AWS Kinesis Data Streams

Azure Event Hubs

Google Cloud Dataflow

Dead Letter Queue (DLQ) Essentials

Edge and Hybrid Deployments

Edge Computing Considerations

Cloudflare Workers with Queues

AWS IoT Core for Edge Events

Cross-Cloud Equivalents

Service Mapping Table

Multi-Cloud Event Bridge Pattern

Performance Comparison Matrix

Decision Framework

Quick Decision Tree

When to Use What

Common Pitfalls and Solutions

Pitfall 1: Message Size Limits

Pitfall 2: Poison Messages

Pitfall 3: Ordering Guarantees

Monitoring and Observability

Key Metrics to Track

Conclusion