Real-time Notifications and Multi-Channel Delivery: WebSockets, Push, Email, and Beyond
Implementation strategies for real-time notification delivery across WebSocket, push notification, email, SMS, and webhook channels with production-tested patterns
Remember that moment when your product manager asked for "real-time notifications" and you thought "how hard could it be?" Six months later, you're debugging why push notifications work on iOS but not Android, dealing with WebSocket connection storms, and explaining to leadership why the email vendor bill tripled overnight.
After implementing multi-channel notification systems that deliver millions of messages daily, I've learned that the real challenge isn't sending notifications - it's doing it reliably, at scale, across different delivery mechanisms that each have their own quirks, limitations, and failure modes.
Let me share the battle-tested patterns for WebSocket connections, push notifications, email delivery, SMS, and webhooks that have survived everything from DDoS attacks to vendor outages.
WebSocket Management: The Foundation of Real-Time#
WebSockets seem straightforward until you hit production with thousands of concurrent connections. Here's what I've learned about building WebSocket infrastructure that actually scales:
Connection Management That Works#
The biggest lesson: connections are ephemeral, but user state isn't. Here's the connection manager that's survived multiple product launches:
interface ConnectionMetadata {
userId: string;
deviceId?: string;
userAgent: string;
connectedAt: Date;
lastPing: Date;
subscriptions: Set<string>;
metadata: Record<string, any>;
}
class WebSocketConnectionManager {
private connections: Map<string, {
socket: WebSocket;
metadata: ConnectionMetadata;
}> = new Map();
private userConnections: Map<string, Set<string>> = new Map();
private redis: Redis;
private heartbeatInterval: NodeJS.Timeout;
constructor(redis: Redis) {
this.redis = redis;
this.startHeartbeat();
}
async handleConnection(socket: WebSocket, request: IncomingMessage): Promise<void> {
const connectionId = this.generateConnectionId();
try {
// Extract user info from JWT token or session
const userInfo = await this.authenticateConnection(request);
if (!userInfo) {
socket.close(1008, 'Authentication required');
return;
}
const metadata: ConnectionMetadata = {
userId: userInfo.userId,
deviceId: userInfo.deviceId,
userAgent: request.headers['user-agent'] || '',
connectedAt: new Date(),
lastPing: new Date(),
subscriptions: new Set(),
metadata: {}
};
// Store connection
this.connections.set(connectionId, { socket, metadata });
// Update user connection mapping
if (!this.userConnections.has(userInfo.userId)) {
this.userConnections.set(userInfo.userId, new Set());
}
this.userConnections.get(userInfo.userId)!.add(connectionId);
// Store connection info in Redis for multi-instance support
await this.redis.hset(
`ws:connections:${userInfo.userId}`,
connectionId,
JSON.stringify({
serverId: process.env.SERVER_ID,
connectedAt: metadata.connectedAt,
deviceId: metadata.deviceId
})
);
// Setup event handlers
this.setupConnectionHandlers(connectionId, socket, metadata);
// Send connection acknowledgment
await this.sendMessage(connectionId, {
type: 'connection_ack',
data: { connectionId, timestamp: new Date() }
});
console.log(`WebSocket connection established for user ${userInfo.userId}`);
} catch (error) {
console.error('WebSocket connection setup failed:', error);
socket.close(1011, 'Internal server error');
}
}
private setupConnectionHandlers(
connectionId: string,
socket: WebSocket,
metadata: ConnectionMetadata
): void {
socket.on('message', async (data) => {
try {
const message = JSON.parse(data.toString());
await this.handleMessage(connectionId, message);
} catch (error) {
console.error('Message handling error:', error);
await this.sendError(connectionId, 'Invalid message format');
}
});
socket.on('pong', () => {
metadata.lastPing = new Date();
});
socket.on('close', async (code, reason) => {
await this.handleDisconnection(connectionId, code, reason);
});
socket.on('error', async (error) => {
console.error(`WebSocket error for ${connectionId}:`, error);
await this.handleDisconnection(connectionId, 1011, 'Connection error');
});
}
async sendNotificationToUser(userId: string, notification: NotificationEvent): Promise<void> {
const userConnectionIds = this.userConnections.get(userId) || new Set();
if (userConnectionIds.size === 0) {
// User not connected to this server instance
// Check Redis for other server instances
const remoteConnections = await this.redis.hgetall(`ws:connections:${userId}`);
if (Object.keys(remoteConnections).length > 0) {
// User connected to another server instance
await this.redis.publish('ws:notification', JSON.stringify({
userId,
notification,
targetServerId: null // broadcast to all servers
}));
}
return;
}
// Send to local connections
const sendPromises = Array.from(userConnectionIds).map(async (connectionId) => {
try {
await this.sendMessage(connectionId, {
type: 'notification',
data: notification
});
return { connectionId, success: true };
} catch (error) {
console.error(`Failed to send notification to ${connectionId}:`, error);
return { connectionId, success: false, error };
}
});
const results = await Promise.allSettled(sendPromises);
// Clean up failed connections
results.forEach((result, index) => {
if (result.status === 'rejected' ||
(result.status === 'fulfilled' && !result.value.success)) {
const connectionId = Array.from(userConnectionIds)[index];
this.handleDisconnection(connectionId, 1011, 'Send failed');
}
});
}
}
WebSocket Scaling Patterns#
The hard lesson about WebSockets: they don't scale the same way REST APIs do. Here's the multi-instance coordination pattern that's worked across different deployment scenarios:
class WebSocketCluster {
constructor(
private connectionManager: WebSocketConnectionManager,
private redis: Redis
) {
this.setupClusterCommunication();
}
private setupClusterCommunication(): void {
// Listen for notifications that need to be delivered
this.redis.subscribe('ws:notification');
this.redis.subscribe('ws:broadcast');
this.redis.on('message', async (channel, message) => {
try {
const data = JSON.parse(message);
if (channel === 'ws:notification') {
await this.handleRemoteNotification(data);
} else if (channel === 'ws:broadcast') {
await this.handleBroadcast(data);
}
} catch (error) {
console.error('Cluster message handling error:', error);
}
});
}
private async handleRemoteNotification(data: {
userId: string;
notification: NotificationEvent;
targetServerId?: string;
}): Promise<void> {
// Only process if no target server specified or we're the target
if (data.targetServerId && data.targetServerId !== process.env.SERVER_ID) {
return;
}
await this.connectionManager.sendNotificationToUser(
data.userId,
data.notification
);
}
async broadcastSystemMessage(message: any): Promise<void> {
await this.redis.publish('ws:broadcast', JSON.stringify({
message,
senderId: process.env.SERVER_ID,
timestamp: new Date()
}));
}
}
Push Notifications: Mobile's Double-Edged Sword#
Push notifications look simple in documentation but become complex fast when you need to handle multiple platforms, user permissions, and delivery guarantees. Here's what production taught me:
Multi-Platform Push Service#
The key insight: treat iOS and Android as completely different beasts, even though they're both "push notifications":
interface PushProvider {
sendNotification(
tokens: string[],
payload: PushPayload,
options?: PushOptions
): Promise<PushResult[]>;
validateToken(token: string): Promise<boolean>;
getInvalidTokens(results: PushResult[]): string[];
}
interface PushPayload {
title: string;
body: string;
data?: Record<string, any>;
badge?: number;
sound?: string;
icon?: string;
image?: string;
}
class UnifiedPushService {
private providers: Map<PushPlatform, PushProvider> = new Map();
private tokenStore: TokenStore;
private analytics: PushAnalytics;
constructor() {
this.providers.set('ios', new APNSProvider());
this.providers.set('android', new FCMProvider());
this.providers.set('web', new WebPushProvider());
}
async sendPushNotification(
userId: string,
notification: NotificationEvent
): Promise<PushDeliveryResult> {
try {
// Get all push tokens for user
const userTokens = await this.tokenStore.getUserTokens(userId);
if (userTokens.length === 0) {
return {
success: false,
reason: 'no_tokens',
deliveries: []
};
}
// Group tokens by platform
const tokensByPlatform = this.groupTokensByPlatform(userTokens);
// Prepare platform-specific payloads
const payloads = await this.createPlatformPayloads(notification);
// Send to each platform
const deliveryPromises = Object.entries(tokensByPlatform).map(
([platform, tokens]) => this.sendToPlatform(
platform as PushPlatform,
tokens,
payloads[platform as PushPlatform],
notification
)
);
const results = await Promise.allSettled(deliveryPromises);
// Process results and clean up invalid tokens
const deliveries = await this.processDeliveryResults(results, userTokens);
// Track analytics
await this.analytics.trackPushDelivery(notification.id, deliveries);
return {
success: deliveries.some(d => d.success),
deliveries
};
} catch (error) {
console.error('Push notification delivery failed:', error);
return {
success: false,
reason: 'send_error',
error: error.message,
deliveries: []
};
}
}
private async sendToPlatform(
platform: PushPlatform,
tokens: PushToken[],
payload: PushPayload,
notification: NotificationEvent
): Promise<PlatformDeliveryResult> {
const provider = this.providers.get(platform);
if (!provider) {
throw new Error(`No provider for platform ${platform}`);
}
// Platform-specific options
const options: PushOptions = {
priority: this.mapPriorityToPlatform(notification.priority, platform),
ttl: notification.expiresAt ?
Math.floor((notification.expiresAt.getTime() - Date.now()) / 1000) :
3600, // 1 hour default
collapseKey: platform === 'android' ? notification.type : undefined,
apnsTopic: platform === 'ios' ? process.env.APNS_TOPIC : undefined
};
const tokenStrings = tokens.map(t => t.token);
const results = await provider.sendNotification(tokenStrings, payload, options);
// Clean up invalid tokens
const invalidTokens = provider.getInvalidTokens(results);
if (invalidTokens.length > 0) {
await this.tokenStore.markTokensInvalid(userId, invalidTokens);
}
return {
platform,
tokens: tokenStrings,
results,
invalidTokens
};
}
private async createPlatformPayloads(
notification: NotificationEvent
): Promise<Record<PushPlatform, PushPayload>> {
// Get localized content based on user preferences
const template = await this.templateService.getTemplate(
notification.type,
'push',
'en' // Should be user's locale
);
const rendered = await this.templateService.render(template, notification.data);
return {
ios: {
title: rendered.subject || '',
body: rendered.body,
data: {
notificationId: notification.id,
type: notification.type,
...notification.data
},
badge: await this.getBadgeCount(notification.userId),
sound: this.getSoundForNotificationType(notification.type)
},
android: {
title: rendered.subject || '',
body: rendered.body,
data: {
notificationId: notification.id,
type: notification.type,
...notification.data
},
icon: 'ic_notification',
// Android-specific styling
color: '#007AFF'
},
web: {
title: rendered.subject || '',
body: rendered.body,
data: notification.data,
icon: '/icons/notification-icon.png',
image: notification.data.imageUrl
}
};
}
}
Push Token Management#
Token management is where most push implementations fail in production. Tokens become invalid, users uninstall apps, and you need to handle this gracefully:
class PushTokenStore {
constructor(private db: Database, private redis: Redis) {}
async registerToken(
userId: string,
token: string,
platform: PushPlatform,
deviceId: string
): Promise<void> {
try {
// Validate token format
if (!this.isValidTokenFormat(token, platform)) {
throw new Error('Invalid token format');
}
// Check if token already exists for another user
const existingToken = await this.db.query(
'SELECT user_id FROM push_tokens WHERE token = $1',
[token]
);
if (existingToken.length > 0 && existingToken[0].user_id !== userId) {
// Token moved to new user, update it
await this.db.query(
'UPDATE push_tokens SET user_id = $1, updated_at = NOW() WHERE token = $2',
[userId, token]
);
} else {
// Insert or update token
await this.db.query(`
INSERT INTO push_tokens (user_id, token, platform, device_id, is_active, created_at, updated_at)
VALUES ($1, $2, $3, $4, true, NOW(), NOW())
ON CONFLICT (token)
DO UPDATE SET
user_id = $1,
is_active = true,
updated_at = NOW()
`, [userId, token, platform, deviceId]);
}
// Cache active tokens for quick lookup
await this.redis.sadd(`push_tokens:${userId}`, token);
console.log(`Registered push token for user ${userId} on ${platform}`);
} catch (error) {
console.error('Push token registration failed:', error);
throw error;
}
}
async markTokensInvalid(userId: string, tokens: string[]): Promise<void> {
if (tokens.length === 0) return;
await this.db.query(
'UPDATE push_tokens SET is_active = false, updated_at = NOW() WHERE token = ANY($1)',
[tokens]
);
// Remove from Redis cache
if (tokens.length > 0) {
await this.redis.srem(`push_tokens:${userId}`, ...tokens);
}
console.log(`Marked ${tokens.length} tokens as invalid for user ${userId}`);
}
async getUserTokens(userId: string): Promise<PushToken[]> {
// Try cache first
const cachedTokens = await this.redis.smembers(`push_tokens:${userId}`);
if (cachedTokens.length > 0) {
// Get full token info from database
const tokens = await this.db.query(`
SELECT token, platform, device_id, created_at
FROM push_tokens
WHERE user_id = $1 AND is_active = true AND token = ANY($2)
`, [userId, cachedTokens]);
return tokens;
}
// Cache miss, get from database and populate cache
const tokens = await this.db.query(`
SELECT token, platform, device_id, created_at
FROM push_tokens
WHERE user_id = $1 AND is_active = true
ORDER BY updated_at DESC
`, [userId]);
if (tokens.length > 0) {
await this.redis.sadd(
`push_tokens:${userId}`,
...tokens.map(t => t.token)
);
await this.redis.expire(`push_tokens:${userId}`, 86400); // 24 hours
}
return tokens;
}
}
Email Delivery: More Complex Than You Think#
Email seems like the "easy" channel until you deal with deliverability, bounce handling, and vendor limits. Here's the email service that's handled millions of emails without ending up in spam folders:
Email Service with Provider Failover#
The production reality: email providers fail, get rate limited, or have deliverability issues. You need multiple providers and smart routing:
interface EmailProvider {
sendEmail(email: EmailMessage): Promise<EmailResult>;
handleWebhook(payload: any): Promise<WebhookResult>;
getDeliverabilityScore(): Promise<number>;
}
class EmailDeliveryService {
private providers: EmailProvider[] = [];
private primaryProvider: EmailProvider;
private fallbackProviders: EmailProvider[];
constructor() {
// Initialize providers in priority order
this.providers = [
new SendGridProvider(),
new AmazonSESProvider(),
new PostmarkProvider()
];
this.primaryProvider = this.providers[0];
this.fallbackProviders = this.providers.slice(1);
}
async sendEmail(
userId: string,
notification: NotificationEvent
): Promise<EmailDeliveryResult> {
try {
// Get user email and preferences
const user = await this.getUserWithEmailPrefs(userId);
if (!user.email || !user.emailEnabled) {
return {
success: false,
reason: 'email_disabled',
attempts: []
};
}
// Check if user is on suppression list
if (await this.isUserSuppressed(user.email)) {
return {
success: false,
reason: 'user_suppressed',
attempts: []
};
}
// Render email content
const emailContent = await this.renderEmailContent(notification, user);
// Prepare email message
const emailMessage: EmailMessage = {
to: user.email,
from: this.getFromAddress(notification.type),
subject: emailContent.subject,
html: emailContent.html,
text: emailContent.text,
metadata: {
userId,
notificationId: notification.id,
notificationType: notification.type
},
tags: [notification.type, `user:${userId}`],
unsubscribeUrl: this.generateUnsubscribeUrl(userId, notification.type)
};
// Try primary provider first
let result = await this.attemptDelivery(this.primaryProvider, emailMessage);
if (!result.success) {
// Try fallback providers
for (const provider of this.fallbackProviders) {
console.warn(`Primary email provider failed, trying fallback: ${provider.constructor.name}`);
result = await this.attemptDelivery(provider, emailMessage);
if (result.success) break;
}
}
// Store delivery result
await this.storeDeliveryResult(notification.id, 'email', result);
return {
success: result.success,
attempts: [result],
providerId: result.providerId,
messageId: result.messageId
};
} catch (error) {
console.error('Email delivery failed:', error);
return {
success: false,
reason: 'delivery_error',
error: error.message,
attempts: []
};
}
}
private async attemptDelivery(
provider: EmailProvider,
email: EmailMessage
): Promise<EmailAttemptResult> {
const startTime = Date.now();
try {
const result = await provider.sendEmail(email);
const duration = Date.now() - startTime;
return {
providerId: provider.constructor.name,
success: result.success,
messageId: result.messageId,
duration,
response: result.response
};
} catch (error) {
const duration = Date.now() - startTime;
return {
providerId: provider.constructor.name,
success: false,
duration,
error: error.message,
shouldRetry: this.isRetryableError(error)
};
}
}
private async renderEmailContent(
notification: NotificationEvent,
user: User
): Promise<EmailContent> {
// Get email template
const template = await this.templateService.getTemplate(
notification.type,
'email',
user.locale
);
// Render with user data and notification data
const context = {
user,
...notification.data,
unsubscribeUrl: this.generateUnsubscribeUrl(user.id, notification.type),
preferencesUrl: this.generatePreferencesUrl(user.id)
};
const rendered = await this.templateService.render(template, context);
// Convert markdown to HTML if needed
const html = this.markdownToHtml(rendered.body);
const text = this.htmlToText(html);
return {
subject: rendered.subject,
html,
text
};
}
}
Email Bounce and Complaint Handling#
The part nobody talks about: handling bounces, complaints, and unsubscribes properly is critical for deliverability:
class EmailWebhookHandler {
constructor(
private db: Database,
private suppressionService: SuppressionService
) {}
async handleWebhook(
provider: string,
payload: any
): Promise<WebhookProcessResult> {
try {
const events = this.parseProviderWebhook(provider, payload);
for (const event of events) {
await this.processEmailEvent(event);
}
return { success: true, eventsProcessed: events.length };
} catch (error) {
console.error('Webhook processing failed:', error);
return { success: false, error: error.message };
}
}
private async processEmailEvent(event: EmailEvent): Promise<void> {
// Update delivery record
await this.db.query(`
UPDATE notification_deliveries
SET status = $1, delivered_at = $2, error_message = $3, provider_response = $4
WHERE provider_id = $5
`, [
event.status,
event.timestamp,
event.error,
JSON.stringify(event.rawData),
event.messageId
]);
// Handle specific event types
switch (event.type) {
case 'bounce':
await this.handleBounce(event);
break;
case 'complaint':
await this.handleComplaint(event);
break;
case 'unsubscribe':
await this.handleUnsubscribe(event);
break;
case 'delivered':
await this.handleDelivery(event);
break;
}
}
private async handleBounce(event: EmailEvent): Promise<void> {
const bounceType = event.bounceType || 'unknown';
if (bounceType === 'permanent') {
// Suppress permanently bounced email
await this.suppressionService.addSuppression({
email: event.recipient,
reason: 'permanent_bounce',
source: 'webhook',
metadata: {
bounceSubType: event.bounceSubType,
messageId: event.messageId
}
});
console.log(`Permanently suppressed ${event.recipient} due to hard bounce`);
} else if (bounceType === 'temporary') {
// Track temporary bounces, suppress after threshold
const bounceCount = await this.incrementBounceCount(event.recipient);
if (bounceCount >= 5) {
await this.suppressionService.addSuppression({
email: event.recipient,
reason: 'repeated_soft_bounces',
source: 'auto_suppression',
metadata: { bounceCount }
});
console.log(`Auto-suppressed ${event.recipient} after ${bounceCount} soft bounces`);
}
}
}
private async handleComplaint(event: EmailEvent): Promise<void> {
// Immediately suppress users who mark emails as spam
await this.suppressionService.addSuppression({
email: event.recipient,
reason: 'spam_complaint',
source: 'webhook',
metadata: {
messageId: event.messageId,
complaintType: event.complaintSubType
}
});
// Also add to global suppression list
await this.db.query(`
UPDATE users SET email_enabled = false
WHERE email = $1
`, [event.recipient]);
console.log(`Suppressed ${event.recipient} due to spam complaint`);
}
}
SMS and Webhook Channels: The Supporting Cast#
SMS and webhooks round out the multi-channel approach. Here's how to implement them reliably:
SMS Delivery Service#
SMS is the "nuclear option" for critical notifications. Keep it simple and reliable:
class SMSDeliveryService {
private provider: SMSProvider;
private fallbackProvider: SMSProvider;
constructor() {
this.provider = new TwilioProvider();
this.fallbackProvider = new AmazonSNSProvider();
}
async sendSMS(
userId: string,
notification: NotificationEvent
): Promise<SMSDeliveryResult> {
try {
const user = await this.getUserWithSMSPrefs(userId);
if (!user.phone || !user.smsEnabled) {
return { success: false, reason: 'sms_disabled' };
}
// SMS content should be concise
const content = await this.renderSMSContent(notification, user);
// Try primary provider
let result = await this.provider.sendSMS({
to: user.phone,
message: content,
metadata: {
userId,
notificationId: notification.id
}
});
if (!result.success) {
// Try fallback
result = await this.fallbackProvider.sendSMS({
to: user.phone,
message: content,
metadata: { userId, notificationId: notification.id }
});
}
await this.storeDeliveryResult(notification.id, 'sms', result);
return result;
} catch (error) {
console.error('SMS delivery failed:', error);
return { success: false, error: error.message };
}
}
private async renderSMSContent(
notification: NotificationEvent,
user: User
): Promise<string> {
const template = await this.templateService.getTemplate(
notification.type,
'sms',
user.locale
);
const rendered = await this.templateService.render(template, {
user,
...notification.data
});
// SMS has character limits
return this.truncateForSMS(rendered.body, 160);
}
}
Webhook Delivery for Integrations#
Webhooks are how you integrate with external systems. Make them reliable and well-documented:
class WebhookDeliveryService {
private httpClient: HTTPClient;
private retryQueue: Queue;
constructor() {
this.httpClient = new HTTPClient({
timeout: 10000,
retries: 3,
retryDelay: 1000
});
}
async sendWebhook(
userId: string,
notification: NotificationEvent
): Promise<WebhookDeliveryResult> {
try {
// Get user's webhook configurations
const webhookConfigs = await this.getWebhookConfigs(userId, notification.type);
if (webhookConfigs.length === 0) {
return { success: false, reason: 'no_webhooks_configured' };
}
// Send to each configured webhook
const deliveryPromises = webhookConfigs.map(config =>
this.deliverToWebhook(config, notification)
);
const results = await Promise.allSettled(deliveryPromises);
const successful = results.filter(r =>
r.status === 'fulfilled' && r.value.success
).length;
return {
success: successful > 0,
delivered: successful,
total: webhookConfigs.length,
results: results.map(r =>
r.status === 'fulfilled' ? r.value : { success: false, error: r.reason }
)
};
} catch (error) {
console.error('Webhook delivery failed:', error);
return { success: false, error: error.message };
}
}
private async deliverToWebhook(
config: WebhookConfig,
notification: NotificationEvent
): Promise<WebhookResult> {
const payload = {
id: notification.id,
type: notification.type,
userId: notification.userId,
data: notification.data,
timestamp: notification.scheduledAt || new Date(),
signature: this.generateSignature(config.secret, notification)
};
try {
const response = await this.httpClient.post(config.url, payload, {
headers: {
'Content-Type': 'application/json',
'X-Webhook-Signature': payload.signature,
'User-Agent': 'NotificationSystem/1.0'
}
});
return {
success: response.status >= 200 && response.status <300,
statusCode: response.status,
response: response.data
};
} catch (error) {
return {
success: false,
error: error.message,
shouldRetry: this.isRetryableHttpError(error)
};
}
}
}
Channel Coordination and Delivery Logic#
The real complexity comes from coordinating across channels intelligently:
class MultiChannelDeliveryOrchestrator {
async deliverNotification(notification: NotificationEvent): Promise<void> {
// Get user preferences for this notification type
const preferences = await this.preferenceManager
.getEnabledChannels(notification.userId, notification.type);
if (preferences.length === 0) {
await this.analytics.trackSkipped(notification.id, 'no_enabled_channels');
return;
}
// Apply delivery rules based on notification priority and type
const deliveryPlan = this.createDeliveryPlan(notification, preferences);
// Execute delivery plan
const results = await this.executeDeliveryPlan(deliveryPlan);
// Track overall delivery success
await this.analytics.trackMultiChannelDelivery(notification.id, results);
}
private createDeliveryPlan(
notification: NotificationEvent,
enabledChannels: NotificationChannel[]
): DeliveryPlan {
const plan: DeliveryPlan = {
immediate: [],
delayed: [],
conditional: []
};
// Critical notifications go to all channels immediately
if (notification.priority === 'critical') {
plan.immediate = enabledChannels;
return plan;
}
// Normal notifications follow user preferences and smart rules
for (const channel of enabledChannels) {
if (channel === 'in_app' || channel === 'push') {
plan.immediate.push(channel);
} else if (channel === 'email') {
// Email can be delayed for batching
plan.delayed.push({
channel,
delay: this.getEmailBatchDelay(notification.userId)
});
} else {
plan.conditional.push({
channel,
condition: this.getDeliveryCondition(channel, notification)
});
}
}
return plan;
}
}
Lessons from the Delivery Trenches#
After debugging everything from WebSocket connection storms to email deliverability crises, here are the hard-won lessons:
-
Connections are ephemeral: Build your WebSocket infrastructure assuming connections will drop. Store critical state outside the connection.
-
Push tokens expire: Have a robust token management system that handles invalid tokens gracefully and re-registers tokens when needed.
-
Email deliverability is an art: Multiple providers, proper bounce handling, and suppression lists aren't optional - they're survival necessities.
-
Every channel has rate limits: Build your system to respect provider limits and implement intelligent backoff strategies.
-
Users change their minds: Make it easy to update preferences and handle opt-outs immediately. Your deliverability depends on it.
-
Monitor everything: Each channel needs specific monitoring. WebSocket connection counts, push delivery rates, email bounce rates, SMS costs - track them all.
In the next part of this series, we'll dive into the production war stories that taught me these lessons. We'll cover the debugging techniques and monitoring strategies that actually work when your notification system is melting down during a critical business moment.
The multi-channel delivery system we've built here handles the happy path well, but the real test comes when things go wrong. And in notification systems, something always goes wrong.
Building a Scalable User Notification System
A comprehensive 4-part series covering the design, implementation, and production challenges of building enterprise-grade notification systems. From architecture and database design to real-time delivery, debugging at scale, and performance optimization.
All Posts in This Series
Comments (0)
Join the conversation
Sign in to share your thoughts and engage with the community
No comments yet
Be the first to share your thoughts on this post!
Comments (0)
Join the conversation
Sign in to share your thoughts and engage with the community
No comments yet
Be the first to share your thoughts on this post!