
Building an AI Call Center: Architecture

Tags: aws, cdk, ai, architecture, typescript

The Pitch

What if you could spin up a voice agent that answers phone calls, has a real conversation with the caller, and knows when to escalate? Not a press-1-for-sales IVR — an actual AI that listens, reasons, uses tools, and speaks back naturally.

That's what I set out to build. The result is a platform that wires together Amazon Connect, Amazon Bedrock, and a management dashboard — all deployed with CDK and TypeScript.

This is Part 1 of a three-part series. Here I'll cover the architecture and infrastructure design. Part 2 dives into the conversation engine, and Part 3 walks through the dashboard.

High-Level Architecture

The platform splits into four CDK stacks per stage:

Stack             Region        What it does
CertificateStack  us-east-1     DNS zone, NS delegation, ACM wildcard cert
AiStack           eu-central-1  DynamoDB, API Gateway, 4 Lambdas, EventBridge, Cognito
ConnectStack      eu-central-1  Connect instance, Lex bot, contact flow, queue
DashboardStack    eu-central-1  CloudFront + Lambda SSR for the management UI

The stacks have deliberate dependency ordering: ConnectStack depends on AiStack (it needs the AI Lambda ARN for the contact flow), and DashboardStack depends on both AiStack (for the API URL and Cognito config) and CertificateStack (for the certificate and zone).
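
The dependency ordering can be modeled as a small graph. This is a dependency-free sketch rather than the project's CDK code: in a real CDK app the same edges would be declared with `stack.addDependency(...)` or arise implicitly from cross-stack references, and CDK works out the deploy order itself. The `dependsOn` map and the `deployOrder` helper are hypothetical names used only for illustration.

```typescript
// Stack names come from the table above; the graph and helper are illustrative only.
type StackName = 'CertificateStack' | 'AiStack' | 'ConnectStack' | 'DashboardStack';

// Each stack lists the stacks that must be deployed before it.
const dependsOn: Record<StackName, StackName[]> = {
  CertificateStack: [],
  AiStack: [],
  ConnectStack: ['AiStack'],
  DashboardStack: ['AiStack', 'CertificateStack'],
};

// Depth-first topological sort: a stack is emitted only after all of its dependencies.
function deployOrder(graph: Record<StackName, StackName[]>): StackName[] {
  const ordered: StackName[] = [];
  const visiting = new Set<StackName>();
  const done = new Set<StackName>();

  const visit = (name: StackName): void => {
    if (done.has(name)) return;
    if (visiting.has(name)) throw new Error(`Dependency cycle at ${name}`);
    visiting.add(name);
    for (const dep of graph[name]) visit(dep);
    visiting.delete(name);
    done.add(name);
    ordered.push(name);
  };

  (Object.keys(graph) as StackName[]).forEach((name) => visit(name));
  return ordered;
}

console.log(deployOrder(dependsOn).join(' -> '));
```

Because CertificateStack and AiStack have no dependencies on each other, either may come first; the only hard constraints are that AiStack precedes ConnectStack and that both AiStack and CertificateStack precede DashboardStack.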

The Call Path

When someone dials in, the call follows this path:

  1. PSTN → Amazon Connect — Connect handles the telephony (SIP, media, queueing)
  2. Connect → AI Conversation Lambda — the contact flow invokes a Lambda on each caller utterance
  3. Lambda → Amazon Bedrock — the Lambda sends conversation history to Claude for a response
  4. Lambda → EventBridge — each turn publishes an event for downstream consumers
  5. Bedrock → Lambda → Connect — the response text goes back to Connect, which speaks it via Polly

On disconnect, Connect fires a native EventBridge event that triggers the Analytics Lambda to finalize the call record.
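
The per-utterance steps can be sketched as a single handler with its collaborators injected. This is a minimal sketch under stated assumptions, not the actual Lambda: `handleTurn`, `TurnDeps`, and `createFakeDeps` are hypothetical, `generateReply` stands in for the Bedrock call, `publishEvent` stands in for the EventBridge publish, and the history store stands in for DynamoDB.

```typescript
// Illustrative only: every type and function name below is an assumption.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

interface TurnDeps {
  loadHistory(contactId: string): Promise<Message[]>;
  saveHistory(contactId: string, history: Message[]): Promise<void>;
  generateReply(history: Message[]): Promise<string>; // stand-in for the Bedrock call
  publishEvent(detailType: string, detail: Record<string, unknown>): Promise<void>; // stand-in for EventBridge
}

// Handles one caller utterance and returns the text for Connect to speak.
async function handleTurn(deps: TurnDeps, contactId: string, utterance: string): Promise<string> {
  const previous = await deps.loadHistory(contactId);
  const withUser: Message[] = [...previous, { role: 'user', content: utterance }];

  const reply = await deps.generateReply(withUser);
  const updated: Message[] = [...withUser, { role: 'assistant', content: reply }];

  await deps.saveHistory(contactId, updated);
  await deps.publishEvent('telephony.agent.turn', { contactId, messageCount: updated.length });
  return reply;
}

// In-memory fakes so the flow can be exercised without AWS.
function createFakeDeps(): TurnDeps & { events: string[] } {
  const store = new Map<string, Message[]>();
  const events: string[] = [];
  return {
    events,
    loadHistory: async (id) => store.get(id) ?? [],
    saveHistory: async (id, history) => { store.set(id, history); },
    generateReply: async (history) => `You said: ${history[history.length - 1].content}`,
    publishEvent: async (detailType) => { events.push(detailType); },
  };
}
```

Injecting the dependencies keeps the turn logic testable without a live Connect instance or model endpoint, which is the main reason to structure a sketch like this.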

Why a Single DynamoDB Table

I chose a single-table design because the access patterns are well-defined and the entities are related. Here's the key structure:

pk                 sk                  Entity
TENANT#<id>        AGENT#<id>          Agent configuration
TENANT#<id>        CONFIG              Tenant settings
TENANT#<id>        WEBHOOK#<id>        Webhook endpoint
CALL#<contactId>   CONVERSATION        Conversation history
CALL#<contactId>   OUTCOME#<agentId>   Call outcome
PHONE#<number>     MAPPING             Phone number mapping

A GSI (gsi1pk / gsi1sk) supports time-ordered queries for call history:

gsi1pk        gsi1sk              Use case
TENANT#<id>   OUTCOMETIME#<iso>   "Show me all calls for this tenant, newest first"

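The payoff is that every Lambda handler shares the same table and the same key-building functions. Related items share a partition key, so a single Query on CALL#<contactId> returns the conversation together with its outcome items. There is no stitching data together across tables and no fan-out reads.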
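
To illustrate the "newest first" access pattern, here is a sketch of the parameters such a GSI query might use. The parameter names (`IndexName`, `KeyConditionExpression`, `ScanIndexForward`, and so on) are standard DynamoDB Query parameters; the index name `gsi1`, the `OUTCOMETIME#` sort-key prefix, and the `recentCallsQuery` helper are assumptions, and the object is built by hand so the example does not depend on the AWS SDK.

```typescript
// Subset of DynamoDB Query parameters, with DocumentClient-style plain values.
interface QueryInput {
  TableName: string;
  IndexName: string;
  KeyConditionExpression: string;
  ExpressionAttributeValues: Record<string, string>;
  ScanIndexForward: boolean;
  Limit: number;
}

// Builds a "calls for this tenant, newest first" query against the GSI.
// The index name and sort-key prefix are assumptions, not confirmed by the post.
function recentCallsQuery(tableName: string, tenantId: string, limit = 25): QueryInput {
  return {
    TableName: tableName,
    IndexName: 'gsi1',
    KeyConditionExpression: 'gsi1pk = :pk AND begins_with(gsi1sk, :prefix)',
    ExpressionAttributeValues: {
      ':pk': `TENANT#${tenantId}`,
      ':prefix': 'OUTCOMETIME#',
    },
    ScanIndexForward: false, // descending sort-key order, so the latest timestamp comes first
    Limit: limit,
  };
}

console.log(recentCallsQuery('telephony-table', 'acme', 10));
```

This ordering only works because ISO 8601 timestamps in a single format and timezone sort lexicographically in chronological order.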

Key Builder Pattern

Early on, every Lambda was littered with string concatenation:

const key = {
  pk: `${TENANT_PREFIX}${tenantId}`,
  sk: `${AGENT_PREFIX}${agentId}`,
};

This got repetitive fast. I extracted key builders into the shared package:

import { tenantPk, agentSk } from '@dawalnut/telephony-shared';

const key = { pk: tenantPk(tenantId), sk: agentSk(agentId) };

Eight functions, zero magic. They just concatenate a prefix with an ID. But they eliminate an entire class of typo bugs and make the DynamoDB operations readable at a glance.
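
For a sense of how small these builders are, here is a sketch based on the prefixes in the key table. Only `tenantPk` and `agentSk` are named in this post; the other function and constant names are guesses at what the remaining builders might look like, not the shared package's actual exports.

```typescript
// Prefixes follow the key table. Only tenantPk and agentSk are named in the post;
// the remaining names are hypothetical.
const tenantPk = (tenantId: string): string => `TENANT#${tenantId}`;
const agentSk = (agentId: string): string => `AGENT#${agentId}`;
const webhookSk = (webhookId: string): string => `WEBHOOK#${webhookId}`;
const callPk = (contactId: string): string => `CALL#${contactId}`;
const outcomeSk = (agentId: string): string => `OUTCOME#${agentId}`;
const phonePk = (phoneNumber: string): string => `PHONE#${phoneNumber}`;

// Fixed sort keys for items that exist once per partition.
const CONFIG_SK = 'CONFIG';
const CONVERSATION_SK = 'CONVERSATION';
const MAPPING_SK = 'MAPPING';

const agentKey = { pk: tenantPk('acme'), sk: agentSk('support-bot') };
const conversationKey = { pk: callPk('contact-123'), sk: CONVERSATION_SK };

console.log(agentKey);        // { pk: 'TENANT#acme', sk: 'AGENT#support-bot' }
console.log(conversationKey); // { pk: 'CALL#contact-123', sk: 'CONVERSATION' }
```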

Event-Driven Side Effects

The platform uses EventBridge as its nervous system. The AI Lambda publishes two event types:

  • telephony.agent.turn — fired after each conversation turn
  • telephony.agent.outcome — fired when the agent resolves or escalates

Three rules route these events:

  1. Disconnect rule — listens for Connect's native DISCONNECTED event → Analytics Lambda
  2. Outcome rule — listens for telephony.agent.outcome → Analytics Lambda
  3. Webhook rule — listens for all dawalnut.telephony events → Webhook Delivery Lambda
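
The three rules can be sketched as EventBridge-style event patterns plus a toy matcher. Several things here are assumptions: that custom events use `dawalnut.telephony` as their source and the event type as `detail-type`, and that the Connect rule matches on source `aws.connect`, detail-type `Amazon Connect Contact Event`, and `detail.eventType`. The `matches` function is a deliberately simplified stand-in for EventBridge's own matching and only handles exact values.

```typescript
// Minimal event shape: the fields these patterns match on.
interface BusEvent {
  source: string;
  'detail-type': string;
  detail: Record<string, unknown>;
}

// A pattern lists allowed values per field; every listed field must match.
interface Pattern {
  source?: string[];
  'detail-type'?: string[];
  detail?: Record<string, string[]>;
}

// Assumed patterns for the three rules (rule names are illustrative).
const rules: Record<string, Pattern> = {
  disconnect: {
    source: ['aws.connect'],
    'detail-type': ['Amazon Connect Contact Event'],
    detail: { eventType: ['DISCONNECTED'] },
  },
  outcome: { source: ['dawalnut.telephony'], 'detail-type': ['telephony.agent.outcome'] },
  webhook: { source: ['dawalnut.telephony'] },
};

// Simplified matcher: exact-value matching only, one level of detail fields.
function matches(pattern: Pattern, event: BusEvent): boolean {
  if (pattern.source && !pattern.source.includes(event.source)) return false;
  if (pattern['detail-type'] && !pattern['detail-type'].includes(event['detail-type'])) return false;
  for (const [field, allowed] of Object.entries(pattern.detail ?? {})) {
    const value = event.detail[field];
    if (typeof value !== 'string' || !allowed.includes(value)) return false;
  }
  return true;
}

// Returns the names of all rules an event would trigger.
function matchingRules(event: BusEvent): string[] {
  return Object.keys(rules).filter((name) => matches(rules[name], event));
}

console.log(matchingRules({
  source: 'dawalnut.telephony',
  'detail-type': 'telephony.agent.outcome',
  detail: { contactId: 'c-1' },
}));
```

Under these assumptions an outcome event fans out to both the Analytics Lambda and the Webhook Delivery Lambda, while a turn event only reaches the webhook rule.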

The Webhook Delivery Lambda loads the tenant's webhook configuration from DynamoDB and dispatches to each endpoint with HMAC, Bearer, or API key authentication. An SQS dead-letter queue catches delivery failures for retry.
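
As a rough illustration of the three authentication schemes, the sketch below builds the headers for one delivery. The post names the schemes but not their configuration shape, so the `WebhookAuth` type, the `authHeaders` function, and the `X-Signature-SHA256` header name are all assumptions. The HMAC itself uses Node's built-in `createHmac`.

```typescript
import { createHmac } from 'node:crypto';

// Hypothetical webhook auth config; the real shape is not shown in the post.
type WebhookAuth =
  | { type: 'hmac'; secret: string }
  | { type: 'bearer'; token: string }
  | { type: 'apiKey'; headerName: string; key: string };

// Returns the auth headers to send alongside a webhook body. Header names are assumptions.
function authHeaders(auth: WebhookAuth, body: string): Record<string, string> {
  switch (auth.type) {
    case 'hmac': {
      // Sign the exact payload being sent so the receiver can recompute and compare.
      const signature = createHmac('sha256', auth.secret).update(body).digest('hex');
      return { 'X-Signature-SHA256': signature };
    }
    case 'bearer':
      return { Authorization: `Bearer ${auth.token}` };
    case 'apiKey':
      return { [auth.headerName]: auth.key };
  }
}

const body = JSON.stringify({ type: 'telephony.agent.outcome', contactId: 'c-1' });
console.log(authHeaders({ type: 'hmac', secret: 'example-secret' }, body));
```

Signing the serialized body rather than the parsed object matters: the receiver must verify against the same bytes it received, so the payload should be serialized once and that string used for both the signature and the request.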

Every side effect is asynchronous and decoupled from the call path. The caller never waits for analytics or webhooks.

Config Validation at Synth Time

Following the same pattern as the portfolio infrastructure, all CDK context values pass through Zod schemas before any resources are created:

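import { z } from 'zod';

// Shape of one stage's CDK context; every field is required.
const stageConfigSchema = z.object({
  accountId: z.string().regex(/^\d{12}$/), // AWS account IDs are exactly 12 digits
  region: z.string().min(1),
  domainName: z.string().min(1),
  rootZoneId: z.string().min(1),
  delegationRoleArn: z.string().min(1),
  connectInstanceAlias: z.string().min(1),
});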

If someone misconfigures a stage, cdk synth fails immediately with a clear Zod error instead of deploying a half-broken stack.
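
To make the fail-fast behavior concrete without pulling in a dependency, here is a hand-rolled stand-in for two of the checks in the schema. It is not Zod and not the project's code; the real project runs the context through the Zod schema. `validateStage` and `StageConfigSubset` are hypothetical names.

```typescript
// Hand-rolled stand-in mirroring two fields of the schema; this is not Zod.
interface StageConfigSubset {
  accountId: string;
  region: string;
}

// Collects every problem first, then throws once, so a misconfigured stage
// reports all of its errors in a single failed synth.
function validateStage(stage: string, raw: Record<string, unknown>): StageConfigSubset {
  const problems: string[] = [];
  const { accountId, region } = raw;

  if (typeof accountId !== 'string' || !/^\d{12}$/.test(accountId)) {
    problems.push('accountId must be a 12-digit string');
  }
  if (typeof region !== 'string' || region.length === 0) {
    problems.push('region must be a non-empty string');
  }
  if (problems.length > 0) {
    // Throwing here stops the program before any stack would be constructed.
    throw new Error(`Invalid config for stage "${stage}": ${problems.join('; ')}`);
  }
  return { accountId: accountId as string, region: region as string };
}

try {
  validateStage('dev', { accountId: '1234', region: '' });
} catch (err) {
  console.log((err as Error).message);
}
```

The point of validating before constructing any stack is that a bad value surfaces as one readable error at synth time instead of as a CloudFormation failure partway through a deployment.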

What's Next

In Part 2, I'll walk through the AI conversation engine — how the Lambda manages multi-turn history, handles tool use, and stays within guardrails.
