AI Systems · Architecture Confidence: High

AI Document Assistant Architecture Template

Upload PDFs, extract embeddings, and query documents with AI. Generate a complete cloud architecture with cost estimates, Terraform, sequence diagrams, CLI deployment workflows, and a GitHub Actions pipeline — on AWS, Azure, or GCP.

Generates for: AWS, Azure, GCP
Cost Estimates
AWS: $200 / month
Azure: $231 / month
GCP: $211 / month

Production estimates. Your workspace generates actuals.

Architecture Overview

Queues uploaded PDFs for async text extraction, generates embeddings, indexes them for semantic search, and exposes a REST API to query documents in natural language with per-user access controls.
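The query path described above can be illustrated with a small in-memory sketch. It is not the generated implementation: the chunk records, the `owner` field, and the `search` helper are hypothetical names, and a real deployment would delegate ranking to the vector index. The point it shows is that the per-user access filter runs before similarity ranking, so another user's chunks never reach the result set.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(index, query_vec, user_id, top_k=3):
    """Return up to top_k chunks owned by user_id, most similar first."""
    # Access check first: only chunks the caller owns are ever ranked.
    visible = [c for c in index if c["owner"] == user_id]
    ranked = sorted(visible, key=lambda c: cosine(c["vector"], query_vec), reverse=True)
    return ranked[:top_k]

# Toy index with hypothetical records and 2-dimensional vectors.
INDEX = [
    {"id": "a1", "owner": "alice", "vector": [1.0, 0.0], "text": "invoice total"},
    {"id": "a2", "owner": "alice", "vector": [0.0, 1.0], "text": "shipping address"},
    {"id": "b1", "owner": "bob", "vector": [1.0, 0.0], "text": "bob's invoice"},
]

hits = search(INDEX, [0.9, 0.1], "alice")
print([h["id"] for h in hits])  # ['a1', 'a2']
```

Filtering before ranking keeps the access rule independent of the similarity score; in a managed index the same idea is usually expressed as a filter clause on the query.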

Services Selected

~14 cloud services

API Gateway, Lambda, S3, EventBridge, SQS, +9 more

AWS Architecture Diagram

Full topology with all services and request flows — switch providers above to compare.

[Diagram: AWS AI Document Assistant, production implementation path]

Document Ingestion and Indexing (Async): store the file first, then process asynchronously.
1. User
2. Amazon CloudFront
3. Amazon API Gateway
4. Upload API
5. Amazon S3 (store file first)
6. Amazon EventBridge (blob event)
7. Amazon SQS Extraction Queue (buffer job)
8. Extraction Worker
9. Amazon Textract (extract text)
10. Amazon SQS Chunking Queue (dispatch chunks)
11. Embedding Worker
12. Amazon Bedrock Embeddings (generate vectors)
13. OpenSearch Serverless (index chunks)
14. Metadata Store (store metadata)

Query and Response (Sync): retrieve chunks, ground the answer, return the response.
1. User
2. Amazon CloudFront
3. Amazon API Gateway
4. Query API
5. OpenSearch Serverless (retrieve chunks)
6. Amazon Bedrock Chat (ground answer)
7. Amazon DynamoDB Metadata
8. Response (return response)

Cross-Cutting Services:
Secrets Manager: secrets used by APIs and workers
AWS X-Ray: traces from APIs and workers
Amazon CloudWatch: metrics and alerts
CloudWatch Logs: centralized logs

AI Document Assistant - AWS - Production implementation lanes - CloudDesign AI
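The hand-off from the blob event to the extraction queue can be sketched as a pure function. This is an illustration under assumptions: the input follows the shape of an S3 "Object Created" event as delivered through EventBridge, while the job fields (`bucket`, `key`, `size_bytes`), the PDF-only filter, and the sample bucket name are hypothetical choices, not part of the generated workflow.

```python
import json

def build_extraction_job(event):
    """Turn an S3 'Object Created' EventBridge event into a queue message body.

    Returns None for objects that are not PDFs so the caller can skip them.
    """
    detail = event["detail"]
    key = detail["object"]["key"]
    if not key.lower().endswith(".pdf"):
        return None  # ignore non-PDF uploads
    job = {
        "bucket": detail["bucket"]["name"],
        "key": key,
        "size_bytes": detail["object"].get("size", 0),
    }
    return json.dumps(job, sort_keys=True)

# Hypothetical sample event, trimmed to the fields the function reads.
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "demo-documents"},
        "object": {"key": "uploads/alice/report.pdf", "size": 1048576},
    },
}

print(build_extraction_job(sample_event))
```

Keeping the message small (a pointer to the object rather than the file itself) matches the "store file first" ordering in the diagram: the worker reads the PDF from storage when it picks up the job.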

Architecture Breakdown

Every major component, what it does, and the AWS service powering it.

API Gateway

Amazon API Gateway

Routes, authenticates, and rate-limits incoming requests.

Upload API

AWS Lambda

Accepts uploads and writes each file to the document store before any processing starts.

Document Store

Amazon S3

Stores uploaded documents with durability and access controls.

Blob Event Trigger

Amazon EventBridge

Emits an event when a new object lands in the document store.

Extraction Queue

Amazon SQS

Buffers extraction jobs and decouples uploads from processing.

Extraction Worker

AWS Lambda

Consumes extraction jobs and calls the text extraction service.

Text Extraction

Amazon Textract

Extracts text from PDFs, including OCR for scanned pages.

Chunking Queue

Amazon SQS

Buffers extracted text chunks for the embedding worker.

Embedding Worker

AWS Lambda

Sends chunks to the embedding model and indexes the resulting vectors.

Embedding Model

Amazon Bedrock

Generates embedding vectors for document chunks.

Vector Search

Amazon OpenSearch Service

Indexes and retrieves content with full-text and vector search.

Query API

AWS Lambda

Retrieves relevant chunks, grounds the answer, and returns the response.

Chat Model

Amazon Bedrock

Generates the answer from the retrieved chunks.

Metadata Store

Amazon DynamoDB

Stores document metadata and per-user usage records.

Cost Estimate — AWS

Representative production estimate. Your workspace generates a breakdown based on your actual configuration.

AWS: $200 / month estimated

| Service | Purpose | Monthly cost |
| --- | --- | --- |
| S3 | Document storage | $5/mo |
| SQS | Processing queue | $4/mo |
| Lambda | Extraction & query | $12/mo |
| Textract | OCR per page | $25/mo |
| Bedrock | Embedding & inference | $60/mo |
| OpenSearch | Vector index | $72/mo |
| DynamoDB | Usage tracking | $8/mo |
| API Gateway | REST API | $14/mo |

Total estimate: $200 / month
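As a quick check on the arithmetic, the per-service figures above can be summed and ranked by share of the total. The numbers are copied from this estimate; the dictionary name and the print format are illustrative only.

```python
# Monthly estimates in USD, copied from the AWS cost breakdown above.
MONTHLY_COST_USD = {
    "S3": 5,
    "SQS": 4,
    "Lambda": 12,
    "Textract": 25,
    "Bedrock": 60,
    "OpenSearch": 72,
    "DynamoDB": 8,
    "API Gateway": 14,
}

total = sum(MONTHLY_COST_USD.values())

# Rank services by cost to see where the budget goes.
ranked = sorted(MONTHLY_COST_USD.items(), key=lambda kv: kv[1], reverse=True)
for service, cost in ranked:
    print(f"{service:<12} ${cost:>3}  {cost / total:.0%}")

print(f"Total: ${total} / month")  # Total: $200 / month
```

OpenSearch and Bedrock together account for $132 of the $200, so the vector index and model usage are the first places to look when tuning this estimate.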

What CloudDesign AI Generates

Every generation produces a complete set of production-ready artifacts.

🗺️

Architecture Diagram

Full topology showing every service and how traffic flows between them.

↔️

Sequence Diagrams

Request lifecycle flows for upload, query, and overall system paths.

💰

Cost Analysis

Per-service cost breakdown with total estimate for the selected provider.

🏗️

Terraform Code

Complete infrastructure-as-code export you can deploy immediately.

⚙️

CLI Deployment Workflow

Ordered provisioning commands for every service in the architecture.

🚀

GitHub Actions Pipeline

Ready-to-commit `.github/workflows/terraform.yml` for CI/CD.

⚖️

Tradeoff Analysis

Cost, scalability, reliability, and operational complexity breakdown.

Production Checklist

Architecture-specific risks and mitigations before you go live.

Terraform Preview — AWS

Provider-specific infrastructure code. The full export is available after generating.

main.tf — AWS
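# Bucket for uploaded PDFs. var.prefix is expected to come from the exported variables.
resource "aws_s3_bucket" "documents" {
  bucket        = "${var.prefix}-documents"
  force_destroy = false # refuse to delete the bucket while it still holds objects
}

# Queue that buffers ingestion jobs; messages stay hidden for 300 s while a worker holds them.
resource "aws_sqs_queue" "ingestion" {
  name                       = "${var.prefix}-ingestion"
  visibility_timeout_seconds = 300
}

# OpenSearch domain used as the vector index.
resource "aws_opensearch_domain" "vectors" {
  domain_name    = "${var.prefix}-vectors"
  engine_version = "OpenSearch_2.11"
}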

# + 280 more lines — generate the full export →

Full Terraform export includes: variables, outputs, IAM roles, environment configs, and module structure.


CLI Preview — AWS

Ordered provisioning commands for every service. The full workflow is generated in your workspace.

deploy.sh — AWS
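# Assumes PREFIX, REGION, and ROLE_ARN are set. extractor.zip is a hypothetical deployment package.
aws s3api create-bucket --bucket "$PREFIX-documents" --region "$REGION" \
  --create-bucket-configuration LocationConstraint="$REGION"  # omit this flag in us-east-1
aws sqs create-queue --queue-name "$PREFIX-ingestion" \
  --attributes VisibilityTimeout=300
aws opensearch create-domain --domain-name "$PREFIX-vectors" \
  --engine-version OpenSearch_2.11
aws lambda create-function --function-name "$PREFIX-extractor" \
  --runtime python3.12 --handler handler.main \
  --role "$ROLE_ARN" --zip-file fileb://extractor.zip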

# + 22 more commands — generate the full workflow →

Full CLI workflow includes: bucket creation, networking, IAM setup, application deployment, and health checks — in order.


Cloud Provider Mapping

Every architectural function mapped to its native service on AWS, Azure, and GCP.

| Function | AWS | Azure | GCP |
| --- | --- | --- | --- |
| CDN / Edge | Amazon CloudFront | Azure Front Door Premium | Cloud CDN |
| WAF / DDoS | AWS WAF + Shield | Azure WAF + DDoS Protection | Cloud Armor |
| API Gateway | Amazon API Gateway | Azure API Management | API Gateway |
| Auth / Roles | Amazon Cognito | Azure AD B2C | Firebase Auth |
| Upload API | AWS Lambda | Azure Functions | Cloud Functions |
| Document Store | Amazon S3 | Azure Blob Storage | Cloud Storage |
| Blob Event Trigger | Amazon EventBridge | Azure Event Grid | Eventarc |
| Extraction Queue | Amazon SQS | Azure Service Bus | Cloud Pub/Sub |
| Extraction Worker | AWS Lambda | Azure Functions | Cloud Functions |
| Text Extraction | Amazon Textract | Azure Document Intelligence | Document AI |
| Chunking Queue | Amazon SQS | Azure Service Bus | Cloud Pub/Sub |
| Embedding Worker | AWS Lambda | Azure Functions | Cloud Functions |
| Embedding Model | Amazon Bedrock | Azure OpenAI Embeddings | Vertex AI Embeddings |
| Query API | AWS Lambda | Azure Functions | Cloud Run |
| Chat Model | Amazon Bedrock | Azure OpenAI Chat | Vertex AI Chat |
| Vector Search | Amazon OpenSearch Service | Azure AI Search | Vertex AI Search |
| Metadata Store | Amazon DynamoDB | Azure Cosmos DB / PostgreSQL | Cloud Firestore / Cloud SQL |
| Secrets Management | AWS Secrets Manager | Azure Key Vault | GCP Secret Manager |
| Application Traces | AWS X-Ray | Azure Application Insights | Cloud Trace |
| Metrics and Alerts | Amazon CloudWatch | Azure Monitor | Cloud Monitoring |
| Centralized Logs | CloudWatch Logs | Azure Log Analytics | Cloud Logging |

Architecture Tradeoffs

How AWS, Azure, and GCP compare across the dimensions that matter most for this architecture.

Cost Efficiency

AWS: 4 · Azure: 3 · GCP: 4

AWS and GCP offer competitive OCR pricing; Azure Document Intelligence costs more per page at scale.

Scalability

AWS: 5 · Azure: 4 · GCP: 5

All providers scale well; GCP Vertex AI Search and AWS OpenSearch both handle billions of vectors.

AI/ML Ecosystem

AWS: 4 · Azure: 5 · GCP: 4

Azure OpenAI has the tightest GPT integration; Bedrock and Vertex AI both support multiple model families.

Operational Simplicity

AWS: 3 · Azure: 4 · GCP: 4

GCP and Azure managed services require less cluster management than an Amazon OpenSearch Service domain, which is managed but still needs instance and storage sizing.

Security & Compliance

AWS: 5 · Azure: 5 · GCP: 4

AWS and Azure have broader compliance certification portfolios for regulated industries.

Production Risks for This Architecture

Known failure modes with concrete mitigations — included in every generated checklist.

1. Lambda timeout on large PDFs: documents over 50 MB with dense text can exceed Lambda's 15-minute execution limit. Split them into page-chunked jobs via SQS.

2. Embedding cost runaway: generating embeddings for every page of every upload costs more than expected at scale. Deduplicate by content hash before embedding.

3. Degraded RAG accuracy on scanned PDFs: poor OCR quality lowers answer accuracy. Apply a confidence threshold to Textract output and flag low-confidence documents for user review.
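The three mitigations above can each be sketched as a small helper. These are illustrations under stated assumptions, not the generated checklist code: the function names, the 25-pages-per-job default, and the 80.0 threshold are hypothetical, and the only external shape relied on is that Textract blocks carry `BlockType` and a `Confidence` score from 0 to 100.

```python
import hashlib

def page_ranges(total_pages, pages_per_job=25):
    """Split a document into inclusive (first, last) page ranges, one per queue job."""
    return [
        (start, min(start + pages_per_job - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_job)
    ]

def needs_embedding(chunk_text, seen_hashes):
    """Return True the first time a chunk's content is seen, and record its hash."""
    digest = hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        return False  # identical content was already embedded
    seen_hashes.add(digest)
    return True

def low_confidence(blocks, threshold=80.0):
    """Flag a document whose mean LINE confidence falls below the threshold."""
    scores = [b["Confidence"] for b in blocks if b.get("BlockType") == "LINE"]
    if not scores:
        return True  # no recognised lines: treat as needing review
    return sum(scores) / len(scores) < threshold

print(page_ranges(60))  # [(1, 25), (26, 50), (51, 60)]

seen = set()
print([needs_embedding(t, seen) for t in ["intro", "terms", "intro"]])  # [True, True, False]

scanned = [{"BlockType": "LINE", "Confidence": 62.5}, {"BlockType": "LINE", "Confidence": 71.0}]
print(low_confidence(scanned))  # True
```

In practice `seen_hashes` would live in a shared store rather than in memory so that deduplication holds across worker invocations; the in-memory set is used here only to keep the sketch self-contained.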

Key Capabilities Covered

PDF upload + async extraction
Embedding generation queue
Vector search (RAG)
Secure object storage
Per-user usage tracking


Generate the AI Document Assistant Architecture

Get the full architecture diagram, cost breakdown, Terraform, CLI workflow, and GitHub Actions pipeline — specific to your chosen cloud provider.

Free account · No credit card required · 5 architecture runs per month