aws-samples/sample-bedrock-invocation-analytics

Bedrock Invocation Analytics

English | 中文

Real-time analytics for Amazon Bedrock: monitor token usage, costs, and performance across AWS accounts.

⚠️ This sample is for demonstration purposes only and is not intended for production use. Use at your own risk.

Features

  • Summary cards: invocations, input/output tokens, cache tokens, estimated cost, avg latency, avg TPOT
  • Token usage & cost by model and by caller (chart / pie / table views) with per-token-type cost breakdown (input / output / cache read / cache write)
  • Performance: latency by model (min/avg/max) + latency trend with model selector
  • TPOT by model (min/avg/max) + TTFT trend from CloudWatch (avg/p99)
  • Usage trend over time with paired model/caller filter
  • "Data up to" timestamp in header: reflects actual data freshness (L2 checkpoint), not UI refresh time
  • Auto refresh (10s / 30s / 1min / 5min)
  • Time-aware pricing: correct historical costs even if prices change; separate 5min vs 1h prompt cache rates
  • Pricing settings page: view/edit model pricing with history, weekly auto-sync from LiteLLM
  • Ad-hoc Athena queries on the raw Iceberg event log for deep investigation
  • Multi-account, multi-region support (sidebar selector, friendly account names from config.yaml)
  • Login authentication (configurable via config.yaml)
  • Responsive layout (desktop & mobile)
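
The time-aware pricing feature can be pictured as a lookup into a per-model price history, followed by a per-token-type cost sum. The sketch below is only an illustration under stated assumptions, not the project's implementation: the `PricePoint` shape, the `price_at` and `event_cost` helpers, and every rate shown are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PricePoint:
    """One entry in a model's price history, in USD per 1M tokens (hypothetical shape)."""
    effective_from: datetime
    input: float
    output: float
    cache_read: float
    cache_write_5m: float
    cache_write_1h: float


def price_at(history: list[PricePoint], when: datetime) -> PricePoint:
    """Return the price in effect at `when`. `history` must be sorted by effective_from."""
    idx = bisect_right([p.effective_from for p in history], when) - 1
    if idx < 0:
        raise LookupError("no price was in effect at that time")
    return history[idx]


def event_cost(price: PricePoint, tokens: dict[str, int]) -> float:
    """Cost in USD for one event; 5-minute and 1-hour cache writes are priced separately."""
    rates = {
        "input": price.input,
        "output": price.output,
        "cache_read": price.cache_read,
        "cache_write_5m": price.cache_write_5m,
        "cache_write_1h": price.cache_write_1h,
    }
    return sum(tokens.get(name, 0) * rate for name, rate in rates.items()) / 1_000_000


# Made-up history: the price drops on 2025-06-01.
history = [
    PricePoint(datetime(2025, 1, 1, tzinfo=timezone.utc), 3.0, 15.0, 0.30, 3.75, 6.0),
    PricePoint(datetime(2025, 6, 1, tzinfo=timezone.utc), 2.0, 10.0, 0.20, 2.50, 4.0),
]

# An event from March is still costed at the January price.
old = price_at(history, datetime(2025, 3, 1, tzinfo=timezone.utc))
cost = event_cost(old, {"input": 1000, "output": 500, "cache_write_1h": 2000})
```

Looking prices up by event time rather than by "now" is what keeps historical costs stable when a price changes later.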

Screenshot

[Screenshot: WebUI]

Project Structure

├── deploy/
│   ├── cdk.json              # CDK config
│   ├── app.py                # CDK app entry (hub/spoke routing)
│   ├── hub_stack.py          # Primary account stack
│   ├── spoke_stack.py        # Spoke account stack
│   └── lambda/
│       ├── parse_log.py      # L1: S3 event → normalized JSON → Firehose
│       ├── compute_cost.py   # L2: Athena (Iceberg) → pricing → DynamoDB
│       ├── aggregate_stats.py # Rollup: HOURLY → DAILY → MONTHLY
│       ├── sync_pricing.py   # Weekly pricing sync from LiteLLM
│       └── process_log.py    # (legacy V2 path, retained for reference)
├── webui/
│   ├── main.py               # Entry point (ui.run)
│   ├── dashboard.py          # Dashboard page
│   ├── pricing.py            # Pricing settings page
│   └── data.py               # DynamoDB data access
├── scripts/
│   └── seed_pricing.py       # Seed pricing from LiteLLM
├── config.example.yaml       # Multi-account deployment config
├── deploy.sh                 # CDK deploy script (hub/spoke/all/destroy)
├── start-webui.sh            # WebUI launch script (reads .env.deploy)
└── pyproject.toml            # Dependencies (managed by uv)

Architecture

Two-stage pipeline: L1 (parse) turns every Bedrock call into a structured event in an Iceberg table; L2 (compute) rolls those events up by pricing into DynamoDB for the dashboard to read.
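
As a rough illustration of the L1 step, the sketch below flattens one nested invocation-log record into a flat event and serializes it as a JSON line. It is not taken from parse_log.py: the input field names are assumptions about the Bedrock invocation log schema, and the output keys are hypothetical rather than the project's actual column names.

```python
import json
from typing import Any


def flatten_record(record: dict[str, Any]) -> dict[str, Any]:
    """Flatten one nested log record into a flat event dict.

    Input field names are assumed; output keys are illustrative only.
    Missing token counts default to 0 so downstream sums never see None.
    """
    inp = record.get("input") or {}
    out = record.get("output") or {}
    return {
        "request_id": record.get("requestId"),
        "account_id": record.get("accountId"),
        "region": record.get("region"),
        "model_id": record.get("modelId"),
        "caller": (record.get("identity") or {}).get("arn"),
        "input_tokens": int(inp.get("inputTokenCount") or 0),
        "output_tokens": int(out.get("outputTokenCount") or 0),
        "cache_read_tokens": int(inp.get("cacheReadInputTokenCount") or 0),
        "cache_write_tokens": int(inp.get("cacheWriteInputTokenCount") or 0),
        "error_code": record.get("errorCode"),
    }


# A made-up record for illustration; real records carry more fields.
sample = {
    "requestId": "req-123",
    "accountId": "111122223333",
    "region": "us-west-2",
    "modelId": "example-model",
    "identity": {"arn": "arn:aws:iam::111122223333:role/example"},
    "input": {"inputTokenCount": 1200},
    "output": {"outputTokenCount": 350},
}

event = flatten_record(sample)
line = json.dumps(event) + "\n"  # one newline-delimited JSON event per record
```

Keeping L1 to structure only, with no pricing, is what lets L2 be re-run later without touching the raw logs.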

[Figure: architecture diagram]

ASCII version (for AI/text access)
┌──────────────────────────────────────────────────────────────────────────────┐
│ Primary Account (Hub)                                                        │
│                                                                              │
│  S3 logs ──→ EventBridge ──→ Lambda: parse_log ──→ Firehose ──→ S3 Tables    │
│  (Bedrock)                   [L1: structure only]  (60s buffer) (Iceberg)    │
│                                                           │                  │
│                                  ┌────────────────────────┤                  │
│                                  │                        │                  │
│                                  ▼ every 5 min            │                  │
│                    Lambda: compute_cost                   │                  │
│                    [L2: pricing + aggregation]            │                  │
│                          │                                │                  │
│                          ▼                                ▼                  │
│                    DynamoDB: usage-stats          Athena: ad-hoc query       │
│                    (serving layer for UI)         (via Glue federation)      │
│                          │                                                   │
│                          ▼                                                   │
│                         WebUI                                                │
│                                                                              │
│  DynamoDB: model-pricing ◄── Lambda: sync_pricing (weekly)                   │
│  Lambda: aggregate_stats (HOURLY → DAILY → MONTHLY, daily/monthly)           │
│  IAM Role: SpokeWriteRole (assumed by spokes to write hub Firehose)          │
└──────────────────────────────────────────────────────────────────────────────┘
       ▲ assume role + cross-account firehose:PutRecord
       │
┌──────┴───────────────────────────────────────────────────────────────────────┐
│ Spoke Account(s)                                                             │
│                                                                              │
│  S3 logs ──→ EventBridge ──→ Lambda: parse_log ──→ Hub Firehose              │
│  (Bedrock)                                                                   │
└──────────────────────────────────────────────────────────────────────────────┘

How it works:

  1. Bedrock logs land in each account's own S3 bucket (Bedrock requires same-account/region sinks).
  2. L1 parse_log (per-account Lambda, triggered by S3 events) normalizes each record into a flat JSON event (account, region, model, caller, token counts, cache split, latency, error code) and sends it to the hub's Firehose stream via PutRecordBatch. Spoke Lambdas assume a cross-account role in the hub to do so.
  3. Firehose → S3 Tables: Firehose buffers for about 60s, then upserts into the Iceberg table bedrock_analytics.usage_events, using request_id as the unique key. This table is the source of truth for every Bedrock call and is directly queryable via Athena for ad-hoc investigation.
  4. L2 compute_cost (hub only, triggered by EventBridge every 5 min) reads new events from Iceberg via Athena, looks up time-aware pricing in DynamoDB, computes cost (splitting 5m vs 1h prompt cache), and aggregates into DynamoDB via TransactWriteItems with a dedup guard. Because compute is separate from parse, fixing a pricing bug or changing aggregation logic only requires re-running L2 on historical events, with no re-parsing of raw S3 logs.
  5. aggregate_stats rolls hourly → daily → monthly on schedule. sync_pricing pulls the latest model prices from LiteLLM weekly.
  6. WebUI reads DynamoDB (sub-second) as a serving-layer cache. The header shows "Data up to X" based on L2's checkpoint, so users can tell actual data freshness apart from UI refresh time.
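
One common way to get a dedup guard with TransactWriteItems, as mentioned in step 4, is to pair a conditional Put of a marker item with the counter update, so a replayed event cancels the whole transaction instead of being counted twice. The sketch below only builds the request payload. It is an assumption about how such a guard could look, not the code in compute_cost.py: the table name, the single `pk` key, the `DEDUP#` prefix, and the attribute names are all hypothetical.

```python
from decimal import Decimal


def build_transaction(table: str, request_id: str, bucket_key: str,
                      tokens: int, cost_usd: Decimal) -> list[dict]:
    """Build the TransactItems list for the DynamoDB client's transact_write_items call."""
    return [
        {
            # Dedup guard: this Put fails if the marker already exists,
            # which cancels the Update below as well.
            "Put": {
                "TableName": table,
                "Item": {"pk": {"S": f"DEDUP#{request_id}"}},
                "ConditionExpression": "attribute_not_exists(pk)",
            }
        },
        {
            # Atomic increments on the aggregate bucket (for example one model-hour).
            "Update": {
                "TableName": table,
                "Key": {"pk": {"S": bucket_key}},
                "UpdateExpression": "ADD #inv :one, #tok :t, #cost :c",
                # Name placeholders avoid clashes with DynamoDB reserved words.
                "ExpressionAttributeNames": {
                    "#inv": "invocations",
                    "#tok": "total_tokens",
                    "#cost": "cost_usd",
                },
                "ExpressionAttributeValues": {
                    ":one": {"N": "1"},
                    ":t": {"N": str(tokens)},
                    ":c": {"N": str(cost_usd)},
                },
            }
        },
    ]


items = build_transaction(
    table="usage-stats-example",
    request_id="req-123",
    bucket_key="HOURLY#example-model#2025-03-01T10",
    tokens=1550,
    cost_usd=Decimal("0.0225"),
)
# With AWS credentials available, the payload would be sent as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
# A replayed request_id then surfaces as a TransactionCanceledException.
```

A real table with a sort key would need that key in both `Item` and `Key` as well.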

Prerequisites

  • AWS CDK CLI (npm install -g aws-cdk)
  • uv (Python package manager)
  • AWS credentials configured (aws configure or ~/.aws/credentials)

Deploy

Copy config.example.yaml to config.yaml and fill in your AWS profiles, regions, and account names. The account marked primary: true deploys the full hub stack (DynamoDB, Iceberg, Firehose, WebUI); others deploy a lightweight spoke that forwards events to the hub.
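
The hub/spoke split described above can be sketched as a small partition over the account list. This is only an illustration: apart from `primary`, the keys shown (`name`, `profile`, `region`) are hypothetical stand-ins for whatever config.example.yaml actually defines, and enforcing exactly one primary account is an assumption.

```python
def split_accounts(accounts: list[dict]) -> tuple[dict, list[dict]]:
    """Return (hub, spokes) from a list of account entries.

    Assumes exactly one entry is marked primary; everything else is a spoke.
    """
    primaries = [a for a in accounts if a.get("primary")]
    if len(primaries) != 1:
        raise ValueError(f"expected exactly one primary account, found {len(primaries)}")
    spokes = [a for a in accounts if not a.get("primary")]
    return primaries[0], spokes


# Stand-in for the parsed config file; key names other than `primary` are made up.
accounts = [
    {"name": "main", "profile": "main-admin", "region": "us-west-2", "primary": True},
    {"name": "lab", "profile": "lab-admin", "region": "us-east-1"},
]

hub, spokes = split_accounts(accounts)
```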

# Install dependencies
uv sync

# Deploy primary account (auto-bootstraps CDK if needed)
./deploy.sh hub

# Deploy spoke account(s)
./deploy.sh spoke              # all spokes
./deploy.sh spoke lab          # specific spoke

# Deploy everything (recommended for updates)
./deploy.sh all

Note: After code updates, use ./deploy.sh all to ensure both hub and spoke Lambdas are updated.

For existing buckets, enable S3 EventBridge notifications:

aws s3api put-bucket-notification-configuration --bucket YOUR_BUCKET \
  --notification-configuration '{"EventBridgeConfiguration": {}}'
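
Note that put-bucket-notification-configuration replaces the bucket's entire notification configuration, so the command above would drop any existing SNS, SQS, or Lambda notifications. For buckets that already have some, a read-merge-write sketch with boto3 might look like the following. The helper names are hypothetical, and the AWS call path is untested here.

```python
def with_eventbridge(existing: dict) -> dict:
    """Return a notification configuration that keeps existing rules and enables EventBridge."""
    # The response metadata from the GET call is not part of the configuration.
    merged = {k: v for k, v in existing.items() if k != "ResponseMetadata"}
    merged["EventBridgeConfiguration"] = {}
    return merged


def enable_eventbridge(bucket: str) -> None:
    """Enable EventBridge notifications on `bucket` without discarding existing rules."""
    import boto3  # imported lazily so with_eventbridge can be used without AWS access

    s3 = boto3.client("s3")
    current = s3.get_bucket_notification_configuration(Bucket=bucket)
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=with_eventbridge(current),
    )


# Example with a made-up existing configuration:
existing = {
    "ResponseMetadata": {"HTTPStatusCode": 200},
    "QueueConfigurations": [{"QueueArn": "arn:aws:sqs:us-west-2:111122223333:example",
                             "Events": ["s3:ObjectCreated:*"]}],
}
merged = with_eventbridge(existing)
```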

Deployed Resources

Primary account (Hub):

| Resource | Purpose |
|---|---|
| S3 Bucket (optional) | Raw Bedrock invocation logs (encrypted, lifecycle) |
| Custom Resource | Configures Bedrock invocation logging |
| DynamoDB × 2 | usage-stats (serving layer + DEDUP + META) and model-pricing (time-aware) |
| S3 Tables | Bucket + namespace + Iceberg table usage_events; source of truth, queryable via Athena |
| Glue Data Catalog (federated) | s3tablescatalog pointing at the S3 Tables bucket |
| Lake Formation settings | Registers CDK deploy role as admin (required for pure-IAM access to Iceberg) |
| Firehose delivery stream | S3 Tables destination, 60s buffer, request_id upsert key |
| Athena workgroup | For compute_cost and ad-hoc queries |
| Lambda × 6 | parse_log (L1), compute_cost (L2), aggregate_stats, sync_pricing, process_log (legacy), bedrock-invocation-setup (Custom Resource handler) |
| EventBridge × 5 | S3 trigger, v3 S3 trigger, L2 schedule, daily & monthly rollup, weekly pricing sync |
| IAM Roles | Firehose delivery role (pre-created by deploy.sh to avoid IAM-propagation race), Lambda execution roles, SpokeWriteRole trusted by spoke accounts |

Spoke accounts:

| Resource | Purpose |
|---|---|
| S3 Bucket (optional) | Raw Bedrock logs |
| Custom Resource | Configures Bedrock invocation logging |
| Lambda × 2 | parse_log (L1, assumes hub role to firehose:PutRecord) and process_log (legacy) |
| EventBridge × 2 | Active v3 S3 trigger and disabled legacy trigger |
| SQS DLQ | Dead-letter queue for failed processing |

Seed Pricing Data

Pricing data is sourced from LiteLLM (286+ Bedrock models):

AWS_DEFAULT_REGION=us-west-2 python3 scripts/seed_pricing.py \
  BedrockInvocationAnalytics-model-pricing YOUR_PROFILE

Start WebUI

./start-webui.sh

Open http://localhost:8060 in your browser.

Cleanup

./deploy.sh destroy              # destroy hub stack

DynamoDB tables and S3 bucket are retained after stack deletion (RemovalPolicy: RETAIN).

Cost

| Service | Pricing | Notes |
|---|---|---|
| Lambda | $0.20/M requests (ARM/Graviton) | L1 parse_log + L2 compute_cost + rollups |
| Firehose | $0.029/GB ingested + small per-record fee | Buffered then Iceberg upsert |
| S3 Tables (Iceberg) | ~$0.023/GB + $0.20/M requests | Partitioned by account_id / year / month / day |
| DynamoDB | Pay-per-request | Now only L2 aggregates write (≈13× fewer writes than V2 since L1 skips DynamoDB) |
| Athena | $5/TB scanned | L2 scans one partition per run; ad-hoc queries extra |
| S3 (raw logs) | ~$0.023/GB/month | Auto-transitions to IA after 90 days |

Monthly estimate (1M Bedrock invocations, Anthropic models with caching):

  • Lambda: ~$0.20 (L1) + ~$0.05 (L2, every 5 min)
  • Firehose + S3 Tables: ~$1 (few hundred MB of Iceberg data)
  • DynamoDB: ~$0.30 (L2 TransactWriteItems, ~3 items/event)
  • Athena: ~$0.10 (L2 scans small hourly partitions; ad-hoc extra)
  • S3 (raw logs): ~$1
  • Total: ~$3/month

Costs scale sub-linearly with invocations: Firehose buffering amortizes per-record overhead, and Iceberg partition pruning keeps Athena scans small as history grows.
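
For reference, the line items in the monthly estimate above can be summed directly. The snippet uses only the figures listed in this section; the dictionary keys are just labels chosen for the example.

```python
# Line items from the monthly estimate (USD, 1M invocations).
monthly_estimate_usd = {
    "lambda_l1": 0.20,
    "lambda_l2": 0.05,
    "firehose_s3_tables": 1.00,
    "dynamodb": 0.30,
    "athena": 0.10,
    "s3_raw_logs": 1.00,
}

total = sum(monthly_estimate_usd.values())
per_invocation = total / 1_000_000

print(f"total: ${total:.2f}/month")  # prints: total: $2.65/month
print(f"per invocation: ${per_invocation:.8f}")
```

The items add up to about $2.65, which the estimate rounds to roughly $3 per month.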

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

About

📊 Multi-account analytics for Amazon Bedrock. Hub + Spoke architecture aggregates invocation logs across AWS accounts into DynamoDB; NiceGUI WebUI shows token usage, cost breakdown, latency, and TPOT in real time.
