#aws
12 posts tagged #aws.
Field Notes: Turning prompt caching on for a production Bedrock workload
Strands' BedrockModel ships with prompt caching off. Two kwargs turn it on, one per-model gotcha can catch you out, and a 10-turn driver measures 99.9% / 99.8% hit ratios on Nova Pro and Sonnet 4.6 against an 8,156-token production system prefix. The per-call usage block proves it in seconds, with no waiting on CloudWatch.
Field Notes: Three things I learned diagnosing a production Bedrock workload
Three findings from a real customer engagement on AWS Bedrock: what a load test was actually doing, why p95 latency was 45 seconds, and the prompt-caching default that costs every team money. Plus three CloudWatch metrics that, between them, catch all three problems.
What a Year 10 study system taught me about production AI failure modes
A personal Bedrock-adjacent build that went through three iterations and an architecture pivot. Five lessons that map directly to production AWS AI work.
Part 2: The MCP Server — Turning ADRs and Incidents into a Queryable Org-Knowledge Surface
The agent doesn't read your wiki. It calls four tools that pull frontmatter-filtered chunks out of a Bedrock Knowledge Base. Here's the contract, the code, and the small decisions that make the difference between an agent that reads your docs and one that knows your org.
Part 3: Wiring It Into AWS DevOps Agent — AgentSpace, register-service, and the IAM Trust Policy That Ate My Afternoon
The MCP server is done. Now we plug it into AWS DevOps Agent: three CDK stacks, the AgentSpace + register-service flow, the composite-principal trust policy that you will get wrong on the first try, and a real-world OIDC gotcha that broke my own blog deploy for a month.
Part 1: Intent vs State — How AWS DevOps Agent Closes the Gap Between What Your System Is and What You Decided It Should Be
When something breaks at 3am, you look at logs, metrics, traces. You don't go and re-read the ADR your team wrote in January. AWS DevOps Agent does. Here's why that changes the first hour of an incident.
Part 6: Cost & Performance for Bedrock AgentCore — Prompt Caching, Model Selection, and CloudWatch Alarms
Real cost breakdown of running an AgentCore agent: prompt caching savings, when to use Nova Pro vs Claude Sonnet, PriceClass_100, idle timeouts, and how to set alarms before your bill surprises you.
Part 5: CI/CD for Bedrock AgentCore with GitHub Actions and AWS OIDC (No Stored Credentials)
How to build a complete CI/CD pipeline for AgentCore using GitHub Actions OIDC: no stored AWS keys, dual-tag ECR strategy, automated Runtime updates, and multi-environment promotion.
Part 4: Running Your AgentCore Agent Locally with Docker (The Right Way)
How to build and run your AgentCore container locally: real AWS credentials, the correct linux/amd64 platform flag, the .env.local pattern, and testing with curl.
Part 3: Building the AI Agent with Strands Agents SDK, Prompt Caching, and AgentCore Memory
How to build the Python agent that runs inside AgentCore: Strands SDK setup, prompt caching that cuts costs by 90%, dual-model strategy, tool definitions, and AgentCore Memory integration.
Part 2: CDK Infrastructure for Amazon Bedrock AgentCore (And Every Gotcha You'll Hit)
A complete CDK v2 TypeScript stack for Bedrock AgentCore — with inline comments for every deployment trap: naming constraints, ECR bootstrap, missing L1 constructs, VPC endpoint conflicts, and more.
Part 1: Why I Chose Amazon Bedrock AgentCore (And What Lambda Gets Wrong for AI Agents)
Before writing a single line of agent code, I spent a week figuring out where to run it. Here's the architecture decision that changed everything — and the Lambda limitations that forced my hand.