The Problem
Claude.ai Pro imposes a rolling message quota and a 4-hour session limit. Hit either mid-task and your flow is destroyed — context lost, momentum gone, session reset. For sustained engineering work this is a recurring interruption at the worst possible moment.
Beyond limits, every prompt and response routes through Anthropic’s infrastructure. For sensitive code, client data, or regulated environments, that’s a data sovereignty concern with no mitigation.
The Approach
Shell scripts that authenticate via MFA, obtain short-lived STS credentials, and launch Claude Code against your own AWS Bedrock endpoint. No subscription, no session limits, no message quotas. Your prompts never leave your AWS account.
The launcher automatically probes for the best available model (Opus > Sonnet > Haiku) using a live 1-token invocation with an 8-second timeout, caches the result for the session, and notifies if a newer same-tier model exists but isn’t yet accessible.
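The launcher flow described above (MFA, 6-hour STS session, 1-token probe with an 8-second budget, session cache, launch), as an added example in POSIX sh that is not the claude-personal project's own script. The MFA_SERIAL variable, the model IDs and the CLAUDE_CODE_USE_BEDROCK / ANTHROPIC_MODEL environment variables are assumptions; substitute what your account and Claude Code version use.

```shell
#!/usr/bin/env sh
# Added example, not the claude-personal project's own launcher. Assumptions:
# MFA_SERIAL holds your MFA device ARN, the model IDs are illustrative, and
# CLAUDE_CODE_USE_BEDROCK / ANTHROPIC_MODEL are the variables Claude Code reads.
set -eu
REGION="${AWS_REGION:-us-east-1}"
CACHE="${XDG_CACHE_HOME:-$HOME/.cache}/claude-bedrock-model"   # per-session cache file
# Priority order Opus > Sonnet > Haiku (example IDs; substitute those enabled in your account)
MODELS="anthropic.claude-3-opus-20240229-v1:0 anthropic.claude-3-5-sonnet-20240620-v1:0 anthropic.claude-3-haiku-20240307-v1:0"

# Exchange the long-lived keys (loaded from Keychain by the caller) plus an MFA
# code for 6-hour STS credentials, held only in the environment of this process.
mfa_session() {
  printf 'MFA code: '; read -r code
  set -- $(aws sts get-session-token --serial-number "$MFA_SERIAL" --token-code "$code" \
    --duration-seconds 21600 \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' --output text)
  AWS_ACCESS_KEY_ID=$1 AWS_SECRET_ACCESS_KEY=$2 AWS_SESSION_TOKEN=$3
  export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
}

# Live 1-token invocation with an 8-second budget: succeeds only if the model
# is both enabled for this account and actually answering, not merely listed.
probe_model() {
  aws --region "$REGION" --cli-connect-timeout 8 --cli-read-timeout 8 \
    bedrock-runtime invoke-model --model-id "$1" --cli-binary-format raw-in-base64-out \
    --body '{"anthropic_version":"bedrock-2023-05-31","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
    /dev/null >/dev/null 2>&1
}

# First usable model in priority order; a cache younger than the 6-hour session is reused.
select_model() {
  if [ -n "$(find "$CACHE" -mmin -360 2>/dev/null)" ]; then cat "$CACHE"; return 0; fi
  for m in $MODELS; do
    if probe_model "$m"; then
      mkdir -p "$(dirname "$CACHE")"
      printf '%s\n' "$m" | tee "$CACHE"
      return 0
    fi
  done
  echo "no accessible Claude model in $REGION (check Bedrock model access)" >&2
  return 1
}

main() {
  mfa_session
  model="$(select_model)"
  CLAUDE_CODE_USE_BEDROCK=1 AWS_REGION="$REGION" ANTHROPIC_MODEL="$model" exec claude "$@"
}
if [ "${1:-}" = "--launch" ]; then shift; main "$@"; fi
```

Run it as `./launch.sh --launch` after loading the long-lived keys into the environment; the credentials expire with the STS session and the cache file holds only a model ID.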
Operational roles extend Claude’s authority beyond model invocation — read-only analysis, log inspection, or full administrative access — via IAM AssumeRole with MFA-gated trust policies. Each role is a CloudFormation stack with an explicit policy document, deployed and versioned alongside the project.
The Outcome
Uninterrupted Claude Code sessions at ~$3/month. Complete data sovereignty. Layered security (Keychain, MFA, temporary credentials, scoped IAM). Self-contained — clone, deploy, use.
Stack
| Technology | Purpose |
|---|---|
| AWS Bedrock | Model inference (Claude Opus, Sonnet, Haiku) |
| AWS IAM | Least-privilege policies, MFA enforcement |
| AWS STS | Temporary session credentials (6-hour expiry) |
| AWS CloudFormation | Operational role provisioning |
| macOS Keychain | Credential storage (AES-256, never plaintext) |
| Claude Code | AI coding assistant CLI |
| Bash | Launcher, authentication, deployment scripts |
Repository: claude-personal
AWS Well-Architected Alignment
- Operational Excellence: Automated model selection and session management; zero manual AWS console steps after initial setup
- Security: MFA enforced at every layer; credentials are never stored in plaintext; temporary sessions expire automatically; operational roles scoped by explicit policy documents
- Reliability: Automatic fallback across model tiers; session caching prevents redundant probes; graceful handling of throttled or unavailable models
- Performance Efficiency: 1-token probe with timeout distinguishes models that are merely enabled from those that actually respond in usable time; session cache eliminates repeated API calls
- Cost Optimization: Pay-per-token (~$3/month typical); no subscription; Cost Explorer integration for visibility
- Sustainability: No idle infrastructure; serverless inference; ephemeral credentials with no rotation overhead