Day 05 Integration & Deployment

Production Patterns: Error Handling, Rate Limits, and Caching

Day 5 covers the production patterns that separate hobby projects from reliable apps: proper error handling, API rate limit management, response caching, and cost optimization.

~1 hour · Hands-on · Precision AI Academy

Today’s Objective

By the end of today, your app should retry failed API calls automatically, cache responses to repeated queries, and track token usage and cost per session.

Error Handling with Retries

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function callWithRetry(params, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await client.messages.create(params);
    } catch (error) {
      if (error instanceof Anthropic.RateLimitError) {
        // Wait before retrying (exponential backoff)
        const wait = Math.pow(2, attempt) * 1000;
        console.log(`Rate limited. Waiting ${wait}ms...`);
        await new Promise(r => setTimeout(r, wait));
      } else if (error instanceof Anthropic.APIError && error.status >= 500 && attempt < maxRetries) {
        // Server error, retry
        await new Promise(r => setTimeout(r, 1000 * attempt));
      } else {
        throw error; // Don't retry auth errors or other client errors
      }
    }
  }
  throw new Error('Max retries exceeded');
}
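One refinement worth noting: with a fixed 2^attempt backoff, every client that was rate-limited at the same moment retries at the same moment. Adding random jitter spreads the retries out. A minimal sketch of a jittered delay helper (backoffDelay is my name, not an SDK function):

```javascript
// "Full jitter" backoff: pick a random delay between 0 and the
// exponential cap, bounded by maxMs so waits never grow unbounded.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  const cap = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.floor(Math.random() * cap);
}
```

To use it, swap it in for the fixed `Math.pow(2, attempt) * 1000` wait inside `callWithRetry`.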

Simple Response Caching

In-Memory Cache
import crypto from 'crypto';

const cache = new Map();
const CACHE_TTL = 60 * 60 * 1000; // 1 hour

function cacheKey(params) {
  return crypto.createHash('md5')
    .update(JSON.stringify(params))
    .digest('hex');
}

async function cachedChat(params) {
  const key = cacheKey(params);
  const cached = cache.get(key);

  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    console.log('Cache hit');
    return cached.response;
  }

  const response = await client.messages.create(params);
  cache.set(key, { response, timestamp: Date.now() });
  return response;
}

// Use for deterministic queries (FAQ, document analysis)
// Don't cache conversational messages
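Note that the Map above only ignores stale entries; it never deletes them, so a long-running server leaks memory. A simple sweep (my addition, assuming the same `{ response, timestamp }` entry shape) keeps the cache bounded:

```javascript
// Delete every cache entry older than ttlMs; returns how many were removed.
// `now` is injectable to make the function easy to test.
function evictExpired(cache, ttlMs, now = Date.now()) {
  let removed = 0;
  for (const [key, entry] of cache) {
    if (now - entry.timestamp >= ttlMs) {
      cache.delete(key);
      removed++;
    }
  }
  return removed;
}
```

Run it periodically, for example `setInterval(() => evictExpired(cache, CACHE_TTL), CACHE_TTL)`.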

Token Budget and Cost Control

Token Estimation
// Rough token estimate: ~4 chars per token
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}
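One place the estimator earns its keep is trimming conversation history before a call so the input stays under a budget. A sketch (trimToBudget is a hypothetical helper, and the 4-characters-per-token ratio is a rough heuristic that skews for code and non-English text):

```javascript
// Rough token estimate: ~4 chars per token (same heuristic as above).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Drop the oldest messages until the estimated total fits the budget.
// Always keeps at least the most recent message.
function trimToBudget(messages, maxTokens) {
  const trimmed = [...messages];
  const total = () =>
    trimmed.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (trimmed.length > 1 && total() > maxTokens) {
    trimmed.shift(); // oldest first
  }
  return trimmed;
}
```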

// Track usage per session
function trackUsage(session, response) {
  if (!session.totalTokens) session.totalTokens = 0;
  session.totalTokens += response.usage.input_tokens + response.usage.output_tokens;

  // Warn if approaching budget
  if (session.totalTokens > 50000) {
    console.warn('Session token budget warning:', session.totalTokens);
  }
}

// Cost calculation (example rates: $3/M input, $15/M output; check current model pricing)
function estimateCost(inputTokens, outputTokens) {
  const inputCost = inputTokens * 0.000003;   // $3 per million input tokens
  const outputCost = outputTokens * 0.000015; // $15 per million output tokens
  return inputCost + outputCost;
}
Day 5 Exercise: Harden Your App for Production
  1. Replace all direct API calls in your server with callWithRetry().
  2. Add the response cache to your document analysis endpoint.
  3. Add token tracking to your session object — log usage after each exchange.
  4. Set a max_tokens limit based on your budget and test that long responses truncate gracefully.
  5. Deploy to Railway or a cloud provider. Test the error handling by temporarily using an invalid API key.

Want to go deeper in 2 days?

Our in-person AI bootcamp covers advanced AI development, agentic systems, and production deployment. Five cities. $1,490.

Reserve Your Seat →

Supporting Resources

Go deeper with these references.

- Claude API Reference (Anthropic): official documentation for the Messages API, tool use, and streaming.
- @anthropic-ai/sdk (npm): official Node.js SDK for the Anthropic API with TypeScript support.
- Anthropic Cookbook (GitHub): official Anthropic code examples for common JavaScript + Claude patterns.

Day 5 Checkpoint

Before moving on, make sure you can explain — without looking back at the code — which errors are worth retrying, when response caching is safe, and how to track token usage and cost per session.

Course Complete