Wire everything together. Query a SQLite database from Python, transform the data with SQL, feed the results to Claude for analysis. A complete, working pipeline you can show in your portfolio.
A Python script that queries your sales database, generates a revenue summary using SQL, and calls Claude to produce a written analysis — all in under 80 lines of code. This is what a data pipeline looks like in the real world.
You need two packages: anthropic (the official Python SDK for Claude) and pandas (for data manipulation). The sqlite3 module ships with Python's standard library — nothing extra to install.
```bash
pip install anthropic pandas
```
Get your Anthropic API key from console.anthropic.com. Set it as an environment variable:
```bash
# Mac / Linux
export ANTHROPIC_API_KEY="sk-ant-..."

# Windows (Command Prompt)
set ANTHROPIC_API_KEY=sk-ant-...
```
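Before wiring up the client, it can save a confusing error later to confirm the variable is actually visible to Python. A small sketch (the `check_api_key` helper is just for illustration, not part of the pipeline):

```python
import os

def check_api_key() -> bool:
    """Return True if ANTHROPIC_API_KEY is set, printing a safe status line."""
    key = os.environ.get("ANTHROPIC_API_KEY", "")
    if key:
        # Show only a short prefix; never log a full API key
        print(f"Key found: {key[:7]}...")
        return True
    print("ANTHROPIC_API_KEY is not set; export it before running the pipeline")
    return False

check_api_key()
```

Remember that an environment variable set in one terminal is not visible in another, so run this check in the same shell you'll run the pipeline from.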
Python's built-in sqlite3 module connects to any SQLite database file. You can run any SQL query and get results back as a list of tuples or a pandas DataFrame.
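The list-of-tuples path is worth seeing once before reaching for pandas. A minimal sketch, using an in-memory database in place of sales.db so it runs anywhere:

```python
import sqlite3

# In-memory demo database (stands in for sales.db)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (2, 80.5)])

# cursor.fetchall() returns a list of tuples, one per row
rows = conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
conn.close()
print(rows)  # [(1, 120.0), (2, 80.5)]
```

Tuples are fine for quick checks; the DataFrame route below gives you named columns, formatting, and aggregation for free.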
```python
import sqlite3
import pandas as pd

# Connect to the database
conn = sqlite3.connect("sales.db")

# Run a SQL query — get results as a DataFrame
query = """
SELECT
    c.name AS customer,
    c.segment,
    COUNT(*) AS orders,
    SUM(o.amount) AS total_revenue,
    AVG(o.amount) AS avg_order
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
GROUP BY c.name, c.segment
ORDER BY total_revenue DESC
"""
df = pd.read_sql_query(query, conn)
conn.close()

print(df.to_string(index=False))
```
Run it: python query_db.py. You'll see a formatted table of customer revenue data in your terminal.
Now the interesting part — take the query results and ask Claude to analyze them. The pipeline: query → format as text → send to Claude → print the analysis.
```python
import sqlite3
import pandas as pd
import anthropic

# 1. Query the database
conn = sqlite3.connect("sales.db")
query = """
SELECT
    c.name,
    c.segment,
    COUNT(*) AS orders,
    ROUND(SUM(o.amount), 2) AS total_revenue,
    ROUND(AVG(o.amount), 2) AS avg_order
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
GROUP BY c.name, c.segment
ORDER BY total_revenue DESC
"""
df = pd.read_sql_query(query, conn)
conn.close()

# 2. Format as a clean text table
data_text = df.to_string(index=False)
total = df['total_revenue'].sum()

# 3. Send to Claude for analysis
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
prompt = f"""Here is a sales revenue summary from our database:

{data_text}

Total revenue: ${total:,.2f}

Please analyze this data in 3-4 sentences. Identify:
- Who the top customer is and their share of revenue
- Any patterns in the segment data
- One actionable recommendation for the sales team"""

message = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}]
)

# 4. Print the analysis
print("\n=== DATA ===\n")
print(data_text)
print(f"\nTotal revenue: ${total:,.2f}")
print("\n=== AI ANALYSIS ===\n")
print(message.content[0].text)
```
claude-3-5-haiku is Claude's fastest and cheapest model — perfect for pipelines that run frequently. For deeper analysis, swap in claude-3-5-sonnet-20241022.
Production pipelines don't run once. They run on a schedule — daily, weekly, or triggered by an event. Here's a clean version you can extend:
```python
import sqlite3

import anthropic
import pandas as pd


def run_sales_report(db_path: str, query: str, prompt_template: str) -> str:
    """Query a SQLite DB, format results, get AI analysis. Returns the analysis."""
    # Query
    conn = sqlite3.connect(db_path)
    df = pd.read_sql_query(query, conn)
    conn.close()

    # Analyze
    client = anthropic.Anthropic()
    prompt = prompt_template.format(data=df.to_string(index=False))
    msg = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}]
    )
    return msg.content[0].text


# Use it
if __name__ == "__main__":
    analysis = run_sales_report(
        db_path="sales.db",
        query="""
            SELECT c.name, SUM(o.amount) AS total
            FROM orders o JOIN customers c ON o.customer_id = c.id
            GROUP BY c.name
            ORDER BY total DESC
        """,
        prompt_template="Analyze this sales data in 3 sentences:\n\n{data}"
    )
    print(analysis)
```
To schedule this, use cron on Mac/Linux or Windows Task Scheduler. Or deploy it as a GitHub Action that runs on a schedule.
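As a sketch, a crontab entry (`crontab -e` on Mac/Linux) that runs the report every weekday at 7:00 might look like the following — the path `/home/you/pipeline/report.py` is hypothetical, so substitute your own:

```cron
# minute hour day-of-month month day-of-week  command
0 7 * * 1-5  cd /home/you/pipeline && /usr/bin/python3 report.py >> report.log 2>&1
```

One gotcha: cron jobs don't inherit your interactive shell's environment, so ANTHROPIC_API_KEY must be defined in the crontab itself (cron accepts `NAME=value` lines above the schedule entries) or loaded by the script.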
You went from your first SELECT to building an AI-powered data pipeline. That's 5 days of real, usable SQL skill. Where you go next is up to you.
Before moving on, make sure you can answer these without looking: