Day 05 Integration & Deployment

Build a Data Pipeline — SQL + Python + AI

Wire everything together. Query a SQLite database from Python, transform the data with SQL, feed the results to Claude for analysis. A complete, working pipeline you can show in your portfolio.

~1 hour Hands-on Precision AI Academy

Today’s Objective

By the end of today, you'll have a Python script that queries your sales database, generates a revenue summary using SQL, and calls Claude to produce a written analysis, all in under 80 lines of code. This is what a data pipeline looks like in the real world.

Install dependencies

You need two packages: anthropic (Claude's Python SDK) and pandas (for data manipulation). The sqlite3 module ships with Python's standard library — nothing extra to install.

BASH
pip install anthropic pandas

Get your Anthropic API key from console.anthropic.com. Set it as an environment variable:

BASH
# Mac / Linux
export ANTHROPIC_API_KEY="sk-ant-..."

# Windows (Command Prompt)
set ANTHROPIC_API_KEY=sk-ant-...
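To confirm the variable is actually visible to Python before you run anything, you can use a quick sanity check like this (a separate throwaway script, not part of the pipeline; the 7-character prefix is an arbitrary choice):

```python
import os

# The anthropic SDK reads this variable automatically when you create a client
key = os.environ.get("ANTHROPIC_API_KEY")

if key is None:
    print("ANTHROPIC_API_KEY is not set; anthropic.Anthropic() will raise an error")
else:
    # Print only a short prefix so the full secret never ends up in logs
    print(f"Key found, starts with: {key[:7]}...")
```

If the key doesn't show up, remember that `export` only affects the current shell session — open a new terminal and you'll need to set it again, or add it to your shell profile.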

Python + SQLite3 — Querying Programmatically

Python's built-in sqlite3 module connects to any SQLite database file. You can run any SQL query and get results back as a list of tuples or a pandas DataFrame.

query_db.py
PYTHON
import sqlite3
import pandas as pd

# Connect to the database
conn = sqlite3.connect("sales.db")

# Run a SQL query — get results as a DataFrame
query = """
SELECT
    c.name AS customer,
    c.segment,
    COUNT(*) AS orders,
    SUM(o.amount) AS total_revenue,
    AVG(o.amount) AS avg_order
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
GROUP BY c.name, c.segment
ORDER BY total_revenue DESC
"""

df = pd.read_sql_query(query, conn)
conn.close()

print(df.to_string(index=False))

Run it: python query_db.py. You'll see a formatted table of customer revenue data in your terminal.
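If you don't need pandas, the same sqlite3 module hands results back as plain tuples via a cursor, as mentioned above. A self-contained sketch using an in-memory database (the table and rows here are invented for illustration, not from the lesson's sales.db):

```python
import sqlite3

# In-memory database so this example runs anywhere
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Acme", 120.0), ("Acme", 80.0), ("Globex", 50.0)],
)

# cursor.fetchall() returns a list of tuples, one per row
cursor = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY 2 DESC"
)
rows = cursor.fetchall()
conn.close()

print(rows)  # [('Acme', 200.0), ('Globex', 50.0)]
```

Tuples are fine for small scripts; reach for pd.read_sql_query when you want named columns, formatting, or further transformation.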

Feed the Data to Claude

Now the interesting part — take the query results and ask Claude to analyze them. The pipeline: query → format as text → send to Claude → print the analysis.

pipeline.py
PYTHON
import sqlite3
import pandas as pd
import anthropic
import os

# 1. Query the database
conn = sqlite3.connect("sales.db")
query = """
SELECT
    c.name,
    c.segment,
    COUNT(*) AS orders,
    ROUND(SUM(o.amount), 2) AS total_revenue,
    ROUND(AVG(o.amount), 2) AS avg_order
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
GROUP BY c.name, c.segment
ORDER BY total_revenue DESC
"""
df = pd.read_sql_query(query, conn)
conn.close()

# 2. Format as a clean text table
data_text = df.to_string(index=False)
total = df['total_revenue'].sum()

# 3. Send to Claude for analysis
client = anthropic.Anthropic()

prompt = f"""Here is a sales revenue summary from our database:

{data_text}

Total revenue: ${total:,.2f}

Please analyze this data in 3-4 sentences. Identify:
- Who the top customer is and their share of revenue
- Any patterns in the segment data
- One actionable recommendation for the sales team"""

message = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)

# 4. Print the analysis
print("\n=== DATA ===\n")
print(data_text)
print(f"\nTotal revenue: ${total:,.2f}")
print("\n=== AI ANALYSIS ===\n")
print(message.content[0].text)

claude-3-5-haiku is Claude's fastest and cheapest model — perfect for pipelines that run frequently. For deeper analysis, swap in claude-3-5-sonnet-20241022.

Wrap it in a function and schedule it

Production pipelines don't run once. They run on a schedule — daily, weekly, or triggered by an event. Here's a clean version you can extend:

pipeline_v2.py
PYTHON
def run_sales_report(db_path: str, query: str, prompt_template: str) -> str:
    """Query a SQLite DB, format results, get AI analysis. Returns the analysis."""
    import sqlite3
    import pandas as pd
    import anthropic

    # Query
    conn = sqlite3.connect(db_path)
    df = pd.read_sql_query(query, conn)
    conn.close()

    # Analyze
    client = anthropic.Anthropic()
    prompt = prompt_template.format(data=df.to_string(index=False))
    msg = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

# Use it
if __name__ == "__main__":
    analysis = run_sales_report(
        db_path="sales.db",
        query=(
            "SELECT name, SUM(amount) AS total "
            "FROM orders o JOIN customers c ON o.customer_id = c.id "
            "GROUP BY name ORDER BY total DESC"
        ),
        prompt_template="Analyze this sales data in 3 sentences:\n\n{data}",
    )
    print(analysis)

To schedule this, use cron on Mac/Linux or Windows Task Scheduler. Or deploy it as a GitHub Action that runs on a schedule.
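As an illustrative sketch, a crontab entry that runs the report every weekday morning might look like this (the project path and schedule are placeholders, not from the lesson):

```shell
# Edit your crontab with `crontab -e`, then add a line like:
# fields: minute  hour  day-of-month  month  day-of-week  command
0 8 * * 1-5  cd /path/to/project && python3 pipeline_v2.py >> report.log 2>&1
```

Note that cron jobs don't load your shell profile, so ANTHROPIC_API_KEY won't be inherited from your terminal — set it inside the crontab or within the script's environment.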

Course Complete

You finished SQL for AI and Data

You went from your first SELECT to building an AI-powered data pipeline. That's 5 days of real, usable SQL skill. Where you go next is up to you.

Join the Live Bootcamp → Explore More Courses

Day 5 Checkpoint

Before moving on, make sure you can answer these without looking:

  • What is the core concept introduced in this lesson, and why does it matter?
  • What problem does an automated pipeline solve that running each step by hand cannot?
  • Can you trace through the main code example in this lesson and explain each step?
  • What are the most common mistakes made when first learning this concept?
  • How would you explain today’s topic to a colleague who has never seen it before?
Return to SQL + AI Overview