Package your trained model into a FastAPI service, containerize it with Docker, and deploy it so anyone can call it over HTTP.
A production-ready ML API that accepts JSON input, runs it through your sklearn pipeline, and returns predictions with confidence scores — deployed and accessible via a public URL.
A model sitting on your laptop is worthless. Wrap it in a REST API and it becomes a service that your frontend, other APIs, and automation tools can all use.
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np
from typing import List

app = FastAPI(title="Cancer Classifier API")

# Load once on startup
pipeline = joblib.load("model_pipeline.pkl")
model = pipeline["model"]
scaler = pipeline["scaler"]

class PredictRequest(BaseModel):
    features: List[float]  # 30 feature values

class PredictResponse(BaseModel):
    prediction: int  # 0=benign, 1=malignant
    label: str
    confidence: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest):
    if len(req.features) != 30:
        raise HTTPException(400, "Expected 30 features")
    X = np.array(req.features).reshape(1, -1)
    X_scaled = scaler.transform(X)
    pred = int(model.predict(X_scaled)[0])
    prob = float(model.predict_proba(X_scaled)[0, pred])
    return PredictResponse(
        prediction=pred,
        label="Malignant" if pred == 1 else "Benign",
        confidence=prob,
    )
```
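The service expects `model_pipeline.pkl` to be a dict with `model`, `scaler`, and `features` keys. A minimal sketch of the training-side script that could produce it, assuming scikit-learn's built-in breast cancer dataset (swap in your own data and hyperparameters):

```python
# Sketch: produce model_pipeline.pkl in the shape the API expects.
# Assumes sklearn's breast cancer dataset (30 features, binary target).
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target  # shape (569, 30)

scaler = StandardScaler().fit(X)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X), y)

# Bundle model, scaler, and feature names in one artifact
joblib.dump(
    {"model": model, "scaler": scaler, "features": list(data.feature_names)},
    "model_pipeline.pkl",
)
```

Bundling the scaler with the model matters: if the API scaled inputs with different statistics than training used, predictions would silently degrade.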
Production APIs need observability endpoints. A health check lets load balancers verify the service is running. A model info endpoint documents what the API expects.
```python
import datetime

@app.get("/")
def health():
    return {
        "status": "ok",
        "model": "RandomForest v1.0",
        "timestamp": datetime.datetime.now().isoformat(),
    }

@app.get("/model-info")
def model_info():
    return {
        "algorithm": "Random Forest Classifier",
        "n_estimators": model.n_estimators,
        "n_features": model.n_features_in_,
        "classes": ["Benign (0)", "Malignant (1)"],
        "training_features": pipeline["features"],
    }

# Run:  uvicorn app:app --reload
# Test: curl -X POST http://localhost:8000/predict \
#         -H "Content-Type: application/json" \
#         -d '{"features": [17.99,10.38,122.8,1001,0.1184,...]}'
```
Docker packages your app and all its dependencies into a container that runs identically everywhere. This is the standard for deploying ML models.
```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy app and model
COPY app.py model_pipeline.pkl ./

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
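The Dockerfile copies a `requirements.txt` before the app code so dependency installation stays cached across rebuilds. A plausible minimal one for this service (left unpinned here for brevity; pin exact versions for reproducible builds):

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
numpy
```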
```bash
# Build the image
docker build -t ml-api .

# Run it locally
docker run -p 8000:8000 ml-api

# Test it
curl http://localhost:8000/
```
Railway deploys Docker containers with one command. Free tier is sufficient for demos and prototypes. You get a public URL immediately.
```bash
# Install Railway CLI
brew install railway

# Login and init
railway login
railway init

# Deploy (Railway detects Dockerfile automatically)
railway up
# Deploying... done!
# https://ml-api-production.up.railway.app

# Test the public URL
curl https://your-url.up.railway.app/
# {"status":"ok","model":"RandomForest v1.0",...}
```
Model file size: If your .pkl file is large (over 100MB), store it in cloud storage (S3, R2) and download it on startup instead of baking it into the Docker image. Use the huggingface_hub library for large model weights.
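One way to sketch that download-on-startup pattern, with the fetch step injected as a callable so the same code works with any backend (the `download` hook here is a hypothetical parameter — plug in a wrapper around boto3's `download_file` or `huggingface_hub.hf_hub_download`):

```python
import os

MODEL_PATH = "model_pipeline.pkl"

def ensure_model(path: str = MODEL_PATH, download=None) -> str:
    """Return the local model path, fetching it first if it isn't on disk.

    `download` is any callable taking the target path (e.g. wrapping an
    S3 or Hugging Face Hub download), so the Docker image itself can ship
    without the model weights baked in.
    """
    if not os.path.exists(path):
        if download is None:
            raise FileNotFoundError(f"{path} is missing and no downloader was provided")
        download(path)
    return path

# At startup, replace the plain joblib.load with something like:
#   pipeline = joblib.load(ensure_model(download=fetch_from_bucket))
# where fetch_from_bucket is your storage-specific download function.
```

Keeping the fetch logic behind a single function also makes it easy to test without touching real cloud storage.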
Before moving on, make sure you can answer these without looking: