Package your trained model into a FastAPI service, containerize it with Docker, and deploy it so anyone can call it over HTTP.
A production-ready ML API that accepts JSON input, runs it through your sklearn pipeline, and returns predictions with confidence scores — deployed and accessible via a public URL.
A model sitting on your laptop is worthless. Wrap it in a REST API and it becomes a service that your frontend, other APIs, and automation tools can all use.
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np
from typing import List

app = FastAPI(title="Cancer Classifier API")

# Load once on startup
pipeline = joblib.load("model_pipeline.pkl")
model = pipeline["model"]
scaler = pipeline["scaler"]

class PredictRequest(BaseModel):
    features: List[float]  # 30 feature values

class PredictResponse(BaseModel):
    prediction: int  # 0=benign, 1=malignant
    label: str
    confidence: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest):
    if len(req.features) != 30:
        raise HTTPException(400, "Expected 30 features")
    X = np.array(req.features).reshape(1, -1)
    X_scaled = scaler.transform(X)
    pred = int(model.predict(X_scaled)[0])
    prob = float(model.predict_proba(X_scaled)[0, pred])
    return PredictResponse(
        prediction=pred,
        label="Malignant" if pred == 1 else "Benign",
        confidence=prob,
    )
```
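The service expects `model_pipeline.pkl` to be a dict with `model`, `scaler`, and `features` keys. A minimal sketch of the training-side script that could produce it, assuming scikit-learn's built-in breast cancer dataset (swap in your own data and hyperparameters):

```python
# Sketch: produce model_pipeline.pkl in the shape the API expects.
# Assumes sklearn's breast cancer dataset (30 features, binary target).
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target  # shape (569, 30)

scaler = StandardScaler().fit(X)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X), y)

# Bundle model, scaler, and feature names in one artifact
joblib.dump(
    {"model": model, "scaler": scaler, "features": list(data.feature_names)},
    "model_pipeline.pkl",
)
```

Bundling the scaler with the model matters: if the API scaled inputs with different statistics than training used, predictions would silently degrade.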
Production APIs need observability endpoints. A health check lets load balancers verify the service is running. A model info endpoint documents what the API expects.
```python
import datetime

@app.get("/")
def health():
    return {
        "status": "ok",
        "model": "RandomForest v1.0",
        "timestamp": datetime.datetime.now().isoformat(),
    }

@app.get("/model-info")
def model_info():
    return {
        "algorithm": "Random Forest Classifier",
        "n_estimators": model.n_estimators,
        "n_features": model.n_features_in_,
        "classes": ["Benign (0)", "Malignant (1)"],
        "training_features": pipeline["features"],
    }

# Run:  uvicorn app:app --reload
# Test: curl -X POST http://localhost:8000/predict \
#         -H "Content-Type: application/json" \
#         -d '{"features": [17.99,10.38,122.8,1001,0.1184,...]}'
```
Docker packages your app and all its dependencies into a container that runs identically everywhere. This is the standard for deploying ML models.
```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy app and model
COPY app.py model_pipeline.pkl ./

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
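The Dockerfile copies a `requirements.txt` before the app code so dependency installation stays cached across rebuilds. A plausible minimal one for this service (left unpinned here for brevity; pin exact versions for reproducible builds):

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
numpy
```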
```bash
# Build the image
docker build -t ml-api .

# Run it locally
docker run -p 8000:8000 ml-api

# Test it
curl http://localhost:8000/
```
Railway deploys Docker containers with one command. Free tier is sufficient for demos and prototypes. You get a public URL immediately.
```bash
# Install Railway CLI
brew install railway

# Login and init
railway login
railway init

# Deploy (Railway detects Dockerfile automatically)
railway up
# Deploying... done!
# https://ml-api-production.up.railway.app

# Test the public URL
curl https://your-url.up.railway.app/
# {"status":"ok","model":"RandomForest v1.0",...}
```
Model file size: If your .pkl file is large (over 100MB), store it in cloud storage (S3, R2) and download it on startup instead of baking it into the Docker image. Use the huggingface_hub library for large model weights.
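One way to sketch that download-on-startup pattern, with the fetch step injected as a callable so the same code works with any backend (the `download` hook here is a hypothetical parameter — plug in a wrapper around boto3's `download_file` or `huggingface_hub.hf_hub_download`):

```python
import os

MODEL_PATH = "model_pipeline.pkl"

def ensure_model(path: str = MODEL_PATH, download=None) -> str:
    """Return the local model path, fetching it first if it isn't on disk.

    `download` is any callable taking the target path (e.g. wrapping an
    S3 or Hugging Face Hub download), so the Docker image itself can ship
    without the model weights baked in.
    """
    if not os.path.exists(path):
        if download is None:
            raise FileNotFoundError(f"{path} is missing and no downloader was provided")
        download(path)
    return path

# At startup, replace the plain joblib.load with something like:
#   pipeline = joblib.load(ensure_model(download=fetch_from_bucket))
# where fetch_from_bucket is your storage-specific download function.
```

Keeping the fetch logic behind a single function also makes it easy to test without touching real cloud storage.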
Before moving on, make sure you can answer these without looking: