Tutorial

Deploying ML Models with FastAPI and Docker

A practical guide to wrapping a trained machine learning model in a FastAPI REST API and containerizing it with Docker for reproducible, production-ready deployment.

Mohammed Gamal
· 2026-03-16 · 7 min read · Beginner
MLOps FastAPI Docker Python Deployment

Overview

You've trained a great model — now what? This tutorial shows you how to serve it as a REST API using FastAPI and package everything in a Docker container for easy deployment.


Prerequisites

  • A trained model (we'll use a scikit-learn classifier as an example)
  • Python 3.10+
  • Docker installed
Install the Python dependencies:

pip install fastapi uvicorn scikit-learn joblib

Step 1: Save Your Model

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier().fit(X, y)
joblib.dump(model, 'model.joblib')
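Before building the API around it, it's worth a quick sanity check that the saved artifact round-trips: reload it and confirm it predicts the same as the in-memory model. A minimal sketch (the `random_state` is added here only to make the check deterministic):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)
joblib.dump(model, 'model.joblib')

# Reload the artifact and confirm it behaves identically to the original
restored = joblib.load('model.joblib')
assert (restored.predict(X) == model.predict(X)).all()
```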

Step 2: Create the FastAPI App

# app.py
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI(title='ML Model API')
model = joblib.load('model.joblib')

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    prediction: int
    confidence: float

@app.post('/predict', response_model=PredictResponse)
def predict(req: PredictRequest):
    X = np.array(req.features).reshape(1, -1)
    pred = model.predict(X)[0]
    proba = model.predict_proba(X).max()
    return PredictResponse(prediction=int(pred), confidence=float(proba))

@app.get('/health')
def health():
    return {'status': 'ok'}

Run locally (the --reload flag auto-restarts on code changes and is for development only):

uvicorn app:app --reload

FastAPI also serves interactive API documentation at http://localhost:8000/docs, where you can try the endpoints from the browser.

Step 3: Write the Dockerfile

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model.joblib .
COPY app.py .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Step 4: Build and Run

docker build -t ml-api .
docker run -p 8000:8000 ml-api

Test with curl:

curl -X POST http://localhost:8000/predict \
  -H 'Content-Type: application/json' \
  -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Step 5: Production Considerations

  • Add input validation and error handling
  • Use multi-stage Docker builds to reduce image size
  • Add logging and monitoring (Prometheus metrics)
  • Set up CI/CD to auto-build and deploy on push
  • Use GPU-enabled base images for deep learning models
  • Consider model versioning with a registry

Next Steps

  • Add authentication with API keys
  • Deploy to AWS ECS, GCP Cloud Run, or Kubernetes
  • Add batch prediction endpoints
  • Implement A/B testing between model versions
