Deploying a PyTorch model on Vertex AI

Deploying a PyTorch model on Vertex AI involves several steps. Here’s a breakdown:

1. Prerequisites:

  • Trained Model: You have a trained PyTorch model (house_price_model.pth).
  • Preprocessor: You’ve saved the preprocessor (e.g., as a pickle file) used to transform your data.
  • Google Cloud Project: You have a Google Cloud Project.
  • Vertex AI API Enabled: The Vertex AI API is enabled in your project.
  • Google Cloud Storage (GCS) Bucket: You have a GCS bucket to store your model artifacts and serving code.
  • Serving Container: A Docker container that serves your model (built in step 2.2).

2. Steps

Here’s a conceptual outline with code snippets using the Vertex AI SDK:

2.1 Upload Model Artifacts

First, upload your trained model (house_price_model.pth) and preprocessor to your GCS bucket.

from google.cloud import storage

# Configuration
PROJECT_ID = "your-project-id"  # Replace with your GCP project ID
BUCKET_NAME = "your-bucket-name"  # Replace with your GCS bucket name
REGION = "us-central1"  # Or your desired region
MODEL_DIR = "house_price_model"  # Directory in GCS to store model artifacts

# Create a GCS client
storage_client = storage.Client(project=PROJECT_ID)
bucket = storage_client.bucket(BUCKET_NAME)

# Upload the model. GCS object names always use forward slashes, so build
# them with f-strings rather than os.path.join (which uses "\" on Windows).
model_blob = bucket.blob(f"{MODEL_DIR}/house_price_model.pth")
model_blob.upload_from_filename("house_price_model.pth")  # Local path to your model

# Upload the preprocessor
preprocessor_blob = bucket.blob(f"{MODEL_DIR}/preprocessor.pkl")
preprocessor_blob.upload_from_filename("preprocessor.pkl")  # Local path to your preprocessor

print(f"Model and preprocessor uploaded to gs://{BUCKET_NAME}/{MODEL_DIR}/")

2.2 Create a Serving Container

Since you’re using PyTorch, you’ll need a custom serving container. This container will:

  • Have the necessary PyTorch dependencies.
  • Load your model and preprocessor.
  • Define a prediction function that:
    • Receives the input data.
    • Preprocesses the data using the loaded preprocessor.
    • Passes the preprocessed data to your PyTorch model.
    • Returns the prediction.
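
The flow in the list above can be sketched without any framework. This is a minimal illustration, not the serving code itself: scale_features and linear_model are hypothetical stand-ins for the real preprocessor and PyTorch model, and the two feature names are borrowed from the sample request in section 3.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]


def scale_features(raw: Features) -> List[float]:
    """Hypothetical preprocessor: order the features and rescale them."""
    return [raw["Size_LivingArea_SqFt"] / 1000.0, float(raw["Bedrooms"])]


def linear_model(vector: List[float]) -> float:
    """Hypothetical model: a fixed linear function standing in for the network."""
    weights = [100_000.0, 25_000.0]
    return sum(w * x for w, x in zip(weights, vector))


def predict_one(
    raw: Features,
    preprocess: Callable[[Features], List[float]],
    model: Callable[[List[float]], float],
) -> float:
    """Receive the input, preprocess it, run the model, return the prediction."""
    vector = preprocess(raw)
    return model(vector)


price = predict_one(
    {"Size_LivingArea_SqFt": 2000, "Bedrooms": 3}, scale_features, linear_model
)
print(price)
```

Passing the preprocessor and model in as arguments keeps the flow easy to test with stand-ins before the real artifacts are loaded.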

Here’s a Dockerfile example:

# Use a PyTorch base image
FROM pytorch/pytorch:1.10.0-cuda11.3-cudnn8-runtime

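# Install the serving dependencies used by app.py.
# Use the same scikit-learn version that was used to save the preprocessor.
RUN pip install flask gunicorn pandas scikit-learn joblib google-cloud-storage

# Copy the serving script (the local ./model directory must contain app.py)
COPY model /model
WORKDIR /model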

# Expose the serving port
EXPOSE 8080

# Command to start the serving server (e.g., using gunicorn)
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "app:app", "--workers", "1", "--threads", "1"]

Here’s an example app.py (Flask application) that serves your model:

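from flask import Flask, request, jsonify
import torch
import joblib  # For loading the preprocessor
import numpy as np
import pandas as pd  # Used by preprocess_input
import logging
from google.cloud import storage

logging.basicConfig(level=logging.INFO)  # Make logging.info messages visible

app = Flask(__name__)
# GCS configuration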
PROJECT_ID = "your-project-id"  # Replace with your GCP project ID
BUCKET_NAME = "your-bucket-name"  # Replace with your GCS bucket name
MODEL_DIR = "house_price_model"

def download_from_gcs(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)

    blob.download_to_filename(destination_file_name)

    print(f"Blob {source_blob_name} downloaded to {destination_file_name}.")


# Download model and preprocessor from GCS
download_from_gcs(BUCKET_NAME, f"{MODEL_DIR}/house_price_model.pth", "house_price_model.pth")
download_from_gcs(BUCKET_NAME, f"{MODEL_DIR}/preprocessor.pkl", "preprocessor.pkl")
# Load model and preprocessor
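try:
    # torch.load unpickles the whole model object, so the model's class
    # definition must be importable from this script.
    model = torch.load("house_price_model.pth", map_location="cpu")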
    model.eval()  # Set the model to inference mode
    preprocessor = joblib.load("preprocessor.pkl")
    logging.info("Model and preprocessor loaded successfully")
except Exception as e:
    logging.error(f"Error loading model or preprocessor: {e}")
    raise

def preprocess_input(data):
    """Preprocesses the input data using the loaded preprocessor.

    Args:
        data: A JSON object containing the input data.

    Returns:
        A NumPy array of the preprocessed data.
    """
    try:
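        # Convert the JSON data to a single-row pandas DataFrame
        input_df = pd.DataFrame([data])

        # Preprocess the input DataFrame
        processed_data = preprocessor.transform(input_df)

        # Some transformers (e.g. OneHotEncoder) return a SciPy sparse matrix;
        # convert it to a dense NumPy array so torch.tensor can consume it
        if hasattr(processed_data, "toarray"):
            processed_data = processed_data.toarray()
        return processed_data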
    except Exception as e:
        logging.error(f"Error during preprocessing: {e}")
        raise

@app.route("/health", methods=["GET"])
def health():
    """Health check route; Vertex AI requires one that returns HTTP 200."""
    return jsonify({"status": "ok"}), 200

@app.route("/predict", methods=["POST"])
def predict():
    """Endpoint for making predictions."""
    try:
        # Vertex AI sends the request body as {"instances": [...]}
        payload = request.get_json(force=True)
        instances = payload["instances"]

        # Log the request data
        logging.info(f"Received data: {instances}")

        # Preprocess each instance and stack the rows into one batch
        input_data = np.vstack([preprocess_input(instance) for instance in instances])

        # Convert the NumPy array to a PyTorch tensor
        input_tensor = torch.tensor(input_data, dtype=torch.float32)

        # Make the prediction
        with torch.no_grad():
            prediction = model(input_tensor)

        # Convert the prediction to a Python list
        output = prediction.numpy().tolist()
        logging.info(f"Prediction: {output}")

        # Vertex AI expects the response body as {"predictions": [...]}
        return jsonify({"predictions": output})
    except Exception as e:
        error_message = f"Error: {e}"
        logging.error(error_message)
        return jsonify({"error": error_message}), 500

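if __name__ == "__main__":
    # Local testing only; in the container the app is started by gunicorn.
    # Debug mode is left off because it must never be exposed in production.
    app.run(host="0.0.0.0", port=8080)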
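
Vertex AI forwards online prediction requests to the container as a JSON object with an "instances" list and expects a JSON object with a "predictions" list in return. The sketch below builds such a request body and shows how it could be posted to a locally running container, so the format can be checked before deploying. The URL, the port mapping, and the helper names are assumptions for illustration; post_json is defined but not called because it needs a running server.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_request_body(instances: List[Dict[str, Any]]) -> bytes:
    """Wrap the instances in the JSON envelope used for online prediction."""
    return json.dumps({"instances": instances}).encode("utf-8")


def extract_predictions(response_body: bytes) -> List[Any]:
    """Pull the "predictions" list out of a JSON response body."""
    payload = json.loads(response_body.decode("utf-8"))
    if "predictions" not in payload:
        raise ValueError(f"Response has no 'predictions' key: {payload}")
    return payload["predictions"]


def post_json(url: str, body: bytes) -> bytes:
    """Send a POST request with a JSON body and return the raw response."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


body = build_request_body([{"Bedrooms": 3, "Size_LivingArea_SqFt": 2000}])

# Assumed local address of a container started with `docker run -p 8080:8080 ...`:
# raw = post_json("http://localhost:8080/predict", body)
# print(extract_predictions(raw))
```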

Build and push the container to Google Container Registry (GCR) or Artifact Registry:

docker build -t gcr.io/your-project-id/house-price-prediction:v1 .  # Build the image
docker push gcr.io/your-project-id/house-price-prediction:v1  # Push the image
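
Container Registry has been deprecated in favor of Artifact Registry, whose image names follow the pattern LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. A small helper like the one below can keep the URI consistent between the docker commands and the SDK call. The repository name "ml-serving" is a hypothetical placeholder, and the repository has to exist before an image can be pushed to it.

```python
def artifact_registry_uri(
    project_id: str, location: str, repository: str, image: str, tag: str = "v1"
) -> str:
    """Build an Artifact Registry Docker image URI from its parts."""
    return f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


image_uri = artifact_registry_uri(
    project_id="your-project-id",
    location="us-central1",
    repository="ml-serving",  # Hypothetical repository name
    image="house-price-prediction",
)
print(image_uri)
```

The resulting string can be used as the tag for docker build and docker push, and later as the serving container image URI.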

2.3 Create a Vertex AI Model Resource

from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

# GCR image URI
serving_container_image_uri = "gcr.io/your-project-id/house-price-prediction:v1"  # Replace

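model = aiplatform.Model.upload(
    display_name="house-price-prediction-model",
    artifact_uri=f"gs://{BUCKET_NAME}/{MODEL_DIR}",  # GCS path to model artifacts
    serving_container_image_uri=serving_container_image_uri,
    serving_container_predict_route="/predict",  # Must match the route served by app.py
    serving_container_health_route="/health",  # The container must answer this route with HTTP 200
    serving_container_ports=[8080],  # Port the container listens on
)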

print(f"Model resource name: {model.resource_name}")
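
When Vertex AI starts a custom container it sets several AIP_* environment variables, including AIP_HTTP_PORT, AIP_HEALTH_ROUTE, AIP_PREDICT_ROUTE, and AIP_STORAGE_URI (a Cloud Storage location holding a copy of the model artifacts). The sketch below shows one way the serving code could read them instead of hard-coding values. The ServingConfig name and the fallback defaults are assumptions intended for local runs.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServingConfig:
    port: int
    health_route: str
    predict_route: str
    storage_uri: str


def load_serving_config(env: Mapping[str, str] = os.environ) -> ServingConfig:
    """Read the AIP_* variables, falling back to assumed local defaults."""
    return ServingConfig(
        port=int(env.get("AIP_HTTP_PORT", "8080")),
        health_route=env.get("AIP_HEALTH_ROUTE", "/health"),
        predict_route=env.get("AIP_PREDICT_ROUTE", "/predict"),
        storage_uri=env.get("AIP_STORAGE_URI", ""),
    )


# Example with an explicit mapping instead of the real environment
config = load_serving_config({"AIP_HTTP_PORT": "8080", "AIP_PREDICT_ROUTE": "/predict"})
print(config)
```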

2.4 Create a Vertex AI Endpoint and Deploy the Model

endpoint = aiplatform.Endpoint.create(
    display_name="house-price-prediction-endpoint",
    location=REGION,
)

# Endpoint.deploy() returns None, so inspect the endpoint afterwards
endpoint.deploy(
    model=model,
    traffic_split={"0": 100},
    deployed_model_display_name="house-price-prediction-deployed-model",
    machine_type="n1-standard-4",  # Or your desired machine type
)

print(f"Endpoint resource name: {endpoint.resource_name}")
for deployed_model in endpoint.list_models():
    print(f"Deployed model: {deployed_model.id}")
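
In traffic_split, the key "0" stands for the model being deployed in this call, any other keys are IDs of models already deployed on the endpoint, and the percentages must add up to 100. A minimal check along those lines might look like the following; the helper name and the deployed model ID "1234567890" are made up for illustration.

```python
from typing import Dict


def validate_traffic_split(traffic_split: Dict[str, int]) -> None:
    """Raise ValueError unless every share is in 0..100 and the total is 100."""
    for model_id, percent in traffic_split.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"Share for model {model_id!r} is out of range: {percent}")
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"Traffic shares must sum to 100, got {total}")


# All traffic to the newly deployed model
validate_traffic_split({"0": 100})

# Hypothetical gradual rollout: 20% to the new model, 80% to an existing one
validate_traffic_split({"0": 20, "1234567890": 80})
```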

3. Make Predictions

Now you can send requests to your endpoint:

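from google.cloud import aiplatform

# Not needed if aiplatform was already initialized in this session
aiplatform.init(project=PROJECT_ID, location=REGION)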

# Sample data for a single house prediction
sample_data = {
    "Size_LivingArea_SqFt": 2000,
    "Size_Lot_SqFt": 8000,
    "Size_TotalArea_SqFt": 2800,
    "Rooms_Total": 7,
    "Bedrooms": 3,
    "Bathrooms_Full": 2,
    "Bathrooms_Half": 1,
    "Basement_Area_SqFt": 800,
    "Basement_Finished": 1,
    "Garage_Cars": 2,
    "Fireplaces": 1,
    "Porch_Area_SqFt": 100,
    "Year_Built": 2000,
    "Year_Remodeled": 2010,
    "Condition_Overall": 7,
    "Quality_Overall": 7,
    "Building_Type": "House",
    "House_Style": "Ranch",
    "Foundation_Type": "Slab",
    "Roof_Material": "Composition Shingle",
    "Exterior_Material": "Brick",
    "Heating_Type": "Forced Air",
    "Cooling_Type": "Central AC",
    "Kitchen_Quality": "Good",
    "Bathroom_Quality": "Good",
    "Fireplace_Quality": "Average",
    "Basement_Quality": "Average",
    "Stories": 1,
    "Floor_Material": "Hardwood",
    "Neighborhood": "Bentonville Central",
    "Proximity_Schools_Miles": 0.5,
    "Proximity_Parks_Miles": 1.2,
    "Proximity_PublicTransport_Miles": 0.8,
    "Proximity_Shopping_Miles": 1.5,
    "Proximity_Hospitals_Miles": 2.0,
    "Safety_CrimeRate_Index": 65,
    "Environmental_NoiseLevel_dB": 45,
    "Environmental_AirQuality_Index": 35,
    "Flood_Zone": "No",
    "View": "None",
    "Time_of_Sale": "2024-08",
    "Interest_Rate": 6.2,
    "Inflation_Rate": 3.5,
    "Unemployment_Rate": 4.2,
    "Housing_Inventory": 0.05,
    "Economic_Growth_Rate": 2.5,
}


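# Get the endpoint. In a new session, pass the endpoint's full resource name
# (printed in step 2.4) instead of endpoint.resource_name.
endpoint = aiplatform.Endpoint(endpoint_name=endpoint.resource_name)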

# Make the prediction
response = endpoint.predict(instances=[sample_data])
predictions = response.predictions

print(f"Prediction: {predictions}")
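
response.predictions mirrors whatever list the container returned, one entry per instance. Assuming the model produces a single value per house, so that each entry is either a number or a one-element list, the prices can be flattened as in the sketch below. The helper name and the sample values are illustrative, not real model output.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def flatten_predictions(
    predictions: Sequence[Union[Number, Sequence[Number]]]
) -> List[float]:
    """Turn [[p1], [p2], ...] or [p1, p2, ...] into a flat list of floats."""
    prices = []
    for entry in predictions:
        if isinstance(entry, (int, float)):
            prices.append(float(entry))
        elif len(entry) == 1:
            prices.append(float(entry[0]))
        else:
            raise ValueError(f"Expected one value per instance, got {entry}")
    return prices


# Made-up values in the shape a single-output regression model would return
prices = flatten_predictions([[312500.0], [287000.5]])
print(prices)
```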