
Deploying a PyTorch model on Vertex AI

Deploying a model on Vertex AI involves several steps. Here’s a breakdown:

1. Prerequisites:

  • Trained Model: You have a trained PyTorch model (house_price_model.pth).
  • Preprocessor: You’ve saved the preprocessor (e.g., as a pickle file) used to transform your data.
  • Google Cloud Project: You have a Google Cloud Project.
  • Vertex AI Enabled: The Vertex AI API is enabled in your project.
  • Google Cloud Storage (GCS) Bucket: You have a GCS bucket to store your model artifacts and serving code.
  • Serving Container: A Docker container that serves your model.

2. Steps

Here’s a conceptual outline with code snippets using the Vertex AI SDK:

2.1 Upload Model Artifacts

First, upload your trained model (house_price_model.pth) and preprocessor to your GCS bucket.

from google.cloud import storage

# Configuration
PROJECT_ID = "your-project-id"  # Replace with your GCP project ID
BUCKET_NAME = "your-bucket-name"  # Replace with your GCS bucket name
REGION = "us-central1"  # Or your desired region
MODEL_DIR = "house_price_model"  # Directory in GCS to store model artifacts

# Create a GCS client
storage_client = storage.Client(project=PROJECT_ID)
bucket = storage_client.bucket(BUCKET_NAME)

# Upload the model. GCS object names always use "/" separators, so build them
# with f-strings rather than os.path.join (which uses "\" on Windows).
model_blob = bucket.blob(f"{MODEL_DIR}/house_price_model.pth")
model_blob.upload_from_filename("house_price_model.pth")  # Local path to your model

# Upload the preprocessor
preprocessor_blob = bucket.blob(f"{MODEL_DIR}/preprocessor.pkl")
preprocessor_blob.upload_from_filename("preprocessor.pkl")  # Local path to your preprocessor

print(f"Model and preprocessor uploaded to gs://{BUCKET_NAME}/{MODEL_DIR}/")

2.2 Create a Serving Container

Since you’re using PyTorch, you’ll need a custom serving container. This container will:

  • Have the necessary PyTorch dependencies.
  • Load your model and preprocessor.
  • Define a prediction function that:
    • Receives the input data.
    • Preprocesses the data using the loaded preprocessor.
    • Passes the preprocessed data to your PyTorch model.
    • Returns the prediction.
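
The flow in the list above can be sketched without any framework. Vertex AI sends prediction requests to a custom container as a JSON body of the form {"instances": [...]} and expects {"predictions": [...]} back. In the minimal sketch below, fake_preprocess and fake_model are hypothetical stand-ins for the real preprocessor and PyTorch model, and the two features and their weights are purely illustrative.

```python
import json


def fake_preprocess(instance):
    """Hypothetical stand-in for preprocessor.transform: dict -> feature list."""
    return [float(instance["Size_LivingArea_SqFt"]), float(instance["Bedrooms"])]


def fake_model(features):
    """Hypothetical stand-in for the PyTorch model: feature list -> price."""
    return 100.0 * features[0] + 5000.0 * features[1]


def handle_request(body):
    """Receive -> preprocess -> predict -> return, using Vertex AI's JSON envelope."""
    instances = json.loads(body)["instances"]
    predictions = [fake_model(fake_preprocess(inst)) for inst in instances]
    return json.dumps({"predictions": predictions})


request_body = json.dumps(
    {"instances": [{"Size_LivingArea_SqFt": 2000, "Bedrooms": 3}]}
)
print(handle_request(request_body))
```

The real serving script has to honor the same envelope: read the list under "instances" and return a list under "predictions" with one entry per instance.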

Here’s a Dockerfile example:

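# Use a PyTorch base image
FROM pytorch/pytorch:1.10.0-cuda11.3-cudnn8-runtime

# Install other dependencies (the serving script imports all of these;
# joblib is installed as a dependency of scikit-learn)
RUN pip install scikit-learn pandas flask gunicorn google-cloud-storage

# Copy the serving script (assumes app.py lives in a local model/ directory)
COPY model /model
WORKDIR /model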

# Expose the serving port
EXPOSE 8080

# Command to start the serving server (e.g., using gunicorn)
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "app:app", "--workers", "1", "--threads", "1"]

Here’s an example app.py (Flask application) that serves your model:

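from flask import Flask, request, jsonify
import torch
import joblib  # For loading the preprocessor
import numpy as np
import pandas as pd
import logging
from google.cloud import storage

# Log at INFO level so the logging.info calls below are actually emitted
logging.basicConfig(level=logging.INFO)

app = Flask(__name__)

# GCS configuration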
PROJECT_ID = "your-project-id"  # Replace with your GCP project ID
BUCKET_NAME = "your-bucket-name"  # Replace with your GCS bucket name
MODEL_DIR = "house_price_model"

def download_from_gcs(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)

    blob.download_to_filename(destination_file_name)

    print(f"Blob {source_blob_name} downloaded to {destination_file_name}.")


# Download model and preprocessor from GCS
download_from_gcs(BUCKET_NAME, f"{MODEL_DIR}/house_price_model.pth", "house_price_model.pth")
download_from_gcs(BUCKET_NAME, f"{MODEL_DIR}/preprocessor.pkl", "preprocessor.pkl")
# Load model and preprocessor
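try:
    # torch.load on a fully pickled model needs the model's class to be importable here;
    # map_location="cpu" lets a model saved on a GPU load on a CPU-only machine
    model = torch.load("house_price_model.pth", map_location="cpu")
    model.eval()  # Set the model to inference mode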
    preprocessor = joblib.load("preprocessor.pkl")
    logging.info("Model and preprocessor loaded successfully")
except Exception as e:
    logging.error(f"Error loading model or preprocessor: {e}")
    raise

def preprocess_input(data):
    """Preprocesses the input data using the loaded preprocessor.

    Args:
        data: A JSON object containing the input data.

    Returns:
        A NumPy array of the preprocessed data.
    """
    try:
        # Convert the JSON data to a pandas DataFrame
        input_df = pd.DataFrame([data])

        # Preprocess the input DataFrame
        processed_data = preprocessor.transform(input_df)

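        # Some transformers (e.g., one-hot encoders) return a SciPy sparse matrix;
        # convert the result to a dense NumPy array
        if hasattr(processed_data, "toarray"):
            processed_data = processed_data.toarray()
        return np.asarray(processed_data)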
    except Exception as e:
        logging.error(f"Error during preprocessing: {e}")
        raise

@app.route("/health", methods=["GET"])
def health():
    """Health check route; Vertex AI probes it to decide whether the container is ready."""
    return jsonify({"status": "ok"}), 200

@app.route("/predict", methods=["POST"])
def predict():
    """Endpoint for making predictions."""
    try:
        # Vertex AI wraps the inputs as {"instances": [...]}
        payload = request.get_json(force=True)
        instances = payload["instances"]

        # Log the request data
        logging.info(f"Received {len(instances)} instance(s)")

        # Preprocess each instance and stack the rows into one batch
        input_data = np.vstack([preprocess_input(instance) for instance in instances])

        # Convert the NumPy array to a PyTorch tensor
        input_tensor = torch.tensor(input_data, dtype=torch.float32)

        # Make the prediction
        with torch.no_grad():
            prediction = model(input_tensor)

        # Vertex AI expects the response as {"predictions": [...]}
        output = prediction.numpy().tolist()
        logging.info(f"Prediction: {output}")
        return jsonify({"predictions": output})
    except Exception as e:
        error_message = f"Error: {e}"
        logging.error(error_message)
        return jsonify({"error": error_message}), 500

if __name__ == "__main__":
    # Local testing only; inside the container, gunicorn serves the app
    app.run(host="0.0.0.0", port=8080, debug=False)

Build and push the container to Google Container Registry (GCR) or Artifact Registry:

docker build -t gcr.io/your-project-id/house-price-prediction:v1 .  # Build the image
docker push gcr.io/your-project-id/house-price-prediction:v1  # Push the image
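
If you push to Artifact Registry instead of GCR, the image name follows the pattern REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. A small helper like the one sketched below can keep that name consistent between the docker commands and the SDK calls. The helper name and the repository name "ml-serving" are hypothetical placeholders, and the repository is assumed to exist already.

```python
def artifact_registry_image_uri(region, project_id, repository, image, tag):
    """Build a Docker image URI in the Artifact Registry naming pattern."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


image_uri = artifact_registry_image_uri(
    region="us-central1",
    project_id="your-project-id",
    repository="ml-serving",  # Hypothetical repository name
    image="house-price-prediction",
    tag="v1",
)
print(image_uri)
```

The resulting string can be used both as the -t argument to docker build and as the serving container image URI when registering the model.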

2.3 Create a Vertex AI Model Resource

from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

# GCR image URI
serving_container_image_uri = "gcr.io/your-project-id/house-price-prediction:v1"  # Replace

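model = aiplatform.Model.upload(
    display_name="house-price-prediction-model",
    artifact_uri=f"gs://{BUCKET_NAME}/{MODEL_DIR}",  # GCS path to model artifacts
    serving_container_image_uri=serving_container_image_uri,
    serving_container_predict_route="/predict",  # Matches the Flask route in app.py
    serving_container_health_route="/health",  # The container must answer GET requests here with 200
    serving_container_ports=[8080],  # Matches the port gunicorn binds to
)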

print(f"Model resource name: {model.resource_name}")

2.4 Create a Vertex AI Endpoint and Deploy the Model

endpoint = aiplatform.Endpoint.create(
    display_name="house-price-prediction-endpoint",
    location=REGION,
)

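# Endpoint.deploy returns None; the deployed model is attached to the endpoint
endpoint.deploy(
    model=model,
    traffic_split={"0": 100},  # "0" refers to the model being deployed in this call
    deployed_model_display_name="house-price-prediction-deployed-model",
    machine_type="n1-standard-4",  # Or your desired machine type
)

print(f"Endpoint resource name: {endpoint.resource_name}")
for deployed_model in endpoint.list_models():
    print(f"Deployed model: {deployed_model.id}")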
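
The traffic_split argument maps deployed model IDs to percentages of traffic. The key "0" is a placeholder for the model being deployed in the current call, and the percentages must add up to 100. Below is a minimal sketch of a check you could run before calling deploy; the helper validate_traffic_split and the ID "1234567890" are hypothetical and not part of the Vertex AI SDK.

```python
def validate_traffic_split(traffic_split):
    """Check that a split uses integer percentages in [0, 100] that sum to 100."""
    for model_id, percent in traffic_split.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"Invalid percentage for {model_id!r}: {percent!r}")
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"Traffic percentages sum to {total}, expected 100")
    return traffic_split


# First deployment: all traffic goes to the new model
validate_traffic_split({"0": 100})

# Gradual rollout: 20% to the new model, 80% to an existing deployed model
validate_traffic_split({"0": 20, "1234567890": 80})
```

Splitting traffic this way lets you expose a new model version to a fraction of requests before moving everything over.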

3. Make Predictions

Now you can send requests to your endpoint:

import json

# Sample data for a single house prediction
sample_data = {
    "Size_LivingArea_SqFt": 2000,
    "Size_Lot_SqFt": 8000,
    "Size_TotalArea_SqFt": 2800,
    "Rooms_Total": 7,
    "Bedrooms": 3,
    "Bathrooms_Full": 2,
    "Bathrooms_Half": 1,
    "Basement_Area_SqFt": 800,
    "Basement_Finished": 1,
    "Garage_Cars": 2,
    "Fireplaces": 1,
    "Porch_Area_SqFt": 100,
    "Year_Built": 2000,
    "Year_Remodeled": 2010,
    "Condition_Overall": 7,
    "Quality_Overall": 7,
    "Building_Type": "House",
    "House_Style": "Ranch",
    "Foundation_Type": "Slab",
    "Roof_Material": "Composition Shingle",
    "Exterior_Material": "Brick",
    "Heating_Type": "Forced Air",
    "Cooling_Type": "Central AC",
    "Kitchen_Quality": "Good",
    "Bathroom_Quality": "Good",
    "Fireplace_Quality": "Average",
    "Basement_Quality": "Average",
    "Stories": 1,
    "Floor_Material": "Hardwood",
    "Neighborhood": "Bentonville Central",
    "Proximity_Schools_Miles": 0.5,
    "Proximity_Parks_Miles": 1.2,
    "Proximity_PublicTransport_Miles": 0.8,
    "Proximity_Shopping_Miles": 1.5,
    "Proximity_Hospitals_Miles": 2.0,
    "Safety_CrimeRate_Index": 65,
    "Environmental_NoiseLevel_dB": 45,
    "Environmental_AirQuality_Index": 35,
    "Flood_Zone": "No",
    "View": "None",
    "Time_of_Sale": "2024-08",
    "Interest_Rate": 6.2,
    "Inflation_Rate": 3.5,
    "Unemployment_Rate": 4.2,
    "Housing_Inventory": 0.05,
    "Economic_Growth_Rate": 2.5,
}


# Get the endpoint
endpoint = aiplatform.Endpoint(endpoint_name=endpoint.resource_name)

# Make the prediction
response = endpoint.predict(instances=[sample_data])
predictions = response.predictions

print(f"Prediction: {predictions}")
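
Online prediction requests are limited in size (the Vertex AI documentation gives 1.5 MB or smaller), so it can help to measure a batch before sending it. The sketch below estimates the size of the JSON body that carries the instances. The helper name, the example batch, and the exact byte value used for the limit are assumptions for illustration.

```python
import json

MAX_REQUEST_BYTES = 1_500_000  # Assumed byte value for the documented 1.5 MB limit


def request_size_bytes(instances):
    """Size in bytes of the JSON body {"instances": [...]} for a prediction request."""
    return len(json.dumps({"instances": instances}).encode("utf-8"))


# Illustrative batch of ten small instances
batch = [{"Bedrooms": 3, "Year_Built": 2000}] * 10
size = request_size_bytes(batch)
print(f"{size} bytes, within limit: {size <= MAX_REQUEST_BYTES}")
```

If a batch exceeds the limit, split the instances into several smaller predict calls.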
