
Simplify Model Serving With Custom Prediction Routines On Vertex AI

  • aster.cloud
  • September 2, 2022
  • 6 minute read

The data received at serving time is rarely in the format your model expects. Numerical columns need to be normalized, features created, image bytes decoded, input values validated. Transforming the data can be as important as the prediction itself. That’s why we’re excited to announce custom prediction routines on Vertex AI, which simplify the process of writing pre- and post-processing code.

With custom prediction routines, you can provide your data transformations as Python code, and behind the scenes the Vertex AI SDK will build a custom container that you can test locally and deploy to the cloud.


Understanding custom prediction routines

The Vertex AI pre-built containers handle prediction requests by performing the prediction operation of the machine learning framework. Prior to custom prediction routines, if you wanted to preprocess the input before the prediction was performed, or postprocess the model’s prediction before returning the result, you needed to build a custom container from scratch.

Building a custom serving container requires writing an HTTP server that wraps the trained model, translates HTTP requests into model inputs, and translates model outputs into responses. You can see an example here showing how to build a model server with FastAPI.
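
For reference, below is a minimal sketch of that kind of server. It assumes FastAPI, a model.joblib artifact baked into the container image, and simplified /health and /predict routes (a real Vertex AI custom container reads its port and route configuration from environment variables).

import joblib
import numpy as np
from fastapi import FastAPI, Request

app = FastAPI()

# Assumes the trained model artifact is packaged into the image.
model = joblib.load("model.joblib")

@app.get("/health")
def health():
    # Health checks only need to return a 200 response.
    return {}

@app.post("/predict")
async def predict(request: Request):
    # Translate the HTTP request into model inputs.
    body = await request.json()
    instances = np.asarray(body["instances"])
    # Run the prediction and translate the output into a JSON response.
    predictions = model.predict(instances)
    return {"predictions": predictions.tolist()}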

With custom prediction routines, Vertex AI provides the serving-related components for you, so that you can focus on your model and data transformations.

The predictor

The predictor class is responsible for the ML-related logic in a prediction request: loading the model, getting predictions, and applying custom preprocessing and postprocessing. To write custom prediction logic, you’ll subclass the Vertex AI Predictor interface. In most cases, customizing the predictor is all you’ll need, but check out this notebook if you’d like to see an example of customizing the request handler.

This release of custom prediction routines comes with reusable XGBoost and Sklearn predictors, but if you need to use a different framework you can create your own by subclassing the base predictor.

You can see an example predictor implementation below, specifically the reusable Sklearn predictor. This is all the code you would need to write in order to build this custom model server.

 

import joblib
import numpy as np

from google.cloud.aiplatform.utils import prediction_utils
from google.cloud.aiplatform.prediction.predictor import Predictor


class SklearnPredictor(Predictor):
    """Default Predictor implementation for Sklearn models."""

    def __init__(self):
        return

    def load(self, artifacts_uri: str):
        prediction_utils.download_model_artifacts(artifacts_uri)
        self._model = joblib.load("model.joblib")

    def preprocess(self, prediction_input: dict) -> np.ndarray:
        instances = prediction_input["instances"]
        return np.asarray(instances)

    def predict(self, instances: np.ndarray) -> np.ndarray:
        return self._model.predict(instances)

    def postprocess(self, prediction_results: np.ndarray) -> dict:
        return {"predictions": prediction_results.tolist()}

 


A predictor implements four methods:

  • Load: Loads in the model artifacts, and any optional preprocessing artifacts such as an encoder you saved to a pickle file.
  • Preprocess: Performs the logic to preprocess the input data before the prediction request. By default, the preprocess method receives a dictionary which contains all the data in the request body after it has been deserialized from JSON.
  • Predict: Performs the prediction, which will look something like model.predict(instances) depending on what framework you’re using.
  • Postprocess: Postprocesses the prediction results before returning them to the end user. By default, the output of the postprocess method will be serialized into a JSON object and returned as the response body.

You can customize as many of the above methods as your use case requires. To customize, all you need to do is subclass the predictor and save your new custom predictor to a Python file.
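
As a rough sketch, and assuming you extend the reusable Sklearn predictor described above, that Python file might look like the following. The MyCustomPredictor class name and src_dir directory mirror the names used in the build step later in this post, and the import path shown for the reusable predictor is the one documented for the Vertex AI SDK.

# src_dir/predictor.py
from google.cloud.aiplatform.prediction.sklearn.predictor import SklearnPredictor


class MyCustomPredictor(SklearnPredictor):
    def preprocess(self, prediction_input: dict):
        instances = super().preprocess(prediction_input)
        # Custom preprocessing (validation, scaling, feature creation) goes here.
        return instances

    def postprocess(self, prediction_results) -> dict:
        # Custom postprocessing (label mapping, business logic) goes here.
        return super().postprocess(prediction_results)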

Let’s take a deeper look at how you might customize each one of these methods.

Load

The load method is where you load in any artifacts from Cloud Storage. This includes the model, but can also include custom preprocessors.

For example, let’s say you wrote the following preprocessor to scale numerical features, and stored it as a pickle file called preprocessor.pkl in Cloud Storage.

 

import numpy as np


class MySimpleScaler(object):
    def __init__(self):
        self._means = None
        self._stds = None

    def preprocess(self, data):
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)

        if self._stds is None:  # during training only
            self._stds = np.std(data, axis=0)
            if not self._stds.all():
                raise ValueError("At least one column has standard deviation of 0.")

        return (data - self._means) / self._stds
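
The scaler’s means and standard deviations are fitted the first time preprocess is called, so a hypothetical training workflow might fit it on the training data and then copy the pickle next to the model artifacts in Cloud Storage. The bucket path and train_data below are placeholders:

import pickle

from google.cloud import storage

scaler = MySimpleScaler()
train_data_scaled = scaler.preprocess(train_data)  # first call fits the means/stds

# Serialize the fitted scaler and upload it alongside the model artifacts.
with open("preprocessor.pkl", "wb") as f:
    pickle.dump(scaler, f)

storage.Client().bucket("your-bucket").blob(
    "model_artifacts/preprocessor.pkl"
).upload_from_filename("preprocessor.pkl")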

 

When customizing the predictor, you would write a load method to read the pickle file, similar to the following, where artifacts_uri is the Cloud Storage path to your model and preprocessing artifacts.

 

# Assumes `import pickle` and `from google.cloud import storage` at the top
# of your predictor file.
def load(self, artifacts_uri: str):
    """Loads the model and the preprocessor artifacts."""
    super().load(artifacts_uri)

    # Download preprocessor.pkl from Cloud Storage to the container's filesystem.
    gcs_client = storage.Client()
    with open("preprocessor.pkl", "wb") as preprocessor_f:
        gcs_client.download_blob_to_file(
            f"{artifacts_uri}/preprocessor.pkl", preprocessor_f
        )

    # Deserialize it so preprocess can use it at serving time.
    with open("preprocessor.pkl", "rb") as f:
        preprocessor = pickle.load(f)

    self._preprocessor = preprocessor

 

Preprocess


The preprocess method is where you write the logic to perform any preprocessing needed for your serving data. It can be as simple as just applying the preprocessor you loaded in the load method as shown below:

 

def preprocess(self, prediction_input):
    inputs = super().preprocess(prediction_input)
    return self._preprocessor.preprocess(inputs)

 

Instead of loading in a preprocessor, you might write the preprocessing directly in the preprocess method. For example, you might need to check that your inputs are in the format you expect. Let’s say your model expects the feature at index 3 to be a string in its abbreviated form. You want to check that at serving time the value for that feature is abbreviated.

 

def preprocess(self, prediction_input):
    inputs = super().preprocess(prediction_input)
    clarity_dict = {"Flawless": "FL",
                    "Internally Flawless": "IF",
                    "Very Very Slightly Included": "VVS1",
                    "Very Slightly Included": "VS2",
                    "Slightly Included": "S12",
                    "Included": "I3"}
    for sample in inputs:
        if sample[3] not in clarity_dict.values():
            sample[3] = clarity_dict[sample[3]]
    return inputs

 

There are numerous other ways you could customize the preprocessing logic. You might need to tokenize text for a language model, generate new features, or load data from an external source.
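
As one hypothetical example, a preprocess override that generates a new feature might append a derived column computed from the raw inputs; the column indices here are placeholders:

def preprocess(self, prediction_input):
    inputs = super().preprocess(prediction_input)
    # Hypothetical derived feature: a volume column computed from length,
    # width and depth values, assumed here to sit at indices 6, 7 and 8.
    volume = inputs[:, 6] * inputs[:, 7] * inputs[:, 8]
    return np.concatenate([inputs, volume.reshape(-1, 1)], axis=1)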

Predict

This method usually just calls model.predict, and generally doesn’t need to be customized unless you’re building your predictor from scratch instead of with a reusable predictor.
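
If you did need to override it, one hypothetical case is returning class probabilities from a scikit-learn classifier instead of hard labels, which pairs naturally with the confidence-threshold postprocessing shown below:

def predict(self, instances: np.ndarray) -> np.ndarray:
    # Assumes the loaded model is a scikit-learn classifier that implements
    # predict_proba; return the probability of the positive class.
    return self._model.predict_proba(instances)[:, 1]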

Postprocess

Sometimes the model prediction is only the first step. After you get a prediction from the model you might need to transform it to make it valuable to the end user. This might be something as simple as converting the numerical class label returned by the model to the string label as shown below.

 

def postprocess(self, prediction_results):
    label_dict = {0: 'rose',
                  1: 'daisy',
                  2: 'dandelion',
                  3: 'tulip',
                  4: 'sunflower'}
    return {"predictions": [label_dict[class_num] for class_num in prediction_results]}

 

Or you could implement additional business logic. For example, you might want to return a prediction only if the model’s confidence is above a certain threshold. If it’s below that threshold, you want the input to be sent to a human instead to double-check.

 

def postprocess(self, prediction_results):
    returned_predictions = []
    for result in prediction_results:
        if result > self._confidence_threshold:
            returned_predictions.append(result)
        else:
            returned_predictions.append("confidence too low for prediction")
    return {"predictions": returned_predictions}

 

Just like with preprocessing, there are numerous ways you can postprocess your data with custom prediction routines. You might need to detokenize text for a language model, convert the model output into a more readable format for the end user, or even call a Vertex AI Matching Engine index endpoint to search for data with a similar embedding.


Local Testing

When you’ve written your predictor, you’ll want to save the class out to a Python file. Then you can build your image with the command below, where LOCAL_SOURCE_DIR is a local directory that contains the Python file where you saved your custom predictor.

 

from google.cloud.aiplatform.prediction import LocalModel
from src_dir.predictor import MyCustomPredictor
import os

local_model = LocalModel.build_cpr_model(
    LOCAL_SOURCE_DIR,
    f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
    predictor=MyCustomPredictor,
    requirements_path=os.path.join(LOCAL_SOURCE_DIR, "requirements.txt"),
)

 

Once the image is built, you can test it out by deploying it to a local endpoint and then calling the predict method with your request data. You’ll set artifact_uri to the path in Cloud Storage where you’ve saved your model and any artifacts needed for preprocessing or postprocessing. You can also use a local path for testing.

 

with local_model.deploy_to_local_endpoint(
    artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
    credential_path=CREDENTIALS_FILE,
) as local_endpoint:
    predict_response = local_endpoint.predict(
        request_file=INPUT_FILE,
        headers={"Content-Type": "application/json"},
    )
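
The predict call above returns the raw HTTP response from the local container, so a quick sanity check might look like the following (assuming predict_response behaves like a standard requests Response object):

# Inspect the local prediction before pushing the image to the cloud.
print(predict_response.status_code)  # expect 200 from a healthy container
print(predict_response.content)      # the JSON body produced by postprocess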

 

Deploy to Vertex AI

After testing the model locally to confirm that the predictions work as expected, the next steps are to push the image to Artifact Registry, import the model to the Vertex AI Model Registry, and optionally deploy it to an endpoint if you want online predictions.

 

from google.cloud import aiplatform

# push the image to Artifact Registry
local_model.push_image()

# import the model into the Vertex AI Model Registry
model = aiplatform.Model.upload(
    local_model=local_model,
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
)

# deploy to an endpoint for online predictions
endpoint = model.deploy(machine_type="n1-standard-4")

 

Once the model has been uploaded to Vertex AI and deployed, you’ll be able to see it in the Model Registry, and you can then make prediction requests just like you would with any other model deployed on Vertex AI.

 

# get prediction
endpoint.predict(instances=PREDICTION_DATA)

 

What’s next

You now know the basics of how to use custom prediction routines to add powerful customization to your serving workflows without having to worry about model servers or building Docker containers. To get hands-on experience with an end-to-end example, check out this codelab. It’s time to start writing some custom prediction code of your own!

 

 

By: Nikita Namjoshi (Developer Advocate) and Sam Thrasher (Software Engineer)
Source: Google Cloud Blog



Related Topics
  • Artificial Intelligence
  • Coding
  • Google Cloud
  • Machine Learning
  • Python
  • Tutorials
  • Vertex AI