
Improving Model Quality At Scale With Vertex AI Model Evaluation

  • aster.cloud
  • December 31, 2022
  • 6 minute read
Typically, data scientists retrain models at regular intervals to keep them fresh and relevant. This practice can be costly if the model is retrained too often, or ineffective if it isn't retrained frequently enough to serve the business. Ideally, data scientists would rather evaluate models continuously and retrain them deliberately when performance starts to degrade. At scale, continuous model evaluation requires a standardized and efficient evaluation process and system.

In practice, after training a model, data scientists and ML engineers use an offline dataset of historical examples from the production environment to evaluate model performance across several model experiments. If the evaluation metrics meet predefined thresholds, they can proceed to deploy the model, either manually or by using an ML pipeline. This process serves to find the best model (approach and configuration) to promote to production. Figure 1 illustrates a basic workflow in which data scientists and ML engineers gather new data and retrain the model at regular intervals.

Figure 1: Standard Model Evaluation workflow

Depending on the use case, continuous model evaluation might be required. Once deployed, the model is monitored in production by detecting skew and drift in both the feature and target distributions. When a change in those distributions is detected, the model could start underperforming, so you need to evaluate it using production data. Figure 2 illustrates an advanced workflow in which teams gather new data and continuously evaluate the model; based on the outcome of continuous evaluation, the model is retrained with new data.

Figure 2: Continuous Model Evaluation workflow
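To make the decision step in Figure 2 concrete, here is a minimal, illustrative sketch of a threshold-based retrain rule. The function, helper names, metric choice, and threshold below are hypothetical and not part of Vertex AI; they only show the kind of logic a continuous-evaluation system applies when deciding whether to trigger retraining.

# Illustrative sketch only: a hypothetical retrain-decision rule for continuous evaluation.
from sklearn.metrics import roc_auc_score

def should_retrain(y_true, y_scores, baseline_auc, tolerance=0.02):
    """Return True when AUC on fresh production data drops more than
    `tolerance` below the AUC recorded when the model was deployed."""
    current_auc = roc_auc_score(y_true, y_scores)
    return current_auc < baseline_auc - tolerance

# Example usage with hypothetical helpers:
# y_true, y_scores = load_latest_production_labels_and_predictions()
# if should_retrain(y_true, y_scores, baseline_auc=0.91):
#     trigger_retraining_pipeline()  # e.g., submit a Vertex AI training pipeline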


At scale, building this continuous evaluation system can be a challenging task. Several factors make it difficult, including getting access to production data, provisioning computational resources, standardizing the model evaluation process, and guaranteeing its reproducibility. To simplify and accelerate the entire process of defining and running ML evaluations, Vertex AI Model Evaluation enables you to iteratively assess and compare model performance at scale.

With Vertex AI Model Evaluation, you define a test dataset, a model, and an evaluation configuration as inputs, and it returns model performance metrics whether you are training your model in a notebook, running a training job, or running an ML pipeline on Vertex AI. Vertex AI Model Evaluation is integrated with the following products:

  • Vertex AI Model Registry, which provides a new view for accessing different evaluation jobs and the metrics they produce after the model training job completes.
  • Model Builder SDK, which introduces a new evaluate method to get classification, regression, and forecasting metrics for a model trained locally.
  • Managed Pipelines, with a new evaluation component to generate and visualize metric results within the Vertex AI Pipelines console.

Now that you know the new features of Vertex AI Model Evaluation, let’s see how you can leverage them to improve your model quality at scale.

Evaluate the performance of different models in Vertex AI Model Registry

As the decision maker who has to promote the model to production, you need to govern the model launching process.

To release the model, you need to easily retrieve, visualize, and compare the offline and online performance and explainability metrics of the trained models.

Thanks to the integration between Vertex AI Model Registry and Vertex AI Model Evaluation, you can now view all historical evaluations of each model (BQML, AutoML, and custom models). For each model version, the Vertex AI Model Registry console shows classification, regression, and forecasting metrics depending on the type of model.

 

Figure 3: Compare model evaluations across model versions

You can also compare those metrics across different model versions to identify and explain the best model version to deploy as an online experiment or directly to production. 

Figure 4: Model evaluation comparison view
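Alongside the console views shown above, the same evaluation results can be retrieved programmatically with the Vertex AI SDK. The following is a minimal sketch, assuming a recent google-cloud-aiplatform version and a model already registered in Vertex AI Model Registry; the project, region, and model ID are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="your-project", location="us-central1")

# Load a registered model by its resource name (placeholder ID below)
model = aiplatform.Model("projects/your-project/locations/us-central1/models/your-model-id")

# List the evaluations stored for this model and print their metrics
for evaluation in model.list_model_evaluations():
    print(evaluation.resource_name)
    print(evaluation.metrics)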

Train and evaluate your model in your notebook using Model Builder SDK

During the model development phase, as a data scientist, you experiment with different models and parameters in your notebook. You then calculate measurements such as accuracy, precision, and recall, and build performance plots such as a confusion matrix and ROC curve on a validation/test dataset. Those indicators allow you and your team to review the candidate model's performance and compare it with other models to ultimately decide whether the model is ready to be formalized in a component of the production pipeline.


The new Vertex AI Model Builder SDK allows you to calculate those metrics and plots by leveraging Vertex AI. By providing the test dataset, the model, and the evaluation configuration, you can submit an evaluation job. After the evaluation task completes, you can retrieve the results locally, visualize them across different models, and compare them side by side to decide whether to deploy a model as an online experiment or directly into production.

Below is an example of how to run an evaluation job for a classification model.

from google.cloud import aiplatform

# Upload the model as a Vertex AI Model resource
vertex_model = aiplatform.Model.upload(
    display_name="your-model",
    artifact_uri="gs://your-bucket/your-model",
    serving_container_image_uri="your-serving-image",
)

# Run an evaluation job on a test dataset stored in Cloud Storage
eval_job = vertex_model.evaluate(
    gcs_source_uris=["gs://your-bucket/your-evaluate-data.csv"],
    prediction_type="classification",
    class_labels=["class_1", "class_2"],
    target_column_name="target_column",
    prediction_label_column="prediction",
    prediction_score_column="prediction",
    experiment="your-experiment-name",
)

Notice that all parameters and metrics of the evaluation job are tracked as part of the experiment to guarantee its reproducibility. Model Evaluation in the Vertex AI SDK is in Experimental release. To get access, please fill out this form.
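Because those parameters and metrics are logged to a Vertex AI experiment, you can also pull them back into a notebook for side-by-side comparison. A minimal sketch, assuming the experiment name used above and an initialized SDK:

from google.cloud import aiplatform

aiplatform.init(project="your-project", location="us-central1")

# Returns a pandas DataFrame with one row per experiment run, including
# the parameters and metrics logged by evaluation jobs in this experiment.
experiment_df = aiplatform.get_experiment_df("your-experiment-name")
print(experiment_df.head())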

Operationalize model evaluation with Vertex AI Pipelines

Once the model has been validated, ML engineers can proceed to deploy it, either manually or from a pipeline. When a production pipeline is required, it has to include a formalized evaluation pipeline component that produces the model quality metrics. In this way, the model evaluation process can be replicated at scale, and the evaluation metrics can be logged to downstream systems such as experiment tracking and Model Registry services. In the end, decision makers can use those metrics to validate models and determine which model will be deployed.

Currently, building and maintaining an evaluation component requires time and resources that you would rather spend solving new challenges and building new ML products. To simplify and accelerate the process of defining and running evaluations within ML pipelines, we are excited to announce the release of new operators that make it easier to operationalize model evaluation in a Vertex AI pipeline. These components automatically generate and track evaluation results to facilitate easy retrieval and model comparison. The main evaluation operators are:

  1. GetVertexModelOp to initialize the Vertex Model artifact to evaluate.
  2. EvaluationDataSamplerOp to randomly sample the input dataset down to a specified size for computing Vertex XAI feature attributions.
  3. TargetFieldDataRemoverOp to remove the target field from the input dataset, supporting custom models for Vertex AI Batch Prediction.
  4. ModelBatchPredictOp to run a Google Cloud Vertex AI BatchPredictionJob and generate predictions for model evaluation.
  5. ModelEvaluationClassificationOp to compute evaluation metrics on a trained model's batch prediction results.
  6. ModelEvaluationFeatureAttributionOp to generate feature attributions on a trained model's batch explanation results.
  7. ModelImportEvaluationOp to store a model evaluation artifact as a resource of an existing Vertex model with ModelService.

With these components, you can define a training pipeline that starts from a model resource and generates evaluation metrics and feature attributions from a given dataset. Below is an example of a Vertex AI pipeline that uses those components with a Vertex AI AutoML model in a classification scenario.

 

Figure 5: Vertex AI Model Evaluation pipeline
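As a rough sketch of how such a pipeline could be assembled and launched, the snippet below wires a few of the evaluation operators together and submits the compiled pipeline to Vertex AI Pipelines. The component wiring is abbreviated, and some argument names (for example, the GetVertexModelOp input and output keys) are assumptions rather than exact interfaces; refer to the google_cloud_pipeline_components documentation and the sample notebooks for the precise signatures.

from kfp.v2 import dsl, compiler
from google.cloud import aiplatform
from google_cloud_pipeline_components.experimental import evaluation
from google_cloud_pipeline_components.v1.batch_predict_job import ModelBatchPredictOp

@dsl.pipeline(name="model-evaluation-pipeline")
def evaluation_pipeline(project: str, location: str, model_name: str, test_data_uri: str):
    # Fetch the registered Vertex AI model to evaluate
    # (the argument and output names here are assumptions).
    get_model_task = evaluation.GetVertexModelOp(model_resource_name=model_name)

    # Generate batch predictions on the test dataset
    batch_predict_task = ModelBatchPredictOp(
        project=project,
        location=location,
        model=get_model_task.outputs["model"],
        job_display_name="eval-batch-predict",
        gcs_source_uris=[test_data_uri],
        instances_format="csv",
        predictions_format="jsonl",
        gcs_destination_output_uri_prefix="gs://your-bucket/eval-output",
    )

    # Downstream steps (metrics computation with ModelEvaluationClassificationOp,
    # feature attributions, and ModelImportEvaluationOp) are omitted here.
    ...

# Compile the pipeline and submit it to Vertex AI Pipelines
compiler.Compiler().compile(
    pipeline_func=evaluation_pipeline, package_path="evaluation_pipeline.json"
)
job = aiplatform.PipelineJob(
    display_name="model-evaluation-pipeline",
    template_path="evaluation_pipeline.json",
    parameter_values={
        "project": "your-project",
        "location": "us-central1",
        "model_name": "projects/your-project/locations/us-central1/models/your-model-id",
        "test_data_uri": "gs://your-bucket/your-evaluate-data.csv",
    },
)
job.run()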

Conclusion

Vertex AI Model Evaluation enables customers to accelerate and operationalize the model performance analysis and validation steps required in an end-to-end MLOps workflow. Thanks to its native integration with other Vertex AI services, Vertex AI Model Evaluation lets you run model evaluation jobs (measuring model performance on a test dataset) regardless of which Vertex AI service was used to train the model (AutoML, Managed Pipelines, custom training, etc.), and store and visualize the evaluation results across multiple models in Vertex AI Model Registry. With these capabilities, Vertex AI Model Evaluation helps users decide which model(s) can progress to online testing or be put into production and, once in production, when models need to be retrained.

Now it’s your turn. Check out the notebooks in the official GitHub repo and the resources below to get started with Vertex AI Model Evaluation. And remember… always have fun!

Want to learn more?

Documentation:

  • https://cloud.google.com/vertex-ai/docs/evaluation/introduction

Samples:

  • https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main/notebooks/official/model_evaluation

Thanks to Jing Qi, Kevin Naughton, Marton Balint, Sara Robinson, Soheila Zangeneh, Karen Lin, and the entire Vertex AI Model Evaluation team for their support and feedback.

 

By: Ivan Nardini (Customer Engineer) and Safwan Samla (Product Manager)
Source: Google Cloud Blog



Related Topics
  • Artificial Intelligence
  • Google Cloud
  • Machine Learning
  • Python
  • Vertex AI