aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Tech

Best Practices For Managing Vertex Pipelines Code

  • aster.cloud
  • December 10, 2022
  • 5 minute read
Organizations are increasingly using machine learning pipelines to streamline and scale their ML workflows. However, managing these pipelines can be challenging when an organization has multiple ML projects and pipelines at different stages of development. To solve this, we need a way to build upon DevOps concepts and apply them to this ML-specific problem. In this post, we’ll share some best practices on how to manage the codebase for your ML pipelines.
The guidance we’re sharing is based on our work with top Google Cloud customers and partners. We’ll provide a few best practices based on pipeline implementation patterns we’ve seen, but we recognize that every company’s solution will depend on many distinct factors. As a result, we don’t aim to provide a prescriptive approach. With that, let’s dive in and see how you can manage the development lifecycle of your ML pipelines.

Managing pipelines code

For any software system, developers need to be able to experiment and iterate on their code, while maintaining the stability of the production system. Using DevOps best practices, systems should be rigorously tested before being deployed, and deployments automated as much as possible. ML pipelines are no exception.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

The typical process of executing an ML pipeline in Vertex AI looks like the following:

  1. Write your pipeline code in Python, using either the Kubeflow Pipelines or TFX DSL (domain-specific language)
  2. Compile your pipeline definition to JSON format using the KFP or TFX library
  3. Submit your compiled pipeline definition to the Vertex AI API to be executed immediately

How can we effectively package up these steps into a reliable production system, while giving ML practitioners the capabilities they need to experiment and iterate on their pipeline development?

Step 1: Writing pipeline code

As with any software system, you will want to use a version control system (such as git) to manage your source code. There are a couple of other aspects you may like to consider:

Read More  Edit Video On Linux With This Python App

Code reuse

Kubeflow Pipelines are inherently modular, and you can use this to your advantage by reusing these components to accelerate the development of your ML pipelines. Be sure to check out all the existing components in the Google Cloud library and KFP library.

If you create custom KFP components, be sure to share them with your organization, perhaps by moving them to another repository where they can be versioned and referenced easily. Or, even better, contribute them to the open source community! Both the Google Cloud libraries and the Kubeflow Pipelines project welcome contributions of new or improved pipeline components.

Testing

As for any production system, you should set up automated testing to give you confidence in your system, particularly when you come to make changes later. Run unit tests for your custom components using a CI pipeline whenever you open a Pull Request (PR). Running an end-to-end test of your ML pipeline can be very time-consuming, so we don’t recommend that you set these tests to run every time you open a PR (or push a subsequent commit to an open PR). Instead, require manual approval to run these to run on an open PR, or alternatively run them only when you merge your code to be deployed into a dedicated test environment.

Step 2: Compiling your pipeline

As you would with other software systems, use a CI/CD pipeline (in Google Cloud Build, for example) to compile your ML pipelines, using the KFP or TFX library as appropriate. Once you have compiled your ML pipelines, you should publish those compiled pipelines into your environment (test/production). Since the Vertex AI SDK allows you to reference compiled pipelines that are stored in Google Cloud Storage (GCS), this is a great place to publish your compiled pipelines at the end of your CD pipeline. Alternatively, publish your compiled pipeline to Artifact Registry as a Vertex AI Pipeline template.

It’s also worth compiling your ML pipelines as part of your Pull Request checks (CI) – they are quick to compile so it’s an easy way to check for any syntax errors in your pipelines.

Read More  Google Cloud Next 2019 | Democratizing AI For Industrial Applications

Step 3: Submit your compiled pipeline to the Vertex AI API

To submit your ML pipeline to be executed by Vertex, you will need to use the Google Cloud Vertex AI SDK (Python). As we want to execute ML pipelines that have been compiled as part of our CI/CD, you will need to separate your Python ML pipeline and compilation code from your “triggering” code that uses the Vertex AI SDK.

You may like your “triggering” code to be run on a fixed schedule (e.g. if you want to retrain your ML model every week), or perhaps instead in response to certain events (e.g. on the arrival of new data into BigQuery). Both Cloud Build and Cloud Functions will let you do this, with benefits to both approaches. You may like to use Cloud Build if you are already using Cloud Build for your CI/CD pipelines, however you will need to build a container yourself containing your “triggering” code. Using a Cloud Function, you can just deploy the code itself and GCP will take care of packaging it into a Cloud Function.

Both can be triggered using a fixed schedule (Cloud Scheduler + Pub/Sub), or triggered from a Pub/Sub event. Cloud Build can provide additional flexibility for event-based triggers, as you can interpret the Pub/Sub events using variable substitution in your Cloud Build triggers, rather than needing to interpret the Pub/Sub events in your Python code. In this way you can set up different Cloud Build triggers to kick off your ML pipeline in response to different events using the same Python code.

If you just want to schedule your Vertex AI pipelines, you can also use Datatonic’s open-source Terraform module to create Cloud Scheduler jobs that don’t require the use of a Cloud Function or other “triggering” code.

Read More  Google Cloud Next 2019 | Personalized Customer Loyalty At Scale With DSW

An example architecture for triggering Vertex AI Pipelines on a schedule using Cloud Scheduler, Pub/Sub, and a Cloud Function.

Introducing Vertex AI Quickstart Templates

In partnership with Google’s Vertex AI product team, Datatonic has developed an open-source template for taking your AI use cases to production with Vertex AI Pipelines. It incorporates:

  • Example ML pipelines for training and batch scoring using XGBoost and Tensorflow frameworks (more frameworks to follow!)
  • CI/CD pipelines (using Google Cloud Build) for running unit tests of KFP components, end-to-end pipeline tests, compiling and publishing ML pipelines into your environment
  • Pipeline triggering code that can be easily deployed as a Google Cloud Function
  • Example code for an Infrastructure-as-Code deployment using Terraform
  • Make scripts to help accelerate the development cycle

The template can act as the starting point for your codebase to take a new ML use case from POC to production. Learn how Vodafone is using the templates to help slash the time from POC to production from 5 months to 4 weeks for their hundreds of Data Scientists across 13+ countries. To get started, check out the repository on GitHub and follow the instructions in the README.

If you’re new to Vertex AI and would like to learn more about it, check out the following resources to get started:

  • Vertex AI overview
  • Vertex AI Pipelines documentation
  • Intro to Vertex AI Pipelines codelab
  • Vertex AI Pipelines & ML Metadata codelab

Finally, we’d love your feedback. For feedback on Vertex AI, check out the Vertex AI support page. For feedback on the pipeline templates, please file an issue in the GitHub repository. And if you’ve got comments on this blog post, feel free to reach us.

 


Special thanks to Sara Robinson for sharing this great opportunity.


 

 

By: Ivan Nardini (Customer Engineer) and Jonny Browning (Principal MLOps Engineer at Datatonic)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Best Practice
  • Datatonic
  • Google Cloud
  • Machine Learning
  • Tutorials
  • Vertex AI
You May Also Like
Getting things done makes her feel amazing
View Post
  • Computing
  • Data
  • Featured
  • Learning
  • Tech
  • Technology

Nurturing Minds in the Digital Revolution

  • April 25, 2025
View Post
  • Tech

Deep dive into AI with Google Cloud’s global generative AI roadshow

  • February 18, 2025
Volvo Group: Confidently ahead at CES
View Post
  • Tech

Volvo Group: Confidently ahead at CES

  • January 8, 2025
zedreviews-ces-2025-social-meta
View Post
  • Featured
  • Gears
  • Tech
  • Technology

What Not to Miss at CES 2025

  • January 6, 2025
View Post
  • Tech

IBM and Pasqal Plan to Expand Quantum-Centric Supercomputing Initiative

  • November 21, 2024
Black Friday Gifts
View Post
  • Tech

Black Friday. How to Choose the Best Gifts for Yourself and Others, Plus Our Top Recommendations.

  • November 16, 2024
zedreviews-Apple-iPhone-16-Pro-finish-lineup-240909
View Post
  • Featured
  • Gears
  • Tech
  • Technology
  • Tools

Apple debuts iPhone 16 Pro and iPhone 16 Pro Max

  • September 10, 2024
zedreviews-Apple-iPhone-16-Apple-Intelligence-240909
View Post
  • Featured
  • Gears
  • Tech
  • Technology

Apple introduces iPhone 16 and iPhone 16 Plus

  • September 10, 2024

Stay Connected!
LATEST
  • 1
    Just make it scale: An Aurora DSQL story
    • May 29, 2025
  • 2
    Reliance on US tech providers is making IT leaders skittish
    • May 28, 2025
  • Examine the 4 types of edge computing, with examples
    • May 28, 2025
  • AI and private cloud: 2 lessons from Dell Tech World 2025
    • May 28, 2025
  • 5
    TD Synnex named as UK distributor for Cohesity
    • May 28, 2025
  • Weigh these 6 enterprise advantages of storage as a service
    • May 28, 2025
  • 7
    Broadcom’s ‘harsh’ VMware contracts are costing customers up to 1,500% more
    • May 28, 2025
  • 8
    Pulsant targets partner diversity with new IaaS solution
    • May 23, 2025
  • 9
    Growing AI workloads are causing hybrid cloud headaches
    • May 23, 2025
  • Gemma 3n 10
    Announcing Gemma 3n preview: powerful, efficient, mobile-first AI
    • May 22, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • Understand how Windows Server 2025 PAYG licensing works
    • May 20, 2025
  • By the numbers: How upskilling fills the IT skills gap
    • May 21, 2025
  • 3
    Cloud adoption isn’t all it’s cut out to be as enterprises report growing dissatisfaction
    • May 15, 2025
  • 4
    Hybrid cloud is complicated – Red Hat’s new AI assistant wants to solve that
    • May 20, 2025
  • 5
    Google is getting serious on cloud sovereignty
    • May 22, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.