aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Tech

Best Practices For Managing Vertex Pipelines Code

  • aster_cloud
  • December 10, 2022
  • 5 minute read
Organizations are increasingly using machine learning pipelines to streamline and scale their ML workflows. However, managing these pipelines can be challenging when an organization has multiple ML projects and pipelines at different stages of development. To solve this, we need a way to build upon DevOps concepts and apply them to this ML-specific problem. In this post, we’ll share some best practices on how to manage the codebase for your ML pipelines.
The guidance we’re sharing is based on our work with top Google Cloud customers and partners. We’ll provide a few best practices based on pipeline implementation patterns we’ve seen, but we recognize that every company’s solution will depend on many distinct factors. As a result, we don’t aim to provide a prescriptive approach. With that, let’s dive in and see how you can manage the development lifecycle of your ML pipelines.

Managing pipelines code

For any software system, developers need to be able to experiment and iterate on their code, while maintaining the stability of the production system. Using DevOps best practices, systems should be rigorously tested before being deployed, and deployments automated as much as possible. ML pipelines are no exception.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

The typical process of executing an ML pipeline in Vertex AI looks like the following:

  1. Write your pipeline code in Python, using either the Kubeflow Pipelines or TFX DSL (domain-specific language)
  2. Compile your pipeline definition to JSON format using the KFP or TFX library
  3. Submit your compiled pipeline definition to the Vertex AI API to be executed immediately

How can we effectively package up these steps into a reliable production system, while giving ML practitioners the capabilities they need to experiment and iterate on their pipeline development?

Step 1: Writing pipeline code

As with any software system, you will want to use a version control system (such as git) to manage your source code. There are a couple of other aspects you may like to consider:

Read More  Red Hat Marketplace Aims To Accelerate Open Hybrid Cloud Innovation With Certified Software Solutions Ready To Run On Any Cloud

Code reuse

Kubeflow Pipelines are inherently modular, and you can use this to your advantage by reusing these components to accelerate the development of your ML pipelines. Be sure to check out all the existing components in the Google Cloud library and KFP library.

If you create custom KFP components, be sure to share them with your organization, perhaps by moving them to another repository where they can be versioned and referenced easily. Or, even better, contribute them to the open source community! Both the Google Cloud libraries and the Kubeflow Pipelines project welcome contributions of new or improved pipeline components.

Testing

As for any production system, you should set up automated testing to give you confidence in your system, particularly when you come to make changes later. Run unit tests for your custom components using a CI pipeline whenever you open a Pull Request (PR). Running an end-to-end test of your ML pipeline can be very time-consuming, so we don’t recommend that you set these tests to run every time you open a PR (or push a subsequent commit to an open PR). Instead, require manual approval to run these to run on an open PR, or alternatively run them only when you merge your code to be deployed into a dedicated test environment.

Step 2: Compiling your pipeline

As you would with other software systems, use a CI/CD pipeline (in Google Cloud Build, for example) to compile your ML pipelines, using the KFP or TFX library as appropriate. Once you have compiled your ML pipelines, you should publish those compiled pipelines into your environment (test/production). Since the Vertex AI SDK allows you to reference compiled pipelines that are stored in Google Cloud Storage (GCS), this is a great place to publish your compiled pipelines at the end of your CD pipeline. Alternatively, publish your compiled pipeline to Artifact Registry as a Vertex AI Pipeline template.

It’s also worth compiling your ML pipelines as part of your Pull Request checks (CI) – they are quick to compile so it’s an easy way to check for any syntax errors in your pipelines.

Read More  Google Cloud Managed Service For Prometheus Is Now Generally Available

Step 3: Submit your compiled pipeline to the Vertex AI API

To submit your ML pipeline to be executed by Vertex, you will need to use the Google Cloud Vertex AI SDK (Python). As we want to execute ML pipelines that have been compiled as part of our CI/CD, you will need to separate your Python ML pipeline and compilation code from your “triggering” code that uses the Vertex AI SDK.

You may like your “triggering” code to be run on a fixed schedule (e.g. if you want to retrain your ML model every week), or perhaps instead in response to certain events (e.g. on the arrival of new data into BigQuery). Both Cloud Build and Cloud Functions will let you do this, with benefits to both approaches. You may like to use Cloud Build if you are already using Cloud Build for your CI/CD pipelines, however you will need to build a container yourself containing your “triggering” code. Using a Cloud Function, you can just deploy the code itself and GCP will take care of packaging it into a Cloud Function.

Both can be triggered using a fixed schedule (Cloud Scheduler + Pub/Sub), or triggered from a Pub/Sub event. Cloud Build can provide additional flexibility for event-based triggers, as you can interpret the Pub/Sub events using variable substitution in your Cloud Build triggers, rather than needing to interpret the Pub/Sub events in your Python code. In this way you can set up different Cloud Build triggers to kick off your ML pipeline in response to different events using the same Python code.

If you just want to schedule your Vertex AI pipelines, you can also use Datatonic’s open-source Terraform module to create Cloud Scheduler jobs that don’t require the use of a Cloud Function or other “triggering” code.

Read More  John Lewis Partnership Accelerates Technology Transformation With £100M Agreement With Google Cloud

An example architecture for triggering Vertex AI Pipelines on a schedule using Cloud Scheduler, Pub/Sub, and a Cloud Function.

Introducing Vertex AI Quickstart Templates

In partnership with Google’s Vertex AI product team, Datatonic has developed an open-source template for taking your AI use cases to production with Vertex AI Pipelines. It incorporates:

  • Example ML pipelines for training and batch scoring using XGBoost and Tensorflow frameworks (more frameworks to follow!)
  • CI/CD pipelines (using Google Cloud Build) for running unit tests of KFP components, end-to-end pipeline tests, compiling and publishing ML pipelines into your environment
  • Pipeline triggering code that can be easily deployed as a Google Cloud Function
  • Example code for an Infrastructure-as-Code deployment using Terraform
  • Make scripts to help accelerate the development cycle

The template can act as the starting point for your codebase to take a new ML use case from POC to production. Learn how Vodafone is using the templates to help slash the time from POC to production from 5 months to 4 weeks for their hundreds of Data Scientists across 13+ countries. To get started, check out the repository on GitHub and follow the instructions in the README.

If you’re new to Vertex AI and would like to learn more about it, check out the following resources to get started:

  • Vertex AI overview
  • Vertex AI Pipelines documentation
  • Intro to Vertex AI Pipelines codelab
  • Vertex AI Pipelines & ML Metadata codelab

Finally, we’d love your feedback. For feedback on Vertex AI, check out the Vertex AI support page. For feedback on the pipeline templates, please file an issue in the GitHub repository. And if you’ve got comments on this blog post, feel free to reach us.

 


Special thanks to Sara Robinson for sharing this great opportunity.


 

 

By: Ivan Nardini (Customer Engineer) and Jonny Browning (Principal MLOps Engineer at Datatonic)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster_cloud

Related Topics
  • Best Practice
  • Datatonic
  • Google Cloud
  • Machine Learning
  • Tutorials
  • Vertex AI
You May Also Like
View Post
  • Gears
  • Tech

Apple introduces the advanced new Apple Watch Series 9

  • September 12, 2023
Travel
View Post
  • Gears
  • Tech

7 Essential Items To Bring During Your Travel

  • July 31, 2023
Parliament Hall by Frederick Koberl
View Post
  • Tech

Inside International Institutions And Their Hierarchy

  • July 17, 2023
barbenheimer
View Post
  • Tech

Barbenheimer: Barbies, Bombs And The Cinematic Showdown Of The Summer Blockbusters

  • July 13, 2023
x.ai - Understand the universe
View Post
  • Tech

Elon Musk’s New xAI Company Launches To ‘Understand The True Nature Of The Universe’

  • July 12, 2023
Amazon Prime Day 2023 Tech Deals
View Post
  • Gears
  • Tech

Amazon Prime Day 2023 –  Uncover the Best Deals on Computers and Electronics

  • July 12, 2023
Ant, Strength, Wood and Sun
View Post
  • Tech

The Paradox Of Prosperity. Can Wealth Really Be Inherited?

  • July 6, 2023
Chart | eToro
View Post
  • Tech

Discover the Power of Social Trading on eToro and Earn Rewards!

  • June 27, 2023

Stay Connected!
LATEST
  • 1
    Combining AI With A Trusted Data Approach On IBM Power To Fuel Business Outcomes
    • September 21, 2023
  • 2
    Start Your Ubuntu Confidential VM With Intel® TDX On Google Cloud
    • September 20, 2023
  • Microsoft and Adobe 3
    Microsoft And Adobe Partner To Deliver Cost Savings And Business Benefits
    • September 20, 2023
  • 4
    Huawei Connect 2023: Accelerating Intelligence For Shared Success
    • September 20, 2023
  • 5
    Huawei Releases Data Center 2030, Leading Innovation and Development of New Data Centers
    • September 20, 2023
  • Penguin 6
    How To Find And Fix Broken Packages On Linux
    • September 19, 2023
  • Volkswagen 7
    Volkswagen Races Toward Next-Gen Automotive Manufacturing Leadership With Google Cloud And T-Systems
    • September 19, 2023
  • 8
    VMware Scales Multi-Cloud Security With Workforce Identity Federation
    • September 18, 2023
  • Intel Innovation 9
    Intel Innovation 2023
    • September 15, 2023
  • Private 10
    A Comeback For Private Clouds
    • September 14, 2023
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    Microsoft And Oracle Expand Partnership To Deliver Oracle Database Services On Oracle Cloud Infrastructure In Microsoft Azure
    • September 14, 2023
  • 2
    Real-Time Ubuntu Is Now Available In AWS Marketplace
    • September 12, 2023
  • 3
    IBM Brings Watsonx To ESPN Fantasy Football With New Waiver Grades And Trade Grades
    • September 13, 2023
  • 4
    NASA Shares Unidentified Anomalous Phenomena Independent Study Report
    • September 14, 2023
  • 5
    Introducing OpenAI Dublin
    • September 14, 2023
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.