aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Engineering
  • Work & Jobs

Introducing Vertical Autoscaling In Streaming Dataflow Prime Jobs

  • aster.cloud
  • September 19, 2022
  • 5 minute read

Dataflow has provided a number of capabilities to improve utilization and efficiency by automatically provisioning and scaling resources for your job. The following are some of the examples:

  • Horizontal Autoscaling that automatically scales the number of workers.
  • Streaming Engine, which decouples storage from the workers. This also gives workers access to unbounded storage and more responsive Horizontal Autoscaling.
  • Dynamic work rebalancing that splits work across available workers based on work progress.

Building on this solid and differentiated foundation, we recently launched Dataflow Prime, a new next generation serverless, no-ops, autotuning platform for your data processing needs on Google Cloud. Dataflow Prime introduces a new industry-first resource optimization technology, Vertical Autoscaling, which automatically scales worker memory in order to remove the need to do manual tuning of worker configuration. With Vertical Autoscaling, Dataflow Prime automatically determines the right worker configuration for your job.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

Current user challenges

With Dataflow, you write your data processing logic using the Apache Beam SDK or Dataflow Templates and let Dataflow handle the optimization, execution and scalability of your pipeline. While in many cases your pipeline executes well, for some cases you have to manually select the right resources like memory for best performance and cost. For many users, this is a time consuming trial and error process and a single worker configuration is unlikely to be optimal for the pipeline. In addition, there was the risk of static configurations becoming outdated when data processing requirements changed.

We have designed Vertical Autoscaling to solve these challenges and allow you to focus on your application and business logic.

How does Vertical Autoscaling work?

Vertical Autoscaling observes out of memory (OOM) events and memory usage of your streaming pipeline over time and triggers memory scaling based on this. This makes your pipeline resilient to out of memory errors without any manual intervention.

Read More  How To Deploy The Google Cloud Ops Agent With Ansible

With Vertical Autoscaling, if there is high memory utilization, all workers in your job are replaced with workers with larger memory capacity. In the following illustration we see that workers 1, 2, and 3 have high memory utilization and a capacity of 4GB. After Vertical Autoscaling, all workers have a memory capacity of 5 GB, which gives them sufficient memory headroom.

 

This process is iterative, and it can take up to a few minutes to replace the workers.

Similarly, if there is low memory usage, Vertical Autoscaling downscales the workers to lower memory capacity, thus improving utilization and saving cost. It relies on historical usage data per pipeline to know when it is safe to scale down, prioritizing pipeline stability. You may observe a long period of time (12 hours or more) where no downscaling occurs after a spike in memory utilization. Vertical Autoscaling takes a conservative approach to downscaling in order to keep pipelines processing with minimal disruption.

Things to know about Vertical Autoscaling

How does Vertical Autoscaling impact my job?

  • As the workers are replaced, you may observe a temporary drop in throughput, but impact to a running pipeline (i.e. backlog, watermark, throughput metrics) will not be significantly different from a Horizontal Autoscaling event.
  • Horizontal Autoscaling is disabled during and up to 10 minutes after Vertical Autoscaling.
  • As with horizontal scaling, some backlog may accumulate during the scaling process, if this backlog cannot be cleared in a timely fashion, horizontal scaling may occur to clear that backlog.

 

Does Vertical Autoscaling remove all OOMs?

 

  • It is important to note that Vertical Autoscaling is designed to react to OOMs and high memory usage, but cannot necessarily prevent OOMs, especially if there is a fast spike in memory usage on a worker which results in an OOM.
  • When OOMs occur, Vertical Autoscaling automatically detects them and resize worker memory to address issues. As a consequence, you will see a few OOM errors in the worker logs but these can be ignored if those are followed by upscale events.
  • It is also important to note that some OOMs may be happen as a result  of downscale events where Dataflow reduced the amount of memory because of underutilization. In such cases, Dataflow will automatically upsize if it detects OOMs. Again, it is safe to ignore these OOM messages if they are followed by upscale events.
  • If the OOM messages are not followed by an upscale event, you may have hit the memory scaling limit. In this case you may need to optimize your pipeline’s memory usage or use resource hints.
  • If you see OOM messages continuously and have not observed a job message indicating you have hit the memory scaling limit, please contact the support team. Note that if OOMs occur very rarely (e.g. once every few hours per pipeline), Vertical Autoscaling may choose to not scale up the workers to avoid introducing additional disruption.
Read More  Georgia Public Library Service Improves Staff And Patron Experience With ChromeOS

How to enable Vertical Autoscaling ?

Vertical Autoscaling is only available for Dataflow Prime jobs. See the instructions to launch Dataflow Prime jobs and how to enable Vertical Autoscaling.

You don’t have to make any code changes to run your existing Apache Beam pipeline on Dataflow Prime. Additionally, you don’t have to specify the worker type when launching a Dataflow Prime job. However if you want to control the initial worker’s resource configuration you can use resource hints.

You can confirm if Vertical Autoscaling is running on your pipeline by looking for the following job log:

  • Vertical Autoscaling is enabled. This pipeline is receiving recommendations for resources allocated per worker. 

How to monitor Vertical Autoscaling ?

Whenever Vertical Autoscaling updates workers with more or less memory, the following job logs are generated in Cloud Logging:

  • Vertical Autoscaling is enabled. This pipeline is receiving recommendations for resources allocated per worker.
  • Vertical Autoscaling update triggered to change per worker memory limit for pool from X GiB to Y GiB. 

You can read more about these logs in this section.

Additionally you can visually monitor Vertical Autoscaling by looking at worker capacity under the ‘Max worker memory utilization’ chart in Dataflow metrics UI.

Following is a chart for a Dataflow worker that Vertically Autoscaled. We see three Vertical Autoscaling events for this job. Whenever the memory used was close to memory capacity, Vertical Autoscaling triggered and scaled up the worker memory capacity.

 

Summary

  1. Try Vertical Autoscaling for your streaming jobs on Dataflow Prime for improved resource optimization and cost savings.
  2. There is no code change required to run your existing Apache Beam pipeline on Dataflow Prime.
  3. There is no additional cost associated with using Vertical Autoscaling. Dataflow Prime jobs continue to be billed based on the resources it consumes.
Read More  Accelerating ML With Vertex AI: From Retail And Finance To Manufacturing And Automotive

 

 

By: Zeeshan Khan (Product Manager) and Zach Zimmerman (Software Engineer)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Apache Beam
  • Data Analytics
  • Dataflow
  • Google Cloud
You May Also Like
View Post
  • Engineering

Just make it scale: An Aurora DSQL story

  • May 29, 2025
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025
View Post
  • Data
  • Engineering

Hiding in Plain Site: Attackers Sneaking Malware into Images on Websites

  • January 16, 2025
View Post
  • Computing
  • Design
  • Engineering
  • Technology

Here’s why it’s important to build long-term cryptographic resilience

  • December 24, 2024
IBM and Ferrari Premium Partner
View Post
  • Data
  • Engineering

IBM Selected as Official Fan Engagement and Data Analytics Partner for Scuderia Ferrari HP

  • November 7, 2024

Stay Connected!
LATEST
  • 1
    Just make it scale: An Aurora DSQL story
    • May 29, 2025
  • 2
    Reliance on US tech providers is making IT leaders skittish
    • May 28, 2025
  • Examine the 4 types of edge computing, with examples
    • May 28, 2025
  • AI and private cloud: 2 lessons from Dell Tech World 2025
    • May 28, 2025
  • 5
    TD Synnex named as UK distributor for Cohesity
    • May 28, 2025
  • Weigh these 6 enterprise advantages of storage as a service
    • May 28, 2025
  • 7
    Broadcom’s ‘harsh’ VMware contracts are costing customers up to 1,500% more
    • May 28, 2025
  • 8
    Pulsant targets partner diversity with new IaaS solution
    • May 23, 2025
  • 9
    Growing AI workloads are causing hybrid cloud headaches
    • May 23, 2025
  • Gemma 3n 10
    Announcing Gemma 3n preview: powerful, efficient, mobile-first AI
    • May 22, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • Understand how Windows Server 2025 PAYG licensing works
    • May 20, 2025
  • By the numbers: How upskilling fills the IT skills gap
    • May 21, 2025
  • 3
    Cloud adoption isn’t all it’s cut out to be as enterprises report growing dissatisfaction
    • May 15, 2025
  • 4
    Hybrid cloud is complicated – Red Hat’s new AI assistant wants to solve that
    • May 20, 2025
  • 5
    Google is getting serious on cloud sovereignty
    • May 22, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.