Three Reasons Why You Need Volcano

  • aster.cloud
  • February 13, 2021
  • 7 minute read

Volcano is a Kubernetes native batch scheduling system. This open-source project is optimized for compute-intensive workloads, and is especially useful in sectors such as AI, big data, genomics, and rendering. Mainstream computing frameworks in these sectors can easily connect to Volcano to integrate high-performance job scheduling, heterogeneous chip management, and job management.


Why do you need Volcano?

  • Group Scheduling

The default Kubernetes scheduler schedules containers one by one. This can waste resources and create bottlenecks, and it can even deadlock containers in scenarios where a group of containers needs to be scheduled all at the same time, for example, in AI training jobs or big data applications.

Suppose an AI application consisting of 2 parameter server (ps) containers and 4 worker containers needs to be scheduled onto limited resources. When the default scheduler tries to schedule the last worker container and no resources are available, scheduling fails. The job hangs because the application cannot run without that last worker container, while the resources occupied by the already scheduled containers sit idle and produce nothing.

This is where Volcano comes in. It ensures that a gang of related containers can be scheduled at the same time. If for any reason it is not possible to deploy all of the containers in a gang, Volcano will not schedule that gang. In practice, it is not uncommon to deploy a set of internally dependent containers onto limited resources. Volcano is vital in these cases because gang scheduling eliminates potential deadlocks resulting from insufficient resources, and it significantly improves resource utilization in heavily loaded clusters.

Gang scheduling is based on container groups, or “jobs” as they are called in the code. With gang scheduling, the algorithm checks each job to see whether the entire job can be scheduled. The containers in each group are called “tasks”. If the number of tasks that can be scheduled reaches a specified threshold, the job is scheduled to nodes. This scheduling process is called “bind nodes” in the code.
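
To make the idea concrete, here is a minimal sketch in Python of a gang-scheduling decision. It is illustrative only, not Volcano's actual code; min_available stands in for the job's scheduling threshold, and resources are simplified to a single CPU count per node.

# Illustrative sketch, not Volcano's implementation: schedule a job only
# if at least `min_available` of its tasks can be placed on the nodes.
def can_bind_job(task_demands, free_cpus_per_node, min_available):
    slots = dict(free_cpus_per_node)      # copy so we can decrement freely
    schedulable = 0
    for demand in sorted(task_demands):   # try to place smaller tasks first
        for node, free in slots.items():
            if free >= demand:
                slots[node] = free - demand
                schedulable += 1
                break
    return schedulable >= min_available

# Six tasks that each need 1 CPU, but only 5 CPUs free in the cluster:
# the gang is not scheduled at all, so no resources are held and wasted.
print(can_bind_job([1, 1, 1, 1, 1, 1], {"node-a": 3, "node-b": 2}, 6))  # False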

  • Automatic Optimization of Resource Allocation

Containers are scheduled to the nodes that can supply the CPU, memory, GPU, and other resources necessary for the job. Typically, multiple nodes will be available. Each node will have different resources available for new workloads. Volcano analyzes the expected resource utilization for different scheduling plans, and selects the most appropriate node for the job.

  • Support for a Range of Advanced Scheduling Scenarios

Volcano offers a diverse set of scheduling algorithms, such as priority, dominant resource fairness (DRF), and binpack, which means you can more easily handle diverse service requirements. For example, you may want to ensure disaster recovery (DR) and disruption isolation when deploying your applications. With Volcano, you can easily deploy containers running the same application on different nodes, with each node hosting only one pod of that application. In another scenario, to ensure that certain applications do not compete for resources, you may want to avoid deploying them on the same node. Volcano can help you do that, too.
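
As a rough illustration of the “one pod of an application per node” scenario, the sketch below filters candidate nodes so that no node hosts a second pod of the same application. It is a simplified stand-in, not Volcano's API; the application names and node inventory are invented for the example.

# Illustrative sketch: keep only nodes that do not already run a pod
# of the given application, so its pods spread across distinct nodes.
def nodes_allowed_for(app, apps_by_node):
    return [node for node, apps in apps_by_node.items() if app not in apps]

apps_by_node = {
    "node-1": {"web", "cache"},
    "node-2": {"web"},
    "node-3": set(),
}
print(nodes_allowed_for("web", apps_by_node))  # ['node-3']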

How does Volcano manage all of this? Let’s take a closer look at some of the scheduling algorithms Volcano offers.

The DRF algorithm is used by both YARN and Mesos, but not by the default Kubernetes scheduler. DRF prioritizes jobs requiring fewer resources so that more jobs can run, and smaller jobs are not starved of resources by larger jobs. DRF treats each job, such as an AI training job or a big data analysis job, as a single unit for scheduling purposes.
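
The ordering DRF uses can be sketched in a few lines of Python. This is illustrative only, not Volcano's implementation: each job's dominant share is its largest fraction of any cluster resource, and the job with the smallest dominant share goes first. The cluster size and job demands are invented for the example.

# Illustrative DRF sketch: order jobs by their dominant resource share.
CLUSTER = {"cpu": 64.0, "memory_gb": 256.0}

def dominant_share(demand):
    return max(demand[r] / CLUSTER[r] for r in demand)

jobs = {
    "small-analytics": {"cpu": 2.0, "memory_gb": 8.0},     # share ~0.03
    "big-training":    {"cpu": 16.0, "memory_gb": 128.0},  # share 0.50 (memory)
}
for name in sorted(jobs, key=lambda j: dominant_share(jobs[j])):
    print(name, round(dominant_share(jobs[name]), 3))      # smallest share first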

The binpack algorithm tries to fill each occupied node as fully as possible. It avoids scheduling to empty nodes in favor of occupied nodes, and the more fully a node is occupied, the more likely it is to be selected. To do this, the algorithm calculates the resource utilization of each node. Concentrating workloads in this way pairs well with Kubernetes cluster auto-scaling, because nodes left empty can be scaled in. With binpack, each container is treated as an individual scheduling unit.
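
A binpack-style score can be sketched as follows. This is a simplified illustration, not Volcano's scoring code: a node scores higher the fuller it would be after placing the pod, so already busy nodes win over empty ones. The node capacities and pod request are invented for the example.

# Illustrative binpack sketch: score a node by how full it would be
# after placing the pod; return -1 if the pod does not fit at all.
def binpack_score(capacity, used, request):
    after = {r: used[r] + request[r] for r in capacity}
    if any(after[r] > capacity[r] for r in capacity):
        return -1
    return 100 * sum(after[r] / capacity[r] for r in capacity) / len(capacity)

pod = {"cpu": 2, "memory_gb": 4}
busy  = binpack_score({"cpu": 8, "memory_gb": 32}, {"cpu": 5, "memory_gb": 20}, pod)
empty = binpack_score({"cpu": 8, "memory_gb": 32}, {"cpu": 0, "memory_gb": 0}, pod)
print(busy, empty)  # 81.25 vs 18.75: the busy node is preferred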

The queue (proportion) algorithm is offered in YARN but not in Kubernetes; Volcano fills this gap. The algorithm controls the overall resource allocation of a cluster. For example, if two teams in a company share a pool of compute resources, queue can be used to specify that team A may use up to 60% of the total cluster resources and team B up to 40%. The algorithm prioritizes the queue with the lowest expected resource utilization.
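
The 60/40 example can be sketched as a simple fair-share check. This is illustrative only, not Volcano's queue plugin: each queue has a weight, and the queue that has used the smallest fraction of its own share is served next. The CPU figures are invented for the example.

# Illustrative proportion sketch: serve the queue that is furthest
# below its weighted share of the cluster.
CLUSTER_CPU = 100.0
QUEUE_WEIGHTS = {"team-a": 0.6, "team-b": 0.4}   # the 60% / 40% split above

def next_queue(allocated_cpu):
    def used_fraction_of_share(queue):
        return allocated_cpu[queue] / (QUEUE_WEIGHTS[queue] * CLUSTER_CPU)
    return min(QUEUE_WEIGHTS, key=used_fraction_of_share)

# team-a uses 50 of its 60 CPUs (83%), team-b uses 20 of its 40 (50%),
# so team-b's next job is scheduled first.
print(next_queue({"team-a": 50.0, "team-b": 20.0}))  # team-b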

Volcano has many scheduling algorithm plugins, and their decisions may conflict, so each plugin is assigned a weight. This ensures that a final scheduling decision can always be made. For example, Kubernetes scheduling has two stages: the predicate (filtering) stage and the priority (scoring) stage. In the first stage, nodes that fail to meet requirements are filtered out. In the second stage, the remaining nodes are scored. The NodeOrder algorithm scores all nodes in the second stage according to the weight of each algorithm plugin.
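
A weighted combination of scoring plugins can be sketched as below. The plugin functions and weights are invented for the example; this is not Volcano's NodeOrder code, only an illustration of how per-plugin weights produce a single final score for each node.

# Illustrative sketch: combine several scoring plugins, each scaled by
# its weight, into one score per node; the highest-scoring node wins.
def weighted_score(node, plugins):
    return sum(weight * score(node) for score, weight in plugins)

plugins = [
    (lambda n: 100 * n["utilization"], 2),       # binpack-like plugin, weight 2
    (lambda n: 100 if n["has_gpu"] else 0, 1),   # GPU-preference plugin, weight 1
]
nodes = {
    "node-a": {"utilization": 0.9, "has_gpu": False},  # total score 180
    "node-b": {"utilization": 0.6, "has_gpu": True},   # total score 220
}
print(max(nodes, key=lambda n: weighted_score(nodes[n], plugins)))  # node-b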

Volcano originated from kube-batch, a project initially intended to address issues with gang scheduling in Kubernetes. Later, as AI and big data services started demanding stronger and more versatile scheduling in Kubernetes, kube-batch was combined with various scenario-specific practices to provide enhanced scheduling capabilities. It was then renamed Volcano to reflect its power and a glowing future.

For more details about Volcano, visit https://volcano.sh/.


By Volcano Community Maintainer
Source: Cloud Native Computing Foundation


Related Topics
  • Algorithms
  • Cloud Native Computing Foundation
  • CNCF
  • Kubernetes
  • Volcano