CNCF Volcano 1.6.0 is now available with new features such as elastic job management, dynamic scheduling and rescheduling based on actual resource utilization, and an MPI job plugin.

Volcano v1.6.0 is now available

Volcano is the first cloud native batch computing project in CNCF. It was open sourced at KubeCon Shanghai in June 2019 and accepted as a CNCF project in April 2020. In April 2022, Volcano was promoted to a CNCF incubating project. To date, more than 400 developers worldwide have committed code to the project, and the community continues to grow in popularity among developers, partners, and users.

Key Features

Scheduling for elastic job

This feature, working with Volcano Jobs or PyTorch Jobs, accelerates AI training and big data analytics and reduces costs by using spot instances on the cloud.

The number of replicas allowed for an elastic job falls within [min, max]. min corresponds to minAvailable of the job, and max indicates the number of replicas of the job. The elastic scheduling module preferentially allocates resources to the minAvailable pods to ensure that their minimum resource requests are met.

Resources, when idle, will be allocated by the scheduler to elastic pods to accelerate computing. However, when the cluster is resource-starved, the scheduler preferentially preempts the resources of elastic pods, which triggers scale-in. The scheduler also balances resource allocation based on priorities. For example, a high-priority job can preempt resources of an elastic pod of a low-priority job.
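
The two-phase behavior described above can be sketched in a few lines of Python. This is a simplified illustration rather than Volcano's scheduler code: the `ElasticJob` class, the `allocate` function, and the assumption that every pod consumes exactly one resource unit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ElasticJob:
    name: str
    priority: int       # higher value wins
    min_available: int  # "min": replicas that must run
    replicas: int       # "max": upper bound on replicas


def allocate(jobs, capacity):
    """Return {job name: pods granted}, assuming each pod needs one resource unit."""
    ordered = sorted(jobs, key=lambda j: j.priority, reverse=True)
    granted = {}

    # Phase 1: reserve the minAvailable pods of each job, all or nothing.
    for job in ordered:
        if job.min_available <= capacity:
            granted[job.name] = job.min_available
            capacity -= job.min_available

    # Phase 2: hand leftover (idle) capacity to elastic pods by priority.
    for job in ordered:
        if job.name not in granted:
            continue  # the job could not be admitted at all
        extra = min(job.replicas - job.min_available, capacity)
        granted[job.name] += extra
        capacity -= extra

    return granted


jobs = [ElasticJob("train", 10, 2, 5), ElasticJob("etl", 1, 2, 4)]
print(allocate(jobs, 10))  # {'train': 5, 'etl': 4}
print(allocate(jobs, 5))   # {'train': 3, 'etl': 2}
```

When capacity drops from 10 to 5 units, both jobs keep their minAvailable pods and only elastic pods are given up, with the low-priority job losing its elastic pods first.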

Documentation: https://github.com/volcano-sh/volcano/blob/master/docs/design/elastic-scheduler.md

Issue: https://github.com/volcano-sh/volcano/issues/1876

Dynamic scheduling

The current scheduling mechanism, which is based on resource requests and allocations, may cause unbalanced node resource utilization. For example, a pod may be scheduled to a node whose resource usage is already extremely high and cause a node exception, while other nodes in the cluster are lightly used.

In version 1.6.0, Volcano works with Prometheus to make scheduling decisions. Prometheus collects data about the resource usage of cluster nodes, and Volcano uses this data to balance node resource usage as much as possible. You can also configure CPU and memory usage thresholds for nodes, so that nodes above a threshold are filtered out during scheduling. This prevents node exceptions caused by pods using too many resources.

Example scheduling policy:

actions: "enqueue, allocate, backfill"
tiers:
  - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: usage  # usage-based scheduling plugin
        arguments:
          thresholds:
            CPUUsageAvg.5m: 90 # Nodes whose average CPU usage over 5 minutes is higher than 90% are filtered out in the predicate stage
            MEMUsageAvg.5m: 80 # Nodes whose average memory usage over 5 minutes is higher than 80% are filtered out in the predicate stage
  - plugins:
      - name: overcommit
      - name: drf
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack
metrics:                             # Metrics Server-related configuration
  address: http://192.168.0.10:9090  # (mandatory) Prometheus server address
  interval: 30s                      # (optional) The scheduler pulls metrics from Prometheus at this interval. 5s by default.
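
To make the effect of the thresholds concrete, here is a minimal Python sketch of the filtering step. It is not the plugin's implementation: the `filter_nodes` function and the sample usage figures are invented for illustration, and only the metric names and limits come from the policy above.

```python
def filter_nodes(usage, thresholds):
    """Keep nodes whose average usage does not exceed any configured threshold."""
    return [
        node
        for node, metrics in usage.items()
        if all(metrics[name] <= limit for name, limit in thresholds.items())
    ]


thresholds = {"CPUUsageAvg.5m": 90, "MEMUsageAvg.5m": 80}

# Hypothetical 5-minute averages (percent), as they might be pulled from Prometheus.
usage = {
    "node-a": {"CPUUsageAvg.5m": 95, "MEMUsageAvg.5m": 40},  # CPU too high
    "node-b": {"CPUUsageAvg.5m": 50, "MEMUsageAvg.5m": 85},  # memory too high
    "node-c": {"CPUUsageAvg.5m": 60, "MEMUsageAvg.5m": 70},
}

print(filter_nodes(usage, thresholds))  # ['node-c']
```

Only `node-c` remains a candidate, because the other two nodes exceed one of the thresholds.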

Documentation: https://github.com/volcano-sh/volcano/blob/master/docs/design/usage-based-scheduling.md

Issue: https://github.com/volcano-sh/volcano/issues/1777

Rescheduling

Improper scheduling policies and dynamic job lifecycles lead to unbalanced node resource utilization. In version 1.6.0, Volcano allows you to add rescheduling policies based on actual resource utilization or custom metrics. Volcano periodically checks the resource utilization of all nodes and evicts pods from high-load nodes so that they can be rescheduled to low-load nodes.

Rescheduling further balances the load across nodes and improves cluster resource utilization.

## Configuration Option
actions: "enqueue, allocate, backfill, shuffle"  ## Add 'shuffle' at the end of the actions
tiers:
  - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: rescheduling       ## Rescheduling plugin
        arguments:
          interval: 5m           ## (optional) The strategies are invoked periodically at this interval. 5 minutes by default.
          strategies:            ## (mandatory) The strategies work in order.
            - name: offlineOnly
            - name: lowPriorityFirst
            - name: lowNodeUtilization
              params:
                thresholds:
                  "cpu": 20
                  "memory": 20
                  "pods": 20
                targetThresholds:
                  "cpu": 50
                  "memory": 50
                  "pods": 50
          queueSelector:         ## (optional) Select workloads in specified queues as potential evictees. All queues by default.
            - default
            - test-queue
          labelSelector:         ## (optional) Select workloads with specified labels as potential evictees. All labels by default.
            business: offline
            team: test
  - plugins:
      - name: overcommit
      - name: drf
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack
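
The post does not spell out how `thresholds` and `targetThresholds` are interpreted. The Python sketch below assumes the same semantics as the LowNodeUtilization strategy of the Kubernetes descheduler, which uses identical parameter names: a node is underutilized when it is below `thresholds` for every resource, and overutilized when it is above `targetThresholds` for any resource. Treat that mapping, and the hypothetical `classify` helper, as assumptions.

```python
def classify(node_usage, thresholds, target_thresholds):
    """Classify a node by its usage percentages (assumed descheduler-style semantics)."""
    if all(node_usage[res] < limit for res, limit in thresholds.items()):
        return "underutilized"  # candidate to receive evicted pods
    if any(node_usage[res] > limit for res, limit in target_thresholds.items()):
        return "overutilized"   # candidate to have pods evicted
    return "balanced"


thresholds = {"cpu": 20, "memory": 20, "pods": 20}
target_thresholds = {"cpu": 50, "memory": 50, "pods": 50}

print(classify({"cpu": 10, "memory": 15, "pods": 5}, thresholds, target_thresholds))   # underutilized
print(classify({"cpu": 70, "memory": 30, "pods": 30}, thresholds, target_thresholds))  # overutilized
print(classify({"cpu": 30, "memory": 30, "pods": 30}, thresholds, target_thresholds))  # balanced
```

Under these assumptions, pods would be moved away from nodes in the third band and toward nodes in the first, while nodes in between are left alone.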

Documentation: https://github.com/volcano-sh/volcano/blob/master/docs/design/rescheduling.md

Issue: https://github.com/volcano-sh/volcano/issues/1777

MPI plugin

You can use Volcano Jobs to run MPI jobs. Volcano Job built-in plugins such as svc, env, and ssh automatically configure passwordless communication and environment variable injection for the masters and workers of MPI jobs.

The new version of Volcano makes running MPI jobs even easier by providing the MPI plugin. You no longer need to worry about shell syntax, communication between masters and workers, or manual SSH authentication. You can start an MPI job with a simple, concise configuration.

Example configuration:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: lm-mpi-job
spec:
  minAvailable: 1
  schedulerName: volcano
  plugins:
    mpi: ["--master=mpimaster","--worker=mpiworker","--port=22"]  ## MPI plugin register
  tasks:
    - replicas: 1
      name: mpimaster
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  mkdir -p /var/run/sshd; /usr/sbin/sshd;
                  mpiexec --allow-run-as-root --host ${MPI_HOST} -np 2 mpi_hello_world;
              image: volcanosh/example-mpi:0.0.1
              name: mpimaster
              workingDir: /home
          restartPolicy: OnFailure
    - replicas: 2
      name: mpiworker
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  mkdir -p /var/run/sshd; /usr/sbin/sshd -D;
              image: volcanosh/example-mpi:0.0.1
              name: mpiworker
              workingDir: /home
          restartPolicy: OnFailure

Documentation: https://github.com/volcano-sh/volcano/blob/master/docs/design/distributed-framework-plugins.md

Issue: https://github.com/volcano-sh/volcano/pull/2194

Links:

Release note: https://github.com/volcano-sh/volcano/releases/tag/v1.6.0

Branch: https://github.com/volcano-sh/volcano/tree/release-1.6

About Volcano

Website: https://volcano.sh

GitHub: https://github.com/volcano-sh/volcano

Volcano is designed for high-performance batch computing workloads such as AI, big data, gene sequencing, and rendering. The project has more than 2,400 stars and 550 forks on GitHub, and 26,000 developers around the world have joined the community. Contributing enterprises include Huawei, AWS, Baidu, Tencent, JD.com, and Xiaohongshu.

Volcano supports mainstream computing frameworks, including Spark, Flink, TensorFlow, PyTorch, Argo, MindSpore, PaddlePaddle, Kubeflow, MPI, Horovod, MXNet, and KubeGene. A comprehensive, robust ecosystem has been developed.

Project post by Volcano project maintainers
Source CNCF
