Writing Your Own Scheduler With Kube-Scheduler-Simulator

Because the default Kubernetes scheduler is highly configurable, in many cases we don’t have to write any code to customize the
scheduling behavior. However, people who want to learn how the scheduler works and how it interacts with other components may want to develop their own scheduler.

In this article, I describe how to build a scheduler development environment with the help of kube-scheduler-simulator.

Strategy

  1. Use kube-scheduler-simulator, which provides an easy way to develop schedulers without preparing a real cluster
  2. Add a minimal scheduler implementation to kube-scheduler-simulator, because the default one is too flexible and thus complicated
    for beginners
  3. Modify and evaluate the scheduling algorithm

Set up kube-scheduler-simulator

First, let’s set up kube-scheduler-simulator and try it out. The procedure is straightforward.
Execute the following commands:

$ git clone https://github.com/kubernetes-sigs/kube-scheduler-simulator.git
$ cd kube-scheduler-simulator
$ git checkout 9de8c472f348b31437cce5ca2a34506f874cdddb
$ make docker_build_and_up

FYI,

  • I tested with the commit 9de8c472f348b31437cce5ca2a34506f874cdddb
  • If your PC is behind a proxy, you may need to unset http_proxy or similar environment variables
  • If you are working over ssh, set up tunnels for TCP ports 1212, 3000, and 3131

Then, open http://localhost:3000 and try adding some nodes and pods. The default behavior is intuitive: the pods are spread over the nodes
so that the load on each node stays balanced.
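
To make the idea of "spreading" concrete, here is a toy model in Go. It is only a sketch under a strong assumption: load is measured as the number of pods on a node, and the function name leastLoaded is made up for this illustration. The real default scheduler filters and scores nodes through many plugins, so this is not its actual algorithm.

```go
package main

import "fmt"

// leastLoaded returns the name of the node that currently holds the fewest pods.
// podCount maps a node name to the number of pods already placed on it.
// Ties go to the node that appears first in nodes; an empty list yields "".
// NOTE: hypothetical helper for illustration, not part of kube-scheduler.
func leastLoaded(nodes []string, podCount map[string]int) string {
	best := ""
	for i, n := range nodes {
		if i == 0 || podCount[n] < podCount[best] {
			best = n
		}
	}
	return best
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	podCount := map[string]int{}

	// Place six pods one by one, always on the least loaded node.
	for i := 0; i < 6; i++ {
		n := leastLoaded(nodes, podCount)
		podCount[n]++
	}
	fmt.Println(podCount) // map[node-a:2 node-b:2 node-c:2]
}
```

Placing each new pod on the emptiest node ends up with the same even distribution you can observe in the simulator's web UI.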

Add a minimal scheduler to kube-scheduler-simulator

I use “minisched”, developed by Kensei Nakada-san, who originally developed kube-scheduler-simulator, as the base implementation for the new scheduler we are going to develop.

minisched is part of mini-kube-scheduler, a demonstration system designed for educational purposes. mini-kube-scheduler is itself based on the kube-scheduler-simulator code, so you could develop your scheduler with mini-kube-scheduler alone,
but as mini-kube-scheduler seems not to have been updated for months, I decided to combine the two.

To use minisched from kube-scheduler-simulator, follow this procedure:

  1. Copy mini-kube-scheduler/minisched (from branch initial-random-scheduler) into kube-scheduler-simulator
  2. Modify kube-scheduler-simulator/scheduler/scheduler.go to use the minisched
    (see the patch attached below)
  3. Examine the behavior change (minisched binds pods and nodes randomly)

Patch license: Apache-2.0 (same as kube-scheduler-simulator)

diff --git a/scheduler/scheduler.go b/scheduler/scheduler.go
index a5d5ca2..8eb931d 100644
--- a/scheduler/scheduler.go
+++ b/scheduler/scheduler.go
@@ -3,6 +3,8 @@ package scheduler
 import (
     "context"

+    "github.com/kubernetes-sigs/kube-scheduler-simulator/minisched"
+
     "golang.org/x/xerrors"
     v1 "k8s.io/api/core/v1"
     clientset "k8s.io/client-go/kubernetes"
@@ -14,7 +16,6 @@ import (
     "k8s.io/kubernetes/pkg/scheduler/apis/config"
     "k8s.io/kubernetes/pkg/scheduler/apis/config/scheme"
     "k8s.io/kubernetes/pkg/scheduler/apis/config/v1beta2"
-    "k8s.io/kubernetes/pkg/scheduler/profile"

     simulatorschedconfig "github.com/kubernetes-sigs/kube-scheduler-simulator/scheduler/config"
     "github.com/kubernetes-sigs/kube-scheduler-simulator/scheduler/plugin"
@@ -59,7 +60,6 @@ func (s *Service) ResetScheduler() error {
 // StartScheduler starts scheduler.
 func (s *Service) StartScheduler(versionedcfg *v1beta2config.KubeSchedulerConfiguration) error {
     clientSet := s.clientset
-    restConfig := s.restclientCfg
     ctx, cancel := context.WithCancel(context.Background())

     informerFactory := scheduler.NewInformerFactory(clientSet, 0)
@@ -71,36 +71,10 @@ func (s *Service) StartScheduler(versionedcfg *v1beta2config.KubeSchedulerConfig

     s.currentSchedulerCfg = versionedcfg.DeepCopy()

-    cfg, err := convertConfigurationForSimulator(versionedcfg)
-    if err != nil {
-        cancel()
-        return xerrors.Errorf("convert scheduler config to apply: %w", err)
-    }
-
-    registry, err := plugin.NewRegistry(informerFactory, clientSet)
-    if err != nil {
-        cancel()
-        return xerrors.Errorf("plugin registry: %w", err)
-    }
-
-    sched, err := scheduler.New(
+    sched := minisched.New(
         clientSet,
         informerFactory,
-        profile.NewRecorderFactory(evtBroadcaster),
-        ctx.Done(),
-        scheduler.WithKubeConfig(restConfig),
-        scheduler.WithProfiles(cfg.Profiles...),
-        scheduler.WithPercentageOfNodesToScore(cfg.PercentageOfNodesToScore),
-        scheduler.WithPodMaxBackoffSeconds(cfg.PodMaxBackoffSeconds),
-        scheduler.WithPodInitialBackoffSeconds(cfg.PodInitialBackoffSeconds),
-        scheduler.WithExtenders(cfg.Extenders...),
-        scheduler.WithParallelism(cfg.Parallelism),
-        scheduler.WithFrameworkOutOfTreeRegistry(registry),
     )
-    if err != nil {
-        cancel()
-        return xerrors.Errorf("create scheduler: %w", err)
-    }

     informerFactory.Start(ctx.Done())
     informerFactory.WaitForCacheSync(ctx.Done())

With this patch applied, minisched binds each pod to a randomly chosen node.
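
The random choice itself can be sketched in isolation as follows. This is a minimal standalone illustration, not the minisched source: nodes are plain strings here, and pickRandom and errNoNodes are hypothetical names. It also handles the empty node list explicitly, since rand.Intn panics when its argument is 0.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errNoNodes is a hypothetical sentinel error used only in this sketch.
var errNoNodes = errors.New("no nodes available")

// pickRandom returns a uniformly random element of nodes.
// The empty case is checked first because rand.Intn(0) panics.
func pickRandom(nodes []string) (string, error) {
	if len(nodes) == 0 {
		return "", errNoNodes
	}
	return nodes[rand.Intn(len(nodes))], nil
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	counts := map[string]int{}

	// Simulate binding 3000 pods and count how many land on each node.
	for i := 0; i < 3000; i++ {
		n, err := pickRandom(nodes)
		if err != nil {
			fmt.Println("scheduling failed:", err)
			return
		}
		counts[n]++
	}
	// Each node should receive about a third of the pods;
	// the exact numbers vary from run to run.
	fmt.Println(counts)
}
```

With only a handful of pods the distribution can look quite uneven, which is what you are likely to see in the simulator after adding a few pods.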

Modify the algorithm

You can easily modify minisched by editing minisched/minisched.go#L37 like this:

Patch license: MIT (same as mini-kube-scheduler)

diff --git a/minisched/minisched.go b/minisched/minisched.go
index 82c0043..ae02597 100644
--- a/minisched/minisched.go
+++ b/minisched/minisched.go
@@ -34,7 +34,7 @@ func (sched *Scheduler) scheduleOne(ctx context.Context) {
        klog.Info("minischeduler: Got Nodes successfully")

        // select node randomly
-       selectedNode := nodes.Items[rand.Intn(len(nodes.Items))]
+       selectedNode := nodes.Items[0]

        if err := sched.Bind(ctx, pod, selectedNode.Name); err != nil {
                klog.Error(err)

Our modified scheduler binds every pod to the same node: the first one in the node list.
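
If you want to compare several selection strategies before wiring them into minisched, a small harness like the one below may help. It is a sketch under stated assumptions: the selector type, firstNode, roundRobin, and simulate are all hypothetical names introduced here, nodes are plain strings, and the node list is assumed to be non-empty.

```go
package main

import "fmt"

// selector is a hypothetical function type: given the node names,
// it returns the node that the next pod should be bound to.
type selector func(nodes []string) string

// firstNode mirrors the one-line change in the patch above:
// it always picks the first node. It assumes nodes is non-empty.
func firstNode(nodes []string) string {
	return nodes[0]
}

// roundRobin returns a selector that cycles through the nodes in order.
// It also assumes nodes is non-empty.
func roundRobin() selector {
	next := 0
	return func(nodes []string) string {
		n := nodes[next%len(nodes)]
		next++
		return n
	}
}

// simulate "schedules" podCount pods with sel and returns pods per node.
func simulate(sel selector, nodes []string, podCount int) map[string]int {
	result := map[string]int{}
	for i := 0; i < podCount; i++ {
		result[sel(nodes)]++
	}
	return result
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	fmt.Println(simulate(firstNode, nodes, 6))    // map[node-a:6]
	fmt.Println(simulate(roundRobin(), nodes, 6)) // map[node-a:2 node-b:2 node-c:2]
}
```

Keeping the selection logic behind a single function makes it easy to swap algorithms and compare the resulting pod distribution, which corresponds to step 3 of the strategy.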

Conclusion

References

Guest post originally published on the Miraxia blog
Source: CNCF Blog
