Background
In AI, big data, and other multi-task collaboration scenarios, scheduling has an "All-or-Nothing" requirement: all tasks in a group must be scheduled at the same time. CoScheduling is an open-source solution that schedules a group of Pods (a PodGroup) in a Kubernetes cluster simultaneously, in an all-or-nothing fashion. This document explains how to install CoScheduling for batch scheduling on TKE.
Prerequisites
A TKE cluster has been created.
Helm has been installed, and the TKE cluster's kubeconfig has been configured with permissions to operate the cluster. For details, please refer to Connect to the Cluster.
Using Helm for Installation
Installing CoScheduling as a Second Scheduler
Run the following commands to install CoScheduling as a second scheduler. Pods that should be co-scheduled must then set spec.schedulerName to scheduler-plugins-scheduler:
$ git clone git@github.com:kubernetes-sigs/scheduler-plugins.git
$ cd scheduler-plugins/manifests/install/charts
$ helm install scheduler-plugins as-a-second-scheduler/ --create-namespace --namespace scheduler-plugins
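After the chart is installed, a quick sanity check is to confirm that the PodGroup CRD has been registered. The CRD name below (podgroups.scheduling.x-k8s.io) is what the upstream chart ships at the time of writing; verify it against your chart version:
$ kubectl get crd podgroups.scheduling.x-k8s.io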
Verifying the Installation
Run the following command to check that the component's Deployments are up and running.
$ kubectl get deploy -n scheduler-plugins
Expected output:
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
scheduler-plugins-controller   1/1     1            1           7s
scheduler-plugins-scheduler    1/1     1            1           7s
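If the Deployments do not become available, inspecting the Pods in the same namespace usually reveals the cause:
$ kubectl get pods -n scheduler-plugins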
How to Use
PodGroup
PodGroup is a custom resource provided by the CoScheduling component. It defines the minimum number of Pods that must be scheduled simultaneously. By setting a label, you can indicate which PodGroup a Pod belongs to. Below is a typical PodGroup example:
apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: nginx
spec:
  scheduleTimeoutSeconds: 10
  minMember: 3
A Pod declares its membership in this PodGroup through the following label:
labels:
  scheduling.x-k8s.io/pod-group: nginx
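Putting the two requirements together, a minimal Pod that joins the group might look like the sketch below. The Pod name and image are placeholders; note the schedulerName from the installation step:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-member
  labels:
    scheduling.x-k8s.io/pod-group: nginx
spec:
  schedulerName: scheduler-plugins-scheduler   # hand this Pod to the co-scheduling scheduler
  containers:
  - name: nginx
    image: nginx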
The scheduler calculates the sum of Running Pods and Waiting Pods (assumed but not yet bound) in the same PodGroup. If the sum is greater than or equal to minMember, the pending Pods are permitted to be scheduled. Pods with different priorities within the same PodGroup may cause unexpected behavior, so it is essential to ensure that all Pods within the same PodGroup have the same priority.
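One way to keep priorities uniform is to set the same priorityClassName in every Pod template of the group. The class name high-priority below is hypothetical and must exist as a PriorityClass in your cluster:
spec:
  priorityClassName: high-priority   # hypothetical; use one identical PriorityClass for every Pod in the group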
Sample
Assume we have a cluster that can accommodate only 3 nginx pods. We create a ReplicaSet with replicas=6 and set the value of minMember to 3.
apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: nginx
spec:
  scheduleTimeoutSeconds: 10
  minMember: 3
---
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
        scheduling.x-k8s.io/pod-group: nginx
    spec:
      schedulerName: scheduler-plugins-scheduler   # required so these Pods are handled by the co-scheduling scheduler
      containers:
      - name: nginx
        image: nginx
        resources:
          limits:
            cpu: 3000m
            memory: 500Mi
          requests:
            cpu: 3000m
            memory: 500Mi
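Save both manifests to a file and apply them; the file name coscheduling-sample.yaml below is arbitrary:
$ kubectl apply -f coscheduling-sample.yaml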
Because the cluster only has room for three of these Pods, exactly three are scheduled together while the rest stay Pending:
$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
nginx-4jw2m   0/1     Pending   0          55s
nginx-4mn52   1/1     Running   0          55s
nginx-c9gv8   1/1     Running   0          55s
nginx-frm24   0/1     Pending   0          55s
nginx-hsflk   0/1     Pending   0          55s
nginx-qtj5f   1/1     Running   0          55s
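You can also inspect the PodGroup itself; its status block records how many group members are running (the exact status fields may vary between component versions):
$ kubectl get podgroup nginx -o yaml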
If you now change minMember to 4, all nginx Pods will go to a Pending state, because the cluster can only accommodate three Pods and the PodGroup's minMember requirement of 4 can never be met.
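One way to change the field in place is a JSON merge patch against the existing PodGroup; this assumes the PodGroup from the sample above is still named nginx:
$ kubectl patch podgroup nginx --type merge -p '{"spec":{"minMember":4}}'
Once the Pods are recreated, all of them remain Pending: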
$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
nginx-4vqrk   0/1     Pending   0          3s
nginx-bw9nn   0/1     Pending   0          3s
nginx-gnjsv   0/1     Pending   0          3s
nginx-hqhhz   0/1     Pending   0          3s
nginx-n47r7   0/1     Pending   0          3s
nginx-n7vtq   0/1     Pending   0          3s