TiDB Scheduler
TiDB Scheduler is a TiDB implementation ofKubernetes scheduler extender.TiDB Scheduler is used to add new scheduling rules to Kubernetes. This document introduces these new scheduling rules and how TiDB Scheduler works.
tidb-scheduler and default-scheduler
Akube-scheduleris deployed by default in the Kubernetes cluster for Pod scheduling. The default scheduler name isdefault-scheduler
.
In the early Kubernetes versions (< v1.16), thedefault-scheduler
没有足够灵活以支持甚至scheduling for Pods. Therefore, to support even scheduling for the TiDB cluster Pods, TiDB Operator uses a TiDB Scheduler (tidb-scheduler
) to extend the scheduling rules of thedefault-scheduler
.
Starting from Kubernetes v1.16, thedefault-scheduler
has introduced theEvenPodsSpread
feature.This feature controls how Pods are spread across your Kubernetes cluster among failure-domains. It is in the beta phase in v1.18, and became generally available in v1.19.
Therefore, if the Kubernetes cluster meets one of the following conditions, you can usedefault-scheduler
directly instead oftidb-scheduler
.You need to configuretopologySpreadConstraints
to make Pods evenly spread in different topologies.
- The Kubernetes version is v1.18.x andthe
EvenPodsSpread
feature gateis enabled. - The Kubernetes version >= v1.19.
TiDB cluster scheduling requirements
A TiDB cluster includes three key components: PD, TiKV, and TiDB. Each consists of multiple nodes: PD is a Raft cluster, and TiKV is a multi-Raft group cluster. PD and TiKV components are stateful. If theEvenPodsSpread
feature gate is not enabled in the Kubernetes cluster, the default scheduling rules of the Kubernetes scheduler cannot meet the high availability scheduling requirements of the TiDB cluster, so the Kubernetes scheduling rules need to be extended.
Currently, pods can be scheduled according to specific dimensions by modifyingmetadata.annotations
in TidbCluster, such as:
metadata:
annotations:
m.rzhenli.com/ha-topology-key:
kubernetes.io/hostname
The configuration above indicates scheduling by the node dimension (default). If you want to schedule pods by other dimensions, such asm.rzhenli.com/ha-topology-key: zone
, which means scheduling by zone, each node should also be labeled as follows:
kubectl label nodes node1 zone=zone1
Different nodes may have different labels or the same label, and if a node is not labeled, the scheduler will not schedule any pod to that node.
TiDB Scheduler implements the following customized scheduling rules. The following example is based on node scheduling, scheduling rules based on other dimensions are the same.
PD component
Scheduling rule 1: Make sure that the number of PD instances scheduled on each node is less thanReplicas / 2
.例如:
PD cluster size (Replicas) | Maximum number of PD instances that can be scheduled on each node |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 1 |
5 | 2 |
... |
TiKV component
Scheduling rule 2: If the number of Kubernetes nodes is less than three (in this case, TiKV cannot achieve high availability), scheduling is not limited; otherwise, the number of TiKV instances that can be scheduled on each node is no more thanceil(Replicas / 3)
.例如:
TiKV cluster size (Replicas) | Maximum number of TiKV instances that can be scheduled on each node | Best scheduling distribution |
---|---|---|
3 | 1 | 1,1,1 |
4 | 2 | 1,1,2 |
5 | 2 | 1,2,2 |
6 | 2 | 2,2,2 |
7 | 3 | 2,2,3 |
8 | 3 | 2,3,3 |
... |
How TiDB Scheduler works
TiDB Scheduler adds customized scheduling rules by implementing KubernetesScheduler extender.
The TiDB Scheduler component is deployed as one or more Pods, though only one Pod is working at the same time. Each Pod has two Containers inside: one Container is a nativekube-scheduler
, and the other is atidb-scheduler
implemented as a Kubernetes scheduler extender.
If you configure the cluster to usetidb-scheduler
in theTidbCluster
CR, the.spec.schedulerName
attribute of PD, TiDB, and TiKV Pods created by TiDB Operator is set totidb-scheduler
.This means that the TiDB Scheduler is used for the scheduling.
The scheduling process of a Pod is as follows:
- First,
kube-scheduler
pulls all Pods whose.spec.schedulerName
istidb-scheduler
.And Each Pod is filtered using the default Kubernetes scheduling rules. - 然后,
kube-scheduler
sends a request to thetidb-scheduler
服务。然后tidb-scheduler
filters the sent nodes through the customized scheduling rules (as mentioned above), and returns schedulable nodes tokube-scheduler
. - Finally,
kube-scheduler
determines the nodes to be scheduled.
If a Pod cannot be scheduled, see thetroubleshooting documentto diagnose and solve the issue.