Deploy TiDB on Azure AKS
This document describes how to deploy a TiDB cluster on Azure Kubernetes Service (AKS).
To deploy TiDB Operator and the TiDB cluster in a self-managed Kubernetes environment, refer to Deploy TiDB Operator and Deploy TiDB on General Kubernetes.
Prerequisites
Before deploying a TiDB cluster on Azure AKS, perform the following operations:
Install Helm 3 for deploying TiDB Operator.
Deploy a Kubernetes (AKS) cluster and install and configure az cli.
Refer to use Ultra disks to create a new cluster that can use Ultra disks or enable Ultra disks in an existing cluster.
Acquire AKS service permissions.
If the Kubernetes version of the cluster is earlier than 1.21, install the aks-preview CLI extension for using Ultra disks and register EnableAzureDiskFileCSIDriver in your subscription.
Install the aks-preview CLI extension:
az extension add --name aks-preview
Register EnableAzureDiskFileCSIDriver:
az feature register --name EnableAzureDiskFileCSIDriver --namespace Microsoft.ContainerService --subscription ${your-subscription-id}
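Feature registration runs asynchronously, so it can help to confirm that the flag has reached the Registered state before creating the cluster. The following is a minimal sketch, assuming the az CLI is installed and logged in. The wait_for_feature helper, its polling interval, and its attempt limit are illustrative and not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: poll the feature flag until it reports "Registered".
# Arguments: feature name, seconds between polls (default 30),
# maximum number of polls (default 40). All defaults are illustrative.
wait_for_feature() {
  feature="$1"
  interval="${2:-30}"
  attempts="${3:-40}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(az feature show \
      --namespace Microsoft.ContainerService \
      --name "$feature" \
      --query properties.state -o tsv)
    echo "state of $feature: $state"
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage (uncomment to run against your subscription):
# wait_for_feature EnableAzureDiskFileCSIDriver
# az provider register --namespace Microsoft.ContainerService
```

After the flag is registered, refreshing the Microsoft.ContainerService provider registration (the last commented line) propagates the change to the subscription.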
Create an AKS cluster and a node pool
Most of the TiDB cluster components use Azure disk as storage. According to AKS Best Practices, when creating an AKS cluster, it is recommended to ensure that each node pool uses one availability zone (at least 3 in total).
Create an AKS cluster with CSI enabled
To create an AKS cluster with CSI enabled, run the following command:
# create AKS cluster
az aks create \
    --resource-group ${resourceGroup} \
    --name ${clusterName} \
    --location ${location} \
    --generate-ssh-keys \
    --vm-set-type VirtualMachineScaleSets \
    --load-balancer-sku standard \
    --node-count 3 \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true
Create component node pools
After creating an AKS cluster, run the following commands to create component node pools. Each node pool may take two to five minutes to create. It is recommended to enable Ultra disks in the TiKV node pool. For more details about cluster configuration, refer to az aks documentation and az aks nodepool documentation.
To create a TiDB Operator and monitor pool:
az aks nodepool add --name admin \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 1 \
    --labels dedicated=admin
Create a PD node pool with nodeType being Standard_F4s_v2 or higher:
az aks nodepool add --name pd \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 3 \
    --labels dedicated=pd \
    --node-taints dedicated=pd:NoSchedule
Create a TiDB node pool with nodeType being Standard_F8s_v2 or higher. You can set --node-count to 2 because only two TiDB nodes are required by default. You can also scale out this node pool by modifying this parameter at any time if necessary.
az aks nodepool add --name tidb \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 2 \
    --labels dedicated=tidb \
    --node-taints dedicated=tidb:NoSchedule
Create a TiKV node pool with nodeType being Standard_E8s_v4 or higher:
az aks nodepool add --name tikv \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 3 \
    --labels dedicated=tikv \
    --node-taints dedicated=tikv:NoSchedule \
    --enable-ultra-ssd
Deploy component node pools in availability zones
The Azure AKS cluster deploys nodes across multiple zones using "best effort zone balance". If you want to apply "strict zone balance" (not supported in AKS now), you can deploy one node pool in one zone. For example:
Create TiKV node pool 1 in zone 1:
az aks nodepool add --name tikv1 \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 1 \
    --labels dedicated=tikv \
    --node-taints dedicated=tikv:NoSchedule \
    --enable-ultra-ssd
Create TiKV node pool 2 in zone 2:
az aks nodepool add --name tikv2 \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 2 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 1 \
    --labels dedicated=tikv \
    --node-taints dedicated=tikv:NoSchedule \
    --enable-ultra-ssd
Create TiKV node pool 3 in zone 3:
az aks nodepool add --name tikv3 \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 1 \
    --labels dedicated=tikv \
    --node-taints dedicated=tikv:NoSchedule \
    --enable-ultra-ssd
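Because the three per-zone commands differ only in the pool name and the zone number, they can be generated in a loop. The sketch below prints the commands for review instead of executing them. The function names tikv_pool_cmd and print_tikv_pool_cmds are hypothetical, and clusterName, resourceGroup, and nodeType are assumed to be set as shell variables with the same values as the placeholders above.

```shell
#!/bin/sh
# Print (do not execute) the command that creates the TiKV node pool for one zone.
# Reads clusterName, resourceGroup, and nodeType from the environment.
tikv_pool_cmd() {
  zone="$1"
  printf '%s\n' "az aks nodepool add --name tikv${zone} --cluster-name ${clusterName} --resource-group ${resourceGroup} --node-vm-size ${nodeType} --zones ${zone} --aks-custom-headers EnableAzureDiskFileCSIDriver=true --node-count 1 --labels dedicated=tikv --node-taints dedicated=tikv:NoSchedule --enable-ultra-ssd"
}

# Print one command per availability zone (zones 1, 2, and 3).
print_tikv_pool_cmds() {
  for zone in 1 2 3; do
    tikv_pool_cmd "$zone"
  done
}

# Usage:
# print_tikv_pool_cmds        # review the generated commands
# print_tikv_pool_cmds | sh   # execute them once they look correct
```

Printing first keeps the step reviewable, since each pool takes several minutes to create and a wrong VM size or zone is easier to catch before execution.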
Configure StorageClass
To improve disk IO performance, it is recommended to add mountOptions in StorageClass to configure nodelalloc and noatime. Refer to Mount the data disk ext4 filesystem with options on the target machines that deploy TiKV.
kind: StorageClass
apiVersion: storage.k8s.io/v1
# ...
mountOptions:
- nodelalloc,noatime
Deploy TiDB Operator
Deploy TiDB Operator in the AKS cluster by referring to the Deploy TiDB Operator section.
Deploy a TiDB cluster and the monitoring component
This section describes how to deploy a TiDB cluster and its monitoring component on Azure AKS.
Create namespace
To create a namespace to deploy the TiDB cluster, run the following command:
kubectl create namespace tidb-cluster
Deploy
First, download the sample TidbCluster and TidbMonitor configuration files:
curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.0/examples/aks/tidb-cluster.yaml && \
curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.0/examples/aks/tidb-monitor.yaml && \
curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.0/examples/aks/tidb-dashboard.yaml
Refer to configure the TiDB cluster to further customize and configure the CR before applying.
To deploy the TidbCluster and TidbMonitor CR in the AKS cluster, run the following command:
kubectl apply -f tidb-cluster.yaml -n tidb-cluster && \
kubectl apply -f tidb-monitor.yaml -n tidb-cluster
After the YAML files above are applied to the Kubernetes cluster, TiDB Operator creates the desired TiDB cluster and its monitoring component according to these files.
View the cluster status
To view the status of the TiDB cluster, run the following command:
kubectl get pods -n tidb-cluster
When all the pods are in the Running or Ready state, the TiDB cluster is successfully started. For example:
NAME                              READY   STATUS    RESTARTS   AGE
tidb-discovery-5cb8474d89-n8cxk   1/1     Running   0          47h
tidb-monitor-6fbcc68669-dsjlc     3/3     Running   0          47h
tidb-pd-0                         1/1     Running   0          47h
tidb-pd-1                         1/1     Running   0          46h
tidb-pd-2                         1/1     Running   0          46h
tidb-tidb-0                       2/2     Running   0          47h
tidb-tidb-1                       2/2     Running   0          46h
tidb-tikv-0                       1/1     Running   0          47h
tidb-tikv-1                       1/1     Running   0          47h
tidb-tikv-2                       1/1     Running   0          47h
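The readiness check described above can also be scripted. The following sketch reads `kubectl get pods` output on standard input and succeeds only when every pod is in the Running state with all of its containers ready. The all_pods_ready helper is hypothetical and assumes the default column layout shown in the example output (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper: exit 0 only if every pod listed on stdin is Running
# and its READY column shows all containers ready (for example, 2/2).
all_pods_ready() {
  awk '
    BEGIN { bad = 0 }
    $1 == "NAME" { next }                 # skip the header row
    NF >= 3 {
      split($2, ready, "/")               # READY column, e.g. "2/2"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print "not ready: " $1
        bad = 1
      }
    }
    END { exit bad }
  '
}

# Usage:
# kubectl get pods -n tidb-cluster | all_pods_ready && echo "cluster is up"
```

The helper prints the name of each pod that is not yet ready, which makes it easy to see what the cluster is still waiting for.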
Access the database
After deploying a TiDB cluster, you can access the TiDB database to test or develop applications.
Access method
- Access via Bastion
The LoadBalancer created for your TiDB cluster resides in an intranet. You can create a Bastion in the cluster virtual network to connect to an internal host and then access the database.
- Access via SSH
You can create the SSH connection to a Linux node to access the database.
- Access via node-shell
You can simply use tools like node-shell to connect to nodes in the cluster, then access the database.
Access via the MySQL client
After connecting to the internal host via SSH, you can access the TiDB cluster through the MySQL client.
Install the MySQL client on the host:
sudo yum install mysql -y
Connect the client to the TiDB cluster:
mysql --comments -h ${tidb-lb-ip} -P 4000 -u root
${tidb-lb-ip} is the LoadBalancer IP address of the TiDB service. To obtain it, run the kubectl get svc basic-tidb -n tidb-cluster command. The EXTERNAL-IP field returned is the IP address.
For example:
$ mysql --comments -h 20.240.0.7 -P 4000 -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 1189
Server version: 5.7.25-TiDB-v7.1.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatible

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show status;
+--------------------+--------------------------------------+
| Variable_name      | Value                                |
+--------------------+--------------------------------------+
| Ssl_cipher         |                                      |
| Ssl_cipher_list    |                                      |
| Ssl_verify_mode    | 0                                    |
| Ssl_version        |                                      |
| ddl_schema_version | 22                                   |
| server_id          | ed4ba88b-436a-424d-9087-977e897cf5ec |
+--------------------+--------------------------------------+
6 rows in set (0.00 sec)
Access the Grafana monitoring dashboard
Obtain the LoadBalancer IP address of Grafana:
kubectl -n tidb-cluster get svc basic-grafana
For example:
kubectl get svc basic-grafana
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
basic-grafana   LoadBalancer   10.100.199.42   20.240.0.8    3000:30761/TCP   121m
In the output above, the EXTERNAL-IP column is the LoadBalancer IP address.
You can access the ${grafana-lb}:3000 address using your web browser to view monitoring metrics. Replace ${grafana-lb} with the LoadBalancer IP address.
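Reading the EXTERNAL-IP column and building the Grafana address can be scripted as follows. This is a sketch under stated assumptions: the external_ip and grafana_url helpers are hypothetical, the input is the default `kubectl get svc` table for a single service, and the http scheme is an assumption because the text above gives only the host and port.

```shell
#!/bin/sh
# Hypothetical helper: print the EXTERNAL-IP column (4th field) of
# `kubectl get svc` output read from stdin, skipping the header row.
external_ip() {
  awk '$1 != "NAME" && NF >= 4 { print $4 }'
}

# Hypothetical helper: build the Grafana address from the same input.
# Fails while the load balancer address is still empty or "<pending>".
grafana_url() {
  ip=$(external_ip)
  if [ -z "$ip" ] || [ "$ip" = "<pending>" ]; then
    return 1
  fi
  echo "http://${ip}:3000"   # scheme is an assumption; port 3000 is from the text
}

# Usage:
# kubectl -n tidb-cluster get svc basic-grafana | grafana_url
```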
Access TiDB Dashboard
See Access TiDB Dashboard for instructions about how to securely allow access to TiDB Dashboard.
Upgrade
To upgrade the TiDB cluster, execute the following command:
kubectl patch tc basic -n tidb-cluster --type merge -p '{"spec":{"version":"${version}"}}'
The upgrade process does not finish immediately. You can view the upgrade progress by running the kubectl get pods -n tidb-cluster --watch command.
Scale out
Before scaling out the cluster, you need to scale out the corresponding node pool so that the new instances have enough resources for operation.
This section describes how to scale out the AKS node pool and TiDB components.
Scale out AKS node pool
When scaling out TiKV, the node pools must be scaled out evenly among availability zones. The following example shows how to scale out the TiKV node pool of the ${clusterName} cluster to 6 nodes:
az aks nodepool scale \
    --resource-group ${resourceGroup} \
    --cluster-name ${clusterName} \
    --name ${nodePoolName} \
    --node-count 6
For more information on node pool management, refer to az aks nodepool.
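If you use one TiKV node pool per zone (tikv1, tikv2, and tikv3 in the earlier example), scaling evenly means giving each pool the same node count. The sketch below computes that count and rejects totals that cannot be split evenly. The per_zone_count helper is hypothetical and only illustrates the arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: print how many nodes each zone-specific pool needs
# so that the total is spread evenly. Arguments: total nodes, zones (default 3).
per_zone_count() {
  total="$1"
  zones="${2:-3}"
  if [ $((total % zones)) -ne 0 ]; then
    echo "error: $total nodes cannot be spread evenly over $zones zones" >&2
    return 1
  fi
  echo $((total / zones))
}

# Usage with the per-zone pools tikv1, tikv2, and tikv3:
# count=$(per_zone_count 6 3) || exit 1
# for zone in 1 2 3; do
#   az aks nodepool scale \
#       --resource-group ${resourceGroup} \
#       --cluster-name ${clusterName} \
#       --name "tikv${zone}" \
#       --node-count "$count"
# done
```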
Scale out TiDB components
After scaling out the AKS node pool, run the kubectl edit tc basic -n tidb-cluster command and set replicas of each component to the desired value. The scaling-out process then completes.
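As a non-interactive alternative to kubectl edit, the replica count can be changed with the same kubectl patch form used for upgrades. The following sketch builds the merge patch body. The replicas_patch helper is hypothetical, and it assumes the count lives at spec.<component>.replicas, as in the sample TidbCluster configuration.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON merge patch that sets the replica count
# of one TidbCluster component. Arguments: component name, replica count.
replicas_patch() {
  component="$1"
  replicas="$2"
  case "$replicas" in
    ''|*[!0-9]*)
      echo "error: replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"%s":{"replicas":%s}}}\n' "$component" "$replicas"
}

# Usage:
# kubectl patch tc basic -n tidb-cluster --type merge -p "$(replicas_patch tikv 6)"
```

A patch changes only the named field, which avoids accidental edits to the rest of the CR during routine scaling.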
Deploy TiFlash/TiCDC
TiFlash is the columnar storage extension of TiKV.
TiCDC is a tool for replicating the incremental data of TiDB by pulling TiKV change logs.
The two components are not required in the deployment. This section shows a quick start example.
Add node pools
Add a node pool for TiFlash/TiCDC respectively. You can set --node-count as required.
Create a TiFlash node pool with nodeType being Standard_E8s_v4 or higher:
az aks nodepool add --name tiflash \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 3 \
    --labels dedicated=tiflash \
    --node-taints dedicated=tiflash:NoSchedule
Create a TiCDC node pool with nodeType being Standard_E16s_v4 or higher:
az aks nodepool add --name ticdc \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size ${nodeType} \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 3 \
    --labels dedicated=ticdc \
    --node-taints dedicated=ticdc:NoSchedule
Configure and deploy
To deploy TiFlash, configure spec.tiflash in tidb-cluster.yaml. The following is an example:
spec:
  ...
  tiflash:
    baseImage: pingcap/tiflash
    maxFailoverCount: 0
    replicas: 1
    storageClaims:
    - resources:
        requests:
          storage: 100Gi
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: tiflash
For other parameters, refer to Configure a TiDB Cluster.
To deploy TiCDC, configure spec.ticdc in tidb-cluster.yaml. The following is an example:
spec:
  ...
  ticdc:
    baseImage: pingcap/ticdc
    replicas: 1
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: ticdc
Modify replicas as required.
Finally, run the kubectl -n tidb-cluster apply -f tidb-cluster.yaml command to update the TiDB cluster configuration.
For detailed CR configuration, refer to API references and Configure a TiDB Cluster.
Use other Disk volume types
Azure disks support multiple volume types. Among them, UltraSSD delivers low latency and high throughput and can be enabled by performing the following steps:
Enable Ultra disks on an existing cluster and create a storage class for UltraSSD:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ultra
provisioner: disk.csi.azure.com
parameters:
  skuname: UltraSSD_LRS # alias: storageaccounttype, available values: Standard_LRS, Premium_LRS, StandardSSD_LRS, UltraSSD_LRS
  cachingMode: None
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
mountOptions:
- nodelalloc,noatime
You can add more Driver Parameters as required.
In tidb-cluster.yaml, specify the ultra storage class to apply for the UltraSSD volume type through the storageClassName field.
The following is a TiKV configuration example you can refer to:
spec:
  tikv:
    ...
    storageClassName: ultra
You can use any supported Azure disk type. It is recommended to use Premium_LRS or UltraSSD_LRS.
For more information about the storage class configuration and Azure disk types, refer to Storage Class documentation and Azure Disk Types.
Use local storage
Use Azure LRS disks for storage in production environments. To simulate bare-metal performance, use the additional NVMe SSD local store volumes provided by some Azure instances. You can choose such instances for the TiKV node pool to achieve higher IOPS and lower latency.
For instance types that provide local disks, refer to Lsv2-series. The following takes Standard_L8s_v2 as an example:
Create a node pool with local storage for TiKV.
Modify the instance type of the TiKV node pool in the az aks nodepool add command to Standard_L8s_v2:
az aks nodepool add --name tikv \
    --cluster-name ${clusterName} \
    --resource-group ${resourceGroup} \
    --node-vm-size Standard_L8s_v2 \
    --zones 1 2 3 \
    --aks-custom-headers EnableAzureDiskFileCSIDriver=true \
    --node-count 3 \
    --enable-ultra-ssd \
    --labels dedicated=tikv \
    --node-taints dedicated=tikv:NoSchedule
If the TiKV node pool already exists, you can either delete the old group and then create a new one, or change the group name to avoid conflict.
Deploy the local volume provisioner.
You need to use the local-volume-provisioner to discover and manage the local storage. Run the following command to deploy and create a local-storage storage class:
kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.0/manifests/eks/local-volume-provisioner.yaml
Use local storage.
After the steps above, the local volume provisioner can discover all the local NVMe SSD disks in the cluster.
Add the tikv.storageClassName field to the tidb-cluster.yaml file and set the value of the field to local-storage.
For more information, refer to Deploy TiDB cluster and its monitoring components.