Persistent Storage Class Configuration on Kubernetes

TiDB cluster components such as PD, TiKV, TiDB monitoring, TiDB Binlog, and tidb-backup require persistent storage for data. To achieve this on Kubernetes, you need to use PersistentVolume (PV). Kubernetes supports different types of storage classes, which fall into two main categories:

  • Network storage

    Network storage is not located on the current node but is mounted to the node through the network. It usually has redundant replicas to ensure high availability. In the event of a node failure, the corresponding network storage can be remounted to another node for continued use.

  • Local storage

    Local storage is located on the current node and typically provides lower latency than network storage. However, it does not have redundant replicas, so data might be lost if the node fails. If the node is a physical server in an IDC, the data can be partially recovered; if it is a virtual machine using a local disk on a public cloud, the data cannot be retrieved after a node failure.

PVs are created manually by the system administrator or automatically by a volume provisioner. PVs and Pods are bound by PersistentVolumeClaim (PVC). Instead of creating a PV directly, users request to use a PV through a PVC. The corresponding volume provisioner creates a PV that meets the requirements of the PVC and then binds the PV to the PVC.
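
For illustration only (the claim name and namespace below are hypothetical), a PVC simply declares the capacity and StorageClass it needs, and the provisioner registered for that StorageClass creates a matching PV and binds it to the claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc        # hypothetical claim name
  namespace: demo       # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage    # the StorageClass whose provisioner supplies the PV
  resources:
    requests:
      storage: 10Gi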

TiKV uses the Raft protocol to replicate data. When a node fails, PD automatically schedules data to fill the missing data replicas. TiKV requires low read and write latency, so it is strongly recommended to use local SSD storage in a production environment.

PD also uses Raft to replicate data. PD is not an I/O-intensive application, but rather a database for storing cluster meta information. Therefore, a local SAS disk or network SSD storage such as EBS General Purpose SSD (gp2) volumes on AWS or SSD persistent disks on Google Cloud can meet the requirements.

To ensure availability, it is recommended to use network storage for components such as TiDB monitoring, TiDB Binlog, and tidb-backup because they do not have redundant replicas. TiDB Binlog's Pump and Drainer components are I/O-intensive applications that require low read and write latency, so it is recommended to use high-performance network storage such as EBS Provisioned IOPS SSD (io1) volumes on AWS or SSD persistent disks on Google Cloud.

When deploying TiDB clusters or tidb-backup with TiDB Operator, you can configure the StorageClass for the components that require persistent storage via the corresponding storageClassName field in the values.yaml configuration file. The storageClassName is set to local-storage by default.
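
For example, a values.yaml fragment along the following lines assigns different storage classes to PD and TiKV. Treat this as an illustrative sketch: the exact field layout depends on the chart version you use.

# Illustrative values.yaml fragment; verify the field layout against your chart version.
pd:
  storageClassName: shared-ssd-storage
tikv:
  storageClassName: ssd-storage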

Network PV configuration

Starting from Kubernetes 1.11, volume expansion of network PV is supported. However, you need to run the following command to enable volume expansion for the corresponding StorageClass:

kubectl patch storageclass ${storage_class} -p '{"allowVolumeExpansion": true}'
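
To confirm that the change has taken effect, you can check the allowVolumeExpansion field of the StorageClass, for example:

kubectl get storageclass ${storage_class} -o jsonpath='{.allowVolumeExpansion}'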

After enabling volume expansion, you can expand the PV using the following method:

  1. Edit the PersistentVolumeClaim (PVC) object:

    Suppose the PVC is currently 10 Gi and you need to expand it to 100 Gi.

    kubectl patch pvc -n ${namespace} ${pvc_name} -p '{"spec": {"resources": {"requests": {"storage": "100Gi"}}}}'
  2. View the size of the PV:

    After the expansion, the size displayed by running kubectl get pvc -n ${namespace} ${pvc_name} still shows the original size. However, if you run the following command to view the size of the PV, it shows that the size has been expanded to the expected value (a more precise lookup is sketched after these steps).

    kubectl get pv | grep ${pvc_name}
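
If grep matches more than one PV, a more precise lookup (a sketch, assuming the PVC and its bound PV exist) is to query the PV by the volumeName recorded in the PVC:

kubectl get pv $(kubectl get pvc -n ${namespace} ${pvc_name} -o jsonpath='{.spec.volumeName}')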

Local PV configuration

Currently, Kubernetes supports statically allocated local storage. To create a local storage object, use local-volume-provisioner in the local-static-provisioner repository.

Step 1: Pre-allocate local storage

  • For a disk that stores TiKV data, you can mount the disk into the /mnt/ssd directory.

    To achieve high performance, it is recommended to allocate a dedicated disk for TiDB, with SSD being the recommended disk type.

  • For a disk that stores PD data, follow the steps to mount the disk. First, create multiple directories on the disk, and then bind mount them into the /mnt/sharedssd directory.

  • For a disk that stores monitoring data, follow the steps to mount the disk. First, create multiple directories on the disk, and then bind mount them into the /mnt/monitoring directory.

  • For a disk that stores TiDB Binlog and backup data, follow the steps to mount the disk. First, create multiple directories on the disk, and then bind mount them into the /mnt/backup directory.

The /mnt/ssd, /mnt/sharedssd, /mnt/monitoring, and /mnt/backup directories mentioned above are discovery directories used by local-volume-provisioner. For each subdirectory in the discovery directory, local-volume-provisioner creates a corresponding PV.
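
As a sketch of what the pre-allocation can look like (the device and directory names below are hypothetical), you can mount the disk that stores PD data, create several subdirectories on it, and bind mount each subdirectory into the /mnt/sharedssd discovery directory:

# Hypothetical device and mount point; adjust to your environment.
mkdir -p /mnt/disks/pd-disk
mount /dev/nvme1n1 /mnt/disks/pd-disk
for i in $(seq 1 3); do
  mkdir -p /mnt/disks/pd-disk/vol${i} /mnt/sharedssd/vol${i}
  mount --bind /mnt/disks/pd-disk/vol${i} /mnt/sharedssd/vol${i}
done
# To make the mounts survive a reboot, also add the corresponding entries to /etc/fstab.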

Step 2: Deploy local-volume-provisioner

Online deployment

  1. Download the deployment file for the local-volume-provisioner.

    wget https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.0/examples/local-pv/local-volume-provisioner.yaml
  2. If you are using the same discovery directory as described in Step 1: Pre-allocate local storage, you can skip this step. If you are using a different path for the discovery directory than in the previous step, you need to modify the ConfigMap and DaemonSet spec.

    • Modify the data.storageClassMap field in the ConfigMap spec:

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: local-provisioner-config
        namespace: kube-system
      data:
        # ...
        storageClassMap: |
          ssd-storage:
            hostDir: /mnt/ssd
            mountDir: /mnt/ssd
          shared-ssd-storage:
            hostDir: /mnt/sharedssd
            mountDir: /mnt/sharedssd
          monitoring-storage:
            hostDir: /mnt/monitoring
            mountDir: /mnt/monitoring
          backup-storage:
            hostDir: /mnt/backup
            mountDir: /mnt/backup

      For more configuration options for the local-volume-provisioner, refer to the configuration document.

    • Modify the volumes and volumeMounts fields in the DaemonSet spec to ensure that the discovery directory can be mounted to the corresponding directory in the Pod:

      ......
      volumeMounts:
        - mountPath: /mnt/ssd
          name: local-ssd
          mountPropagation: "HostToContainer"
        - mountPath: /mnt/sharedssd
          name: local-sharedssd
          mountPropagation: "HostToContainer"
        - mountPath: /mnt/backup
          name: local-backup
          mountPropagation: "HostToContainer"
        - mountPath: /mnt/monitoring
          name: local-monitoring
          mountPropagation: "HostToContainer"
      volumes:
        - name: local-ssd
          hostPath:
            path: /mnt/ssd
        - name: local-sharedssd
          hostPath:
            path: /mnt/sharedssd
        - name: local-backup
          hostPath:
            path: /mnt/backup
        - name: local-monitoring
          hostPath:
            path: /mnt/monitoring
      ......
  3. Deploy the local-volume-provisioner using the file you downloaded and modified.

    kubectl apply -f local-volume-provisioner.yaml
  4. Check the status of the Pod and PV.

    kubectl get po -n kube-system -l app=local-volume-provisioner && \
    kubectl get pv | grep -e ssd-storage -e shared-ssd-storage -e monitoring-storage -e backup-storage

    The local-volume-provisioner creates a PV for each mounting point under the discovery directory.
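
    If you want to see which StorageClass each created PV belongs to, you can also list the PVs with custom columns, for example:

    kubectl get pv -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,CAPACITY:.spec.capacity.storage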

For more information, refer to the Kubernetes local storage and local-static-provisioner documents.

Offline deployment

The steps for offline deployment are the same as for online deployment, except for the following:

  • Download the local-volume-provisioner.yaml file on a machine with Internet access, then upload it to the server and install it.

  • The local-volume-provisioner is a DaemonSet that starts a Pod on every Kubernetes worker node. The Pod uses the quay.io/external_storage/local-volume-provisioner:v2.3.4 image. If the server does not have access to the Internet, download this Docker image on a machine with Internet access:

    docker pull quay.io/external_storage/local-volume-provisioner:v2.3.4
    docker save -o local-volume-provisioner-v2.3.4.tar quay.io/external_storage/local-volume-provisioner:v2.3.4

    Copy the local-volume-provisioner-v2.3.4.tar file to the server, and execute the docker load command to load the file on the server:

    docker load -i local-volume-provisioner-v2.3.4.tar

Best practices

  • The unique identifier for a local PV is its path. To avoid conflicts, it is recommended to generate a unique path using the UUID of the device, as sketched after this list.
  • To ensure I/O isolation, it is recommended to use a dedicated physical disk per PV for hardware-based isolation.
  • For capacity isolation, it is recommended to use either a partition per PV or a physical disk per PV.
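
As a sketch of the UUID-based naming mentioned above (the device name is hypothetical), you can derive the mount path from the filesystem UUID of the device so that each PV path is unique:

# Hypothetical device name; adjust to your environment.
DISK_UUID=$(blkid -s UUID -o value /dev/nvme1n1)
mkdir -p /mnt/ssd/${DISK_UUID}
mount /dev/nvme1n1 /mnt/ssd/${DISK_UUID}
# Persist the mount in /etc/fstab, keyed by the same UUID, so it survives reboots.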

For more information on local PV on Kubernetes, refer to the Best Practices document.

Data safety

In general, when a PVC is deleted and no longer in use, the PV bound to it is reclaimed and placed in the resource pool for scheduling by the provisioner. To prevent accidental data loss, you can configure the reclaim policy of the StorageClass to Retain globally or change the reclaim policy of a single PV to Retain. With the Retain policy, a PV is not automatically reclaimed.

  • To configure globally:

    The reclaim policy of a StorageClass is set at creation time and cannot be updated once created. If it is not set during creation, you can create another StorageClass with the same provisioner. For example, the default reclaim policy of the StorageClass for persistent disks on Google Kubernetes Engine (GKE) is Delete. You can create another StorageClass named pd-standard with a reclaim policy of Retain, and change the storageClassName of the corresponding component to pd-standard when creating a TiDB cluster.

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: pd-standard
    parameters:
      type: pd-standard
    provisioner: kubernetes.io/gce-pd
    reclaimPolicy: Retain
    volumeBindingMode: Immediate
  • To configure a single PV:

    kubectl patch pv ${pv_name} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
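
With either method, you can confirm the policy that is in effect by inspecting the PV, for example:

kubectl get pv ${pv_name} -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'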

Delete PV and data

When the reclaim policy of PVs is set to Retain, if you have confirmed that the data of a PV can be deleted, you can delete the PV and its corresponding data by following these steps:

  1. Delete the PVC object corresponding to the PV:

    kubectl delete pvc ${pvc_name} --namespace=${namespace}
  2. Set the reclaim policy of the PV to Delete. This automatically deletes and reclaims the PV.

    kubectl patch pv ${pv_name} -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'

For more details, refer to the Change the Reclaim Policy of a PersistentVolume document.
