
PD Control User Guide

PD Control is a command-line tool for PD that obtains the state information of the cluster and tunes the cluster.

Install PD Control

Use TiUP command

To use PD Control, execute the tiup ctl:v<CLUSTER_VERSION> pd -u http://<pd_address>:<port> [-i] command.

Download the installation package

To obtain pd-ctl of the latest version, download the TiDB server installation package. pd-ctl is included in the ctl-{version}-linux-{arch}.tar.gz package.

Installation package OS Architecture SHA256 checksum
https://download.pingcap.org/tidb-community-server-{version}-linux-amd64.tar.gz (pd-ctl) Linux amd64 https://download.pingcap.org/tidb-community-server-{version}-linux-amd64.tar.gz.sha256
https://download.pingcap.org/tidb-community-server-{version}-linux-arm64.tar.gz (pd-ctl) Linux arm64 https://download.pingcap.org/tidb-community-server-{version}-linux-arm64.tar.gz.sha256
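
For example, a minimal download-and-extract sketch for Linux amd64, assuming {version} is replaced with the actual TiDB version (for example, v7.5.0) and that the ctl-{version}-linux-amd64.tar.gz package sits inside the extracted server directory:

wget https://download.pingcap.org/tidb-community-server-{version}-linux-amd64.tar.gz
wget https://download.pingcap.org/tidb-community-server-{version}-linux-amd64.tar.gz.sha256
sha256sum tidb-community-server-{version}-linux-amd64.tar.gz   # compare the result with the downloaded .sha256 file
tar -xzf tidb-community-server-{version}-linux-amd64.tar.gz
tar -xzf tidb-community-server-{version}-linux-amd64/ctl-{version}-linux-amd64.tar.gz
./pd-ctl --version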

Compile from source code

  1. Go 1.20 or later is required because Go modules are used.
  2. In the root directory of the PD project, use the make or make pd-ctl command to compile and generate bin/pd-ctl.
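
A minimal build sketch, assuming Git is available and the PD project lives at https://github.com/tikv/pd:

git clone https://github.com/tikv/pd.git
cd pd
make pd-ctl        # or simply `make`, which also builds pd-ctl
./bin/pd-ctl --version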

Usage

Single-command mode:


              
tiup ctl:v<CLUSTER_VERSION> pd store -u http://127.0.0.1:2379

Interactive mode:


              
tiup ctl:v<CLUSTER_VERSION> pd -i -u http://127.0.0.1:2379

Use environment variables:


              
export PD_ADDR=http://127.0.0.1:2379
tiup ctl:v<CLUSTER_VERSION> pd

Use TLS to encrypt:


              
tiup ctl:v<CLUSTER_VERSION> pd -u https://127.0.0.1:2379 --cacert="path/to/ca" --cert="path/to/cert" --key="path/to/key"

Command line flags

--cacert

  • Specifies the path to the certificate file of the trusted CA in PEM format
  • Default: ""

--cert

  • Specifies the path to the SSL certificate in PEM format
  • Default: ""

--detach/-d

  • Uses the single command line mode (not entering readline)
  • Default: true

--help/-h

  • Outputs the help information
  • Default: false

--interact/-i

  • Uses the interactive mode (entering readline)
  • Default: false

--key

  • Specifies the path to the certificate key file of SSL in PEM format, which is the private key of the certificate specified by --cert
  • Default: ""

--pd/-u

  • Specifies the PD address
  • Default address: http://127.0.0.1:2379
  • Environment variable: PD_ADDR

--version/-V

  • Prints the version information and exits
  • Default: false

Command

cluster

Use this command to view the basic information of the cluster.

Usage:


              
>> cluster                                     // To show the cluster information
{
  "id": 6493707687106161130,
  "max_peer_count": 3
}

config [show | set <option> <value> | placement-rules]

Use this command to view or modify the configuration information.

Usage:


              
>> config show                                // Display the config information of the scheduling
{
  "replication": {
    "enable-placement-rules": "true",
    "isolation-level": "",
    "location-labels": "",
    "max-replicas": 3,
    "strictly-match-label": "false"
  },
  "schedule": {
    "enable-cross-table-merge": "true",
    "high-space-ratio": 0.7,
    "hot-region-cache-hits-threshold": 3,
    "hot-region-schedule-limit": 4,
    "leader-schedule-limit": 4,
    "leader-schedule-policy": "count",
    "low-space-ratio": 0.8,
    "max-merge-region-keys": 200000,
    "max-merge-region-size": 20,
    "max-pending-peer-count": 64,
    "max-snapshot-count": 64,
    "max-store-down-time": "30m0s",
    "merge-schedule-limit": 8,
    "patrol-region-interval": "10ms",
    "region-schedule-limit": 2048,
    "region-score-formula-version": "v2",
    "replica-schedule-limit": 64,
    "scheduler-max-waiting-operator": 5,
    "split-merge-interval": "1h0m0s",
    "tolerant-size-ratio": 0
  }
}
>> config show all                            // Display all config information
>> config show replication                    // Display the config information of replication
{
  "max-replicas": 3,
  "location-labels": "",
  "isolation-level": "",
  "strictly-match-label": "false",
  "enable-placement-rules": "true"
}
>> config show cluster-version                // Display the current version of the cluster, which is the current minimum version of TiKV nodes in the cluster and does not correspond to the binary version.
"5.2.2"
  • max-snapshot-count controls the maximum number of snapshots that a single store receives or sends out at the same time. The scheduler is restricted by this configuration to avoid taking up normal application resources. When you need to improve the speed of adding replicas or balancing, increase this value.

    config set max-snapshot-count 64 // Set the maximum number of snapshots to 64

  • max-pending-peer-count controls the maximum number of pending peers in a single store. The scheduler is restricted by this configuration to avoid producing a large number of Regions without the latest log in some nodes. When you need to improve the speed of adding replicas or balancing, increase this value. Setting it to 0 indicates no limit.

    config set max-pending-peer-count 64 // Set the maximum number of pending peers to 64

  • max-merge-region-size controls the upper limit on the size of Region Merge (the unit is MiB). When regionSize exceeds the specified value, PD does not merge it with the adjacent Region. Setting it to 0 indicates disabling Region Merge.

    config set max-merge-region-size 16 // Set the upper limit on the size of Region Merge to 16 MiB

  • max-merge-region-keys controls the upper limit on the key count of Region Merge. When regionKeyCount exceeds the specified value, PD does not merge it with the adjacent Region.

    config set max-merge-region-keys 50000 // Set the upper limit on keyCount to 50000
  • split-merge-interval controls the interval between the split and merge operations on a same Region. This means the newly split Region won't be merged within a period of time.

    config set split-merge-interval 24h // Set the interval between `split` and `merge` to one day

  • enable-one-way-merge controls whether PD only allows a Region to merge with the next Region. When you set it to false, PD allows a Region to merge with the adjacent two Regions.

    config set enable-one-way-merge true // Enables one-way merging.

  • enable-cross-table-merge is used to enable the merging of cross-table Regions. When you set it to false, PD does not merge the Regions from different tables. This option only works when key-type is "table".

    config set enable-cross-table-merge true // Enable cross table merge.

  • key-type specifies the key encoding type used for the cluster. The supported options are ["table", "raw", "txn"], and the default value is "table".

    • If no TiDB instance exists in the cluster, key-type will be "raw" or "txn", and PD is allowed to merge Regions across tables regardless of the enable-cross-table-merge setting.

    • If any TiDB instance exists in the cluster, key-type should be "table". Whether PD can merge Regions across tables is determined by enable-cross-table-merge. If key-type is "raw", placement rules do not work.

      config set key-type raw // Enable cross table merge.
  • region-score-formula-version controls the version of the Region score formula. The value options are v1 and v2. The version 2 of the formula helps to reduce redundant balance Region scheduling in some scenarios, such as taking TiKV nodes online or offline.

    config set region-score-formula-version v2

  • patrol-region-interval controls the execution frequency that replicaChecker checks the health status of Regions. A shorter interval indicates a higher execution frequency. Generally, you do not need to adjust it.

    config set patrol-region-interval 10ms // Set the execution frequency of replicaChecker to 10ms

  • max-store-down-time controls the time that PD decides the disconnected store cannot be restored if exceeded. If PD does not receive heartbeats from a store within the specified period of time, PD adds replicas in other nodes.

    config set max-store-down-time 30m // Set the time within which PD receives no heartbeats and after which PD starts to add replicas to 30 minutes

  • max-store-preparing-time controls the maximum waiting time for the store to go online. During the online stage of a store, PD can query the online progress of the store. When the specified time is exceeded, PD assumes that the store has been online and cannot query the online progress of the store again. But this does not prevent Regions from transferring to the new online store. In most scenarios, you do not need to adjust this parameter.

    The following command specifies that the maximum waiting time for the store to go online is 4 hours.

    config set max-store-preparing-time 4h
  • leader-schedule-limit controls the number of tasks scheduling the leader at the same time. This value affects the speed of leader balance. A larger value means a higher speed and setting the value to 0 closes the scheduling. Usually the leader scheduling has a small load, and you can increase the value in need.

    config set leader-schedule-limit 4 // 4 tasks of leader scheduling at the same time at most

  • region-schedule-limit controls the number of tasks of scheduling Regions at the same time. This value avoids too many Region balance operators being created. The default value is 2048 which is enough for all sizes of clusters, and setting the value to 0 closes the scheduling. Usually, the Region scheduling speed is limited by store-limit, but it is recommended that you do not customize this value unless you know exactly what you are doing.

    config set region-schedule-limit 2 // 2 tasks of Region scheduling at the same time at most

  • replica-schedule-limit controls the number of tasks scheduling the replica at the same time. This value affects the scheduling speed when the node is down or removed. A larger value means a higher speed and setting the value to 0 closes the scheduling. Usually the replica scheduling has a large load, so do not set a too large value. Note that this configuration item is usually kept at the default value. If you want to change the value, you need to try a few values to see which one works best according to the real situation.

    config set replica-schedule-limit 4 // 4 tasks of replica scheduling at the same time at most

  • merge-schedule-limit controls the number of Region Merge scheduling tasks. Setting the value to 0 closes Region Merge. Usually the Merge scheduling has a large load, so do not set a too large value. Note that this configuration item is usually kept at the default value. If you want to change the value, you need to try a few values to see which one works best according to the real situation.

    config set merge-schedule-limit 16 // 16 tasks of Merge scheduling at the same time at most

  • hot-region-schedule-limit controls the hot Region scheduling tasks that are running at the same time. Setting its value to 0 means disabling the scheduling. It is not recommended to set a too large value. Otherwise, it might affect the system performance. Note that this configuration item is usually kept at the default value. If you want to change the value, you need to try a few values to see which one works best according to the real situation.

    config set hot-region-schedule-limit 4 // 4 tasks of hot Region scheduling at the same time at most
  • hot-region-cache-hits-threshold is used to set the number of minutes required to identify a hot Region. PD can participate in the hotspot scheduling only after the Region is in the hotspot state for more than this number of minutes.
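
    For example, a sketch using the same config set pattern as the other items (3 is the value shown in the config show output above):

    config set hot-region-cache-hits-threshold 3 // Require a Region to stay hot for 3 minutes before it joins hotspot scheduling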

  • tolerant-size-ratio controls the size of the balance buffer area. When the score difference between the leader or Region of the two stores is less than specified multiple times of the Region size, it is considered in balance by PD.

    config set tolerant-size-ratio 20 // Set the size of the buffer area to about 20 times of the average Region Size

  • low-space-ratio controls the threshold value that is considered as insufficient store space. When the ratio of the space occupied by the node exceeds the specified value, PD tries to avoid migrating data to the corresponding node as much as possible. At the same time, PD mainly schedules the remaining space to avoid using up the disk space of the corresponding node.

    config set low-space-ratio 0.9 // Set the threshold value of insufficient space to 0.9

  • high-space-ratio controls the threshold value that is considered as sufficient store space. This configuration takes effect only when region-score-formula-version is set to v1. When the ratio of the space occupied by the node is less than the specified value, PD ignores the remaining space and mainly schedules the actual data volume.

    config set high-space-ratio 0.5 // Set the threshold value of sufficient space to 0.5

  • cluster-version is the version of the cluster, which is used to enable or disable some features and to deal with the compatibility issues. By default, it is the minimum version of all normally running TiKV nodes in the cluster. You can set it manually only when you need to roll it back to an earlier version.

    config set cluster-version 1.0.8 // Set the version of the cluster to 1.0.8
  • replication-mode controls the replication mode of Regions in the dual data center scenario. See Enable the DR Auto-Sync mode for details.
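
    As a hedged sketch, the mode itself is switched with the usual config set syntax (dr-auto-sync is the mode described in the DR Auto-Sync document):

    config set replication-mode dr-auto-sync // Switch from the default majority mode to DR Auto-Sync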

  • leader-schedule-policy is used to select the scheduling strategy for the leader. You can schedule the leader according to size or count.
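
    For example, a sketch following the same config set pattern (count is one of the two supported values):

    config set leader-schedule-policy count // Balance leaders by Region count; use size to balance by Region size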

  • scheduler-max-waiting-operator is used to control the number of waiting operators in each scheduler.
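
    For example, a sketch following the same config set pattern (5 is the default value mentioned later in this document):

    config set scheduler-max-waiting-operator 5 // Allow each scheduler to keep at most 5 waiting operators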

  • enable-remove-down-replica is used to enable the feature of automatically deleting DownReplica. When you set it to false, PD does not automatically clean up the downtime replicas.

  • enable-replace-offline-replica is used to enable the feature of migrating OfflineReplica. When you set it to false, PD does not migrate the offline replicas.

  • enable-make-up-replica is used to enable the feature of making up replicas. When you set it to false, PD does not add replicas for Regions without sufficient replicas.

  • enable-remove-extra-replica is used to enable the feature of removing extra replicas. When you set it to false, PD does not remove extra replicas for Regions with redundant replicas.

  • enable-location-replacement is used to enable the isolation level checking. When you set it to false, PD does not increase the isolation level of a Region replica through scheduling.

  • enable-debug-metrics is used to enable the metrics for debugging. When you set it to true, PD enables some metrics such as balance-tolerant-size.

  • enable-placement-rules is used to enable placement rules, which is enabled by default in v5.0 and later versions.
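
    All of these switches take boolean values and follow the same config set pattern as the items above, for example (a hedged sketch):

    config set enable-remove-down-replica false // Stop PD from automatically cleaning up Down replicas
    config set enable-placement-rules true // Enable placement rules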

  • store-limit-mode is used to control the mode of limiting the store speed. The optional modes are auto and manual. In auto mode, the stores are automatically balanced according to the load (deprecated).

  • store-limit-version controls the version of the store limit formula. In v1 mode, you can manually modify the store limit to limit the scheduling speed of a single TiKV. The v2 mode is an experimental feature. In v2 mode, you do not need to manually set the store limit value, as PD dynamically adjusts it based on the capability of TiKV snapshots. For more details, refer to Principles of store limit v2.

    config set store-limit-version v2 // Use store limit v2

  • flow-round-by-digit: PD rounds the lowest digits of the flow number, which reduces the update of statistics caused by the changes of the Region flow information. This configuration item is used to specify the number of lowest digits to round for the Region flow information. For example, the flow 100512 will be rounded to 101000 because the default value is 3. This configuration replaces trace-region-flow.

  • For example, set the value of flow-round-by-digit to 4:

    config set flow-round-by-digit 4

config placement-rules [disable | enable | load | save | show | rule-group]

For the usage of config placement-rules [disable | enable | load | save | show | rule-group], see Configure placement rules.

health

Use this command to view the health information of the cluster.

Usage:


              
>> health                                // Display the health information
[
  {
    "name": "pd",
    "member_id": 13195394291058371180,
    "client_urls": [
      "http://127.0.0.1:2379"
      ......
    ],
    "health": true
  }
  ......
]

hot [read | write | store <store_id> | history [<start_time> <end_time>]]

Use this command to view the hot spot information of the cluster.

Usage:


              
>> hot read                                // Display hot spot for the read operation
>> hot write                               // Display hot spot for the write operation
>> hot store                               // Display hot spot for all the read and write operations
>> hot history 1629294000000 1631980800000 // Display history hot spot for the specified period (milliseconds). 1629294000000 is the start time and 1631980800000 is the end time.
{
  "history_hot_region": [
    {
      "update_time": 1630864801948,
      "region_id": 103,
      "peer_id": 1369002,
      "store_id": 3,
      "is_leader": true,
      "is_learner": false,
      "hot_region_type": "read",
      "hot_degree": 152,
      "flow_bytes": 0,
      "key_rate": 0,
      "query_rate": 305,
      "start_key": "7480000000000000FF5300000000000000F8",
      "end_key": "7480000000000000FF5600000000000000F8"
    },
    ...
  ]
}
>> hot history 1629294000000 1631980800000 hot_region_type read region_id 1,2,3 store_id 1,2,3 peer_id 1,2,3 is_leader true is_learner true // Display history hotspot for the specified period with more conditions
{
  "history_hot_region": [
    {
      "update_time": 1630864801948,
      "region_id": 103,
      "peer_id": 1369002,
      "store_id": 3,
      "is_leader": true,
      "is_learner": false,
      "hot_region_type": "read",
      "hot_degree": 152,
      "flow_bytes": 0,
      "key_rate": 0,
      "query_rate": 305,
      "start_key": "7480000000000000FF5300000000000000F8",
      "end_key": "7480000000000000FF5600000000000000F8"
    },
    ...
  ]
}

label [store <name> <value>]

Use this command to view the label information of the cluster.

Usage:


              
>> label                                 // Display all labels
>> label store zone cn                   // Display all stores including the "zone":"cn" label

member [delete <member_name> | leader_priority <leader_priority> | leader [show | resign | transfer <member_name>]]

Use this command to view the PD members, remove a specified member, or configure the leader priority.

Usage:


              
>> member                                // Display the information of all members
{
  "header": {......},
  "members": [......],
  "leader": {......},
  "etcd_leader": {......},
}
>> member delete name pd2                // Delete "pd2"
Success!
>> member delete id 1319539429105371180  // Delete a node using id
Success!
>> member leader show                    // Display the leader information
{
  "name": "pd",
  "member_id": 13155432540099656863,
  "peer_urls": [......],
  "client_urls": [......]
}
>> member leader resign                  // Move leader away from the current member
......
>> member leader transfer pd3            // Migrate leader to a specified member
......

operator [check | show | add | remove]

Use this command to view and control the scheduling operation.

Usage:


              
>> operator show                                      // Display all operators
>> operator show admin                                // Display all admin operators
>> operator show leader                               // Display all leader operators
>> operator show region                               // Display all Region operators
>> operator add add-peer 1 2                          // Add a replica of Region 1 on store 2
>> operator add add-learner 1 2                       // Add a learner replica of Region 1 on store 2
>> operator add remove-peer 1 2                       // Remove a replica of Region 1 on store 2
>> operator add transfer-leader 1 2                   // Schedule the leader of Region 1 to store 2
>> operator add transfer-region 1 2 3 4               // Schedule Region 1 to stores 2,3,4
>> operator add transfer-peer 1 2 3                   // Schedule the replica of Region 1 on store 2 to store 3
>> operator add merge-region 1 2                      // Merge Region 1 with Region 2
>> operator add split-region 1 --policy=approximate   // Split Region 1 into two Regions in halves, based on approximately estimated value
>> operator add split-region 1 --policy=scan          // Split Region 1 into two Regions in halves, based on accurate scan value
>> operator remove 1                                  // Remove the scheduling operation of Region 1
>> operator check 1                                   // Check the status of the operators related to Region 1

The splitting of Regions starts from the position as close as possible to the middle. You can locate this position using two strategies, namely "scan" and "approximate". The difference between them is that the former determines the middle key by scanning the Region, and the latter obtains the approximate position by checking the statistics recorded in the SST file. Generally, the former is more accurate, while the latter consumes less I/O and can be completed faster.

ping

Use this command to view the time that it takes to ping PD.

Usage:


              
>> ping
time: 43.12698ms

region [--jq=""]

Use this command to view the Region information. For a jq formatted output, see jq formatted JSON output usage.

Usage:


              
>> region                                // Display the information of all Regions
{
  "count": 1,
  "regions": [......]
}
>> region 2                              // Display the information of the Region with the ID of 2
{
  "id": 2,
  "start_key": "7480000000000000FF1D00000000000000F8",
  "end_key": "7480000000000000FF1F00000000000000F8",
  "epoch": {
    "conf_ver": 1,
    "version": 15
  },
  "peers": [
    {
      "id": 40,
      "store_id": 3
    }
  ],
  "leader": {
    "id": 40,
    "store_id": 3
  },
  "written_bytes": 0,
  "read_bytes": 0,
  "written_keys": 0,
  "read_keys": 0,
  "approximate_size": 1,
  "approximate_keys": 0
}

region key [--format=raw|encode|hex] <key>

Use this command to query the Region that a specific key resides in. It supports the raw, encoding, and hex formats. You need to use single quotes around the key when it is in the encoding format.

Hex format usage (default):


              
>> region key 7480000000000000FF1300000000000000F8
{"region": {"id": 2, ......}}

Raw format usage:


              
>> region key --format=raw abc
{"region": {"id": 2, ......}}

Encoding format usage:


              
>> region key --format=encode 't\200\000\000\000\000\000\000\377\035_r\200\000\000\000\000\377\017U\320\000\000\000\000\000\372'
{"region": {"id": 2, ......}}

region scan

Use this command to get all Regions.

Usage:


              
>> region scan
{"count": 20, "regions": [......]}

region sibling <region_id>

Use this command to check the adjacent Regions of a specific Region.

Usage:


              
>> region sibling 2
{"count": 2, "regions": [......]}

region keys [--format=raw|encode|hex] <start_key> <end_key> <limit>

Use this command to query all Regions in a given range [startkey, endkey). Ranges without endKeys are supported.

The limit parameter limits the number of keys. The default value of limit is 16, and the value of -1 means unlimited keys.

Usage:


              
>> region keys --format=raw a        // Display all Regions that start from the key a with a default limit count of 16
{"count": 16, "regions": [......]}
>> region keys --format=raw a z      // Display all Regions in the range [a, z) with a default limit count of 16
{"count": 16, "regions": [......]}
>> region keys --format=raw a z -1   // Display all Regions in the range [a, z) without a limit count
{"count": ..., "regions": [......]}
>> region keys --format=raw a "" 20  // Display all Regions that start from the key a with a limit count of 20
{"count": 20, "regions": [......]}

region store <store_id>

Use this command to list all Regions of a specific store.

Usage:


              
>> region store 2
{"count": 10, "regions": [......]}

region topread [limit]

Use this command to list Regions with top read flow. The default value of the limit is 16.

Usage:


              
>> region topread
{"count": 16, "regions": [......]}

region topwrite [limit]

Use this command to list Regions with top write flow. The default value of the limit is 16.

Usage:


              
>> region topwrite
{"count": 16, "regions": [......]}

region topconfver [limit]

Use this command to list Regions with top conf version. The default value of the limit is 16.

Usage:


              
>> region topconfver
{"count": 16, "regions": [......]}

region topversion [limit]

Use this command to list Regions with top version. The default value of the limit is 16.

Usage:


              
>> region topversion
{"count": 16, "regions": [......]}

region topsize [limit]

Use this command to list Regions with top approximate size. The default value of the limit is 16.

Usage:


              
>> region topsize
{"count": 16, "regions": [......]}

region check [miss-peer | extra-peer | down-peer | pending-peer | offline-peer | empty-region | hist-size | hist-keys] [--jq=""]

Use this command to check the Regions in abnormal conditions. For a jq formatted output, see jq formatted JSON output usage.

Description of various types:

  • miss-peer: the Region without enough replicas
  • extra-peer: the Region with extra replicas
  • down-peer: the Region in which some replicas are Down
  • pending-peer: the Region in which some replicas are Pending

Usage:


              
>> region check miss-peer
{"count": 2, "regions": [......]}

scheduler [show | add | remove | pause | resume | config | describe]

Use this command to view and control the scheduling policy.

Usage:


              
>> scheduler show                                 // Display all created schedulers
>> scheduler add grant-leader-scheduler 1         // Schedule all the leaders of the Regions on store 1 to store 1
>> scheduler add evict-leader-scheduler 1         // Move all the Region leaders on store 1 out
>> scheduler config evict-leader-scheduler        // Display the stores in which the scheduler is located since v4.0.0
>> scheduler add shuffle-leader-scheduler         // Randomly exchange the leader on different stores
>> scheduler add shuffle-region-scheduler         // Randomly schedule the Regions on different stores
>> scheduler add evict-slow-store-scheduler       // When there is one and only one slow store, evict all Region leaders of that store
>> scheduler remove grant-leader-scheduler-1      // Remove the corresponding scheduler, and `-1` corresponds to the store ID
>> scheduler pause balance-region-scheduler 10    // Pause the balance-region scheduler for 10 seconds
>> scheduler pause all 10                         // Pause all schedulers for 10 seconds
>> scheduler resume balance-region-scheduler      // Continue to run the balance-region scheduler
>> scheduler resume all                           // Continue to run all schedulers
>> scheduler config balance-hot-region-scheduler  // Display the configuration of the balance-hot-region scheduler
>> scheduler describe balance-region-scheduler    // Display the running state and related diagnostic information of the balance-region scheduler

scheduler describe balance-region-scheduler

Use this command to view the running state and related diagnostic information of the balance-region-scheduler.

Since TiDB v6.3.0, PD provides the running state and brief diagnostic information for balance-region-scheduler and balance-leader-scheduler. Other schedulers and checkers are not supported yet. To enable this feature, you can modify the enable-diagnostic configuration item using pd-ctl.
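
For example, a hedged sketch of enabling the feature with the usual config set syntax:

config set enable-diagnostic true // Enable diagnostics for balance-region-scheduler and balance-leader-scheduler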

The state of the scheduler can be one of the following:

  • disabled: the scheduler is unavailable or removed.
  • paused: the scheduler is paused.
  • scheduling: the scheduler is generating scheduling operators.
  • pending: the scheduler cannot generate scheduling operators. For a scheduler in the pending state, brief diagnostic information is returned. The brief information describes the state of stores and explains why these stores cannot be selected for scheduling.
  • normal: there is no need to generate scheduling operators.

scheduler config balance-leader-scheduler

Use this command to view and control the balance-leader-scheduler policy.

Since TiDB v6.0.0, PD introduces the Batch parameter for balance-leader-scheduler to control the speed at which the balance-leader processes tasks. To use this parameter, you can modify the balance-leader batch configuration item using pd-ctl.

Before v6.0.0, PD does not have this configuration item, which means balance-leader batch=1. In v6.0.0 or later versions, the default value of balance-leader batch is 4. To set this configuration item to a value greater than 4, you need to set a greater value for scheduler-max-waiting-operator (whose default value is 5) at the same time. You can get the expected acceleration effect only after modifying both configuration items.


              
scheduler config balance-leader-scheduler set batch 3 // Set the size of the operator that the balance-leader scheduler can execute in a batch to 3
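
To use a batch larger than 4, a hedged sketch that also raises scheduler-max-waiting-operator as described above (the values 8 and 10 are only illustrative):

scheduler config balance-leader-scheduler set batch 8 // Let balance-leader process up to 8 operators in a batch
config set scheduler-max-waiting-operator 10 // Raise the waiting-operator quota so the larger batch takes effect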

scheduler config balance-hot-region-scheduler

Use this command to view and control the balance-hot-region-scheduler policy.

Usage:


              
>> scheduler config balance-hot-region-scheduler  // Display all configuration of the balance-hot-region scheduler
{
  "min-hot-byte-rate": 100,
  "min-hot-key-rate": 10,
  "min-hot-query-rate": 10,
  "max-zombie-rounds": 3,
  "max-peer-number": 1000,
  "byte-rate-rank-step-ratio": 0.05,
  "key-rate-rank-step-ratio": 0.05,
  "query-rate-rank-step-ratio": 0.05,
  "count-rank-step-ratio": 0.01,
  "great-dec-ratio": 0.95,
  "minor-dec-ratio": 0.99,
  "src-tolerance-ratio": 1.05,
  "dst-tolerance-ratio": 1.05,
  "read-priorities": ["query", "byte"],
  "write-leader-priorities": ["key", "byte"],
  "write-peer-priorities": ["byte", "key"],
  "strict-picking-store": "true",
  "enable-for-tiflash": "true",
  "rank-formula-version": "v2"
}
  • min-hot-byte-rate means the smallest number of bytes to be counted, which is usually 100.

    scheduler config balance-hot-region-scheduler set min-hot-byte-rate 100

  • min-hot-key-rate means the smallest number of keys to be counted, which is usually 10.

    scheduler config balance-hot-region-scheduler set min-hot-key-rate 10

  • min-hot-query-rate means the smallest number of queries to be counted, which is usually 10.

    scheduler config balance-hot-region-scheduler set min-hot-query-rate 10

  • max-zombie-rounds means the maximum number of heartbeats with which an operator can be considered as the pending influence. If you set it to a larger value, more operators might be included in the pending influence. Usually, you do not need to adjust its value. Pending influence refers to the operator influence that is generated during scheduling but still has an effect.

    scheduler config balance-hot-region-scheduler set max-zombie-rounds 3

  • max-peer-number means the maximum number of peers to be solved, which prevents the scheduler from being too slow.

    scheduler config balance-hot-region-scheduler set max-peer-number 1000

  • byte-rate-rank-step-ratio, key-rate-rank-step-ratio, query-rate-rank-step-ratio, and count-rank-step-ratio respectively mean the step ranks of byte, key, query, and count. The rank-step-ratio decides the step when the rank is calculated. great-dec-ratio and minor-dec-ratio are used to determine the dec rank. Usually, you do not need to modify these items.

    scheduler config balance-hot-region-scheduler set byte-rate-rank-step-ratio 0.05
  • src-tolerance-ratio and dst-tolerance-ratio are configuration items for the expectation scheduler. The smaller the tolerance-ratio, the easier it is for scheduling. When redundant scheduling occurs, you can appropriately increase this value.

    scheduler config balance-hot-region-scheduler set src-tolerance-ratio 1.1

  • read-priorities, write-leader-priorities, and write-peer-priorities control which dimension the scheduler prioritizes for hot Region scheduling. Two dimensions are supported for configuration.

    • read-priorities and write-leader-priorities control which dimensions the scheduler prioritizes for scheduling hot Regions of the read and write-leader types. The dimension options are query, byte, and key.

    • write-peer-priorities controls which dimensions the scheduler prioritizes for scheduling hot Regions of the write-peer type. The dimension options are byte and key.

      scheduler config balance-hot-region-scheduler set read-priorities query,byte

  • strict-picking-store controls the search space of hot Region scheduling. Usually, it is enabled. This configuration item only affects the behavior when rank-formula-version is v1. When it is enabled, hot Region scheduling ensures hot Region balance on the two configured dimensions. When it is disabled, hot Region scheduling only ensures the balance on the dimension with the first priority, which might reduce balance on other dimensions. Usually, you do not need to modify this configuration.

    scheduler config balance-hot-region-scheduler set strict-picking-store true
  • rank-formula-version controls which scheduler algorithm version is used in hot Region scheduling. The value options are v1 and v2. The default value is v2.

    • The v1 algorithm is the scheduler strategy used in TiDB v6.3.0 and earlier versions. This algorithm mainly focuses on reducing load difference between stores and avoids introducing side effects in the other dimension.
    • The v2 algorithm is an experimental scheduler strategy introduced in TiDB v6.3.0 and is in General Availability (GA) in TiDB v6.4.0. This algorithm mainly focuses on improving the rate of the equitability between stores and factors in few side effects. Compared with the v1 algorithm with strict-picking-store being true, the v2 algorithm pays more attention to the priority equalization of the first dimension. Compared with the v1 algorithm with strict-picking-store being false, the v2 algorithm considers the balance of the second dimension.
    • The v1 algorithm with strict-picking-store being true is conservative and scheduling can only be generated when there is a store with a high load in both dimensions. In certain scenarios, it might be impossible to continue balancing due to dimensional conflicts. To achieve better balancing in the first dimension, it is necessary to set strict-picking-store to false. The v2 algorithm can achieve better balancing in both dimensions and reduce invalid scheduling.

    scheduler config balance-hot-region-scheduler set rank-formula-version v2
  • enable-for-tiflash controls whether hot Region scheduling takes effect for TiFlash instances. Usually, it is enabled. When it is disabled, the hot Region scheduling between TiFlash instances is not performed.

    scheduler config balance-hot-region-scheduler set enable-for-tiflash true

service-gc-safepoint

Use this command to query the current GC safepoint and service GC safepoint. The output is as follows:


              
{"service_gc_safe_points": [ {"service_id":"gc_worker","expired_at": 9223372036854775807,"safe_point": 439923410637160448 } ],"gc_safe_point": 0 }

store [delete | cancel-delete | label | weight | remove-tombstone | limit] [--jq=""]

For a jq formatted output, see jq formatted JSON output usage.

Get a store

To display the information of all stores, run the following command:


              
store

              
{ "count": 3, "stores": [...] }

To get the store with id of 1, run the following command:


              
store 1

              
......

Delete a store

To delete the store with id of 1, run the following command:


              
store delete 1

To cancel deleting Offline state stores which are deleted using store delete, run the store cancel-delete command. After canceling, the store changes from Offline to Up. Note that the store cancel-delete command cannot change a Tombstone state store to the Up state.

To cancel deleting the store with id of 1, run the following command:


              
store cancel-delete 1

To delete all stores in Tombstone state, run the following command:


              
store remove-tombstone

Manage store labels

To manage the labels of a store, run the store label command.

  • To set a label with the key being "zone" and value being "cn" to the store with id of 1, run the following command:

    
                    
    store label 1 zone=cn
  • To update the label of a store, for example, changing the value of the key "zone" from "cn" to "us" for the store with id of 1, run the following command:

    
                    
    store label 1 zone=us
  • To rewrite all labels of a store with id of 1, use the --rewrite option. Note that this option overwrites all existing labels:

    
                    
    store label 1 region=us-est-1 disk=ssd --rewrite
  • To delete the"disk"label for the store with id of 1, use the--deleteoption:

    
                    
    store label 1 disk --delete

Configure store weight

To set the leader weight to 5 and Region weight to 10 for the store with id of 1, run the following command:


              
store weight 1 5 10

Configure store scheduling speed

You can set the scheduling speed of stores by using store limit. For more details about the principles and usage of store limit, see store limit.


              
>> store limit                       // Show the speed limit of adding-peer operations and the limit of removing-peer operations per minute in all stores
>> store limit add-peer              // Show the speed limit of adding-peer operations per minute in all stores
>> store limit remove-peer           // Show the limit of removing-peer operations per minute in all stores
>> store limit all 5                 // Set the limit of adding-peer operations to 5 and the limit of removing-peer operations to 5 per minute for all stores
>> store limit 1 5                   // Set the limit of adding-peer operations to 5 and the limit of removing-peer operations to 5 per minute for store 1
>> store limit all 5 add-peer        // Set the limit of adding-peer operations to 5 per minute for all stores
>> store limit 1 5 add-peer          // Set the limit of adding-peer operations to 5 per minute for store 1
>> store limit 1 5 remove-peer       // Set the limit of removing-peer operations to 5 per minute for store 1
>> store limit all 5 remove-peer     // Set the limit of removing-peer operations to 5 per minute for all stores

log [fatal | error | warn | info | debug]

Use this command to set the log level of the PD leader.

Usage:


              
log warn

tso

Use this command to parse the physical and logical time of TSO.

Usage:


              
>> tso 395181938313123110           // Parse TSO
system:  2017-10-09 05:50:59 +0800 CST
logic:   120102

unsafe remove-failed-stores [store-ids | show]

Use this command to perform lossy recovery operations when permanently damaged replicas cause data to be unavailable. See the following example. The details are described in Online Unsafe Recovery.

Execute Online Unsafe Recovery to remove permanently damaged stores:


              
unsafe remove-failed-stores 101,102,103

              
Success!

Show the current or historical state of Online Unsafe Recovery:


              
unsafe remove-failed-stores show

              
["Collecting cluster info from all alive stores, 10/12.","Stores that have reports to PD: 1, 2, 3, ...","Stores that have not reported to PD: 11, 12", ]

Jq formatted JSON output usage

Simplify the output of store


              
>> store --jq=".stores[].store | { id, address, state_name}"{"id":1,"address":"127.0.0.1:20161","state_name":"Up"} {"id":30,"address":"127.0.0.1:20162","state_name":"Up"} ...

Query the remaining space of the node


              
>> store --jq=".stores[] | {id: .store.id, available: .status.available}"{"id":1,"available":"10 GiB"} {"id":30,"available":"10 GiB"} ...

Query all nodes whose status is not Up


              
store --jq='.stores[].store | select(.state_name!="Up") | { id, address, state_name}'

              
{"id":1,"address":"127.0.0.1:20161""state_name":"Offline"} {"id":5,"address":"127.0.0.1:20162""state_name":"Offline"} ...

Query all TiFlash nodes


              
store --jq='.stores[].store | select(.labels | length>0 and contains([{"key":"engine","value":"tiflash"}])) | { id, address, state_name}'

              
{"id":1,"address":"127.0.0.1:20161""state_name":"Up"} {"id":5,"address":"127.0.0.1:20162""state_name":"Up"} ...

Query the distribution status of the Region replicas


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id]}"{"id":2,“peer_stores":[1,30,31]} {"id":4,“peer_stores":[1,31,34]} ...

Filter Regions according to the number of replicas

For example, to filter out all Regions whose number of replicas is not 3:


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length != 3)}"{"id":12,“peer_stores":[30,32]} {"id":2,“peer_stores":[1,30,31,32]}

Filter Regions according to the store ID of replicas

For example, to filter out all Regions that have a replica on store30:


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(any(.==30))}"{"id":6,“peer_stores":[1,30,31]} {"id":22,“peer_stores":[1,30,32]} ...

You can also find out all Regions that have a replica on store30 or store31 in the same way:


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(any(.==(30,31)))}"{"id":16,“peer_stores":[1,30,34]} {"id":28,“peer_stores":[1,30,32]} {"id":12,“peer_stores":[30,32]} ...

Look for relevant Regions when restoring data

For example, when [store1, store30, store31] is unavailable at its downtime, you can find all Regions whose Down replicas are more than normal replicas:


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as$total| map(if .==(1,30,31) then . else empty end) | length>=$total-length) }"{"id":2,“peer_stores":[1,30,31,32]} {"id":12,“peer_stores":[30,32]} {"id":14,“peer_stores":[1,30,32]} ...

Or when [store1, store30, store31] fails to start, you can find Regions where the data can be manually removed safely on store1. In this way, you can filter out all Regions that have a replica on store1 but don't have other DownPeers:


              
>> region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length>1 and any(.==1) and all(.!=(30,31)))}"{"id":24,“peer_stores":[1,32,33]}

When [store30, store31] is down, find out all Regions that can be safely processed by creating the remove-peer Operator, that is, Regions with one and only DownPeer:


              
>> region --jq=".regions[] | {id: .id, remove_peer: [.peers[].store_id] | select(length>1) | map(if .==(30,31) then . else empty end) | select(length==1)}"{"id":12,"remove_peer":[30]} {"id":4,"remove_peer":[31]} {"id":22,"remove_peer":[30]} ...