Migrate MySQL-Compatible Databases to TiDB Cloud Using Data Migration
This document describes how to migrate a MySQL-compatible database, either hosted on a cloud provider (Amazon Aurora MySQL, Amazon Relational Database Service (RDS), or Google Cloud SQL for MySQL) or self-hosted, to TiDB Cloud using the Data Migration feature of the TiDB Cloud console.
This feature helps you migrate your source databases' existing data and ongoing changes to TiDB Cloud (either in the same region or across regions) in a single migration job.
If you want to migrate incremental data only, see Migrate Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration.
Limitations
The Data Migration feature is available only for TiDB Dedicated clusters.
The Data Migration feature is only available to clusters that are created in certain regions after November 9, 2022. If your project was created before that date or if your cluster is in another region, this feature is not available to your cluster and the Data Migration tab is not displayed on the cluster overview page in the TiDB Cloud console.
Amazon Aurora MySQL writer instances support both existing data and incremental data migration. Amazon Aurora MySQL reader instances only support existing data migration and do not support incremental data migration.
You can create up to 200 migration jobs for each organization. To create more migration jobs, you need to file a support ticket.
The system databases are filtered out and not migrated to TiDB Cloud even if you select all of the databases to migrate. That is, mysql, information_schema, performance_schema, and sys are not migrated using this feature.
During existing data migration, if the table to be migrated already exists in the target database with duplicated keys, the rows with duplicate keys are replaced.
During incremental data migration, if the table to be migrated already exists in the target database with duplicated keys, an error is reported and the migration is interrupted. In this situation, verify that the upstream data is accurate. If it is, click the "Restart" button of the migration job, and the migration job replaces the conflicting downstream records with the upstream records.
When you delete a cluster in TiDB Cloud, all migration jobs in that cluster are automatically deleted and not recoverable.
During incremental replication (migrating ongoing changes to your cluster), if the migration job recovers from an abrupt error, it might enable safe mode for 60 seconds. During safe mode, INSERT statements are migrated as REPLACE, and UPDATE statements are migrated as DELETE and REPLACE. These transactions are then migrated to the downstream cluster to make sure that all the data affected by the abrupt error is migrated to the downstream cluster. In this scenario, for upstream tables without primary keys or not-null unique indexes, some data might be duplicated in the downstream cluster because the data might be inserted repeatedly.
When you use Data Migration, it is recommended to keep the size of your dataset smaller than 1 TiB. If the dataset size is larger than 1 TiB, the existing data migration will take a long time due to limited specifications.
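To make the safe mode rewriting described above more concrete, the following is a minimal sketch of how an upstream statement and its safe mode equivalent relate. The orders table and its columns are hypothetical and exist only for illustration; the actual statements generated by Data Migration are not shown in this document.

```sql
-- Hypothetical upstream table, used only for illustration.
CREATE TABLE orders (
  id     INT PRIMARY KEY,
  status VARCHAR(16) NOT NULL
);

-- Upstream statement:
--   INSERT INTO orders (id, status) VALUES (1, 'new');
-- Safe mode equivalent applied downstream:
REPLACE INTO orders (id, status) VALUES (1, 'new');

-- Upstream statement:
--   UPDATE orders SET status = 'paid' WHERE id = 1;
-- Safe mode equivalent applied downstream:
DELETE FROM orders WHERE id = 1;
REPLACE INTO orders (id, status) VALUES (1, 'paid');
```

Because REPLACE overwrites a row that has the same primary key, replaying these statements more than once still leaves a single row for id = 1. On a table without a primary key or unique index, REPLACE behaves like a plain INSERT, which is why repeated replay can produce duplicate rows.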
In the following scenarios, if the migration job takes longer than 24 hours, do not purge binary logs in the source database to ensure that Data Migration can get consecutive binary logs for incremental replication:
- During existing data migration.
- After the existing data migration is completed and when incremental data migration is started for the first time, the latency is not 0ms.
Prerequisites
Before performing the migration, you need to check the data sources, prepare privileges for upstream and downstream databases, and set up network connections.
Make sure your data source and version are supported
Data Migration supports the following data sources and versions:
- MySQL 5.6, 5.7, and 8.0, either self-hosted or on a public cloud provider. Note that MySQL 8.0 is still experimental on TiDB Cloud and might have incompatibility issues.
- Amazon Aurora (MySQL 5.6 and 5.7)
- Amazon RDS (MySQL 5.7)
- Google Cloud SQL for MySQL 5.6 and 5.7
Grant required privileges to the upstream database
The username you use for the upstream database must have all the following privileges:
Privilege | Scope |
---|---|
SELECT | Tables |
LOCK TABLES | Tables |
REPLICATION SLAVE | Global |
REPLICATION CLIENT | Global |
For example, you can use the following GRANT statement to grant corresponding privileges:
GRANT SELECT, LOCK TABLES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'your_user'@'your_IP_address_of_host';
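After granting the privileges, you might want to confirm that they took effect before creating a migration job. The following is a minimal sketch that reuses the placeholder account name from the statement above; run it on the upstream database with an account that is allowed to view other users' grants.

```sql
-- List the privileges currently held by the migration account.
-- Replace the placeholders with the real user name and host.
SHOW GRANTS FOR 'your_user'@'your_IP_address_of_host';

-- When connected as the migration account itself, this shows
-- which account the server actually authenticated you as.
SELECT CURRENT_USER();
```

The output of SHOW GRANTS should include SELECT, LOCK TABLES, REPLICATION SLAVE, and REPLICATION CLIENT on *.* for the account.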
Grant required privileges to the downstream TiDB Cloud cluster
The username you use for the downstream TiDB Cloud cluster must have the following privileges:
Privilege | Scope |
---|---|
CREATE | Databases, Tables |
SELECT | Tables |
INSERT | Tables |
UPDATE | Tables |
DELETE | Tables |
ALTER | Tables |
DROP | Databases, Tables |
INDEX | Tables |
TRUNCATE | Tables |
For example, you can execute the following GRANT statement to grant corresponding privileges:
GRANT CREATE, SELECT, INSERT, UPDATE, DELETE, ALTER, TRUNCATE, DROP, INDEX ON *.* TO 'your_user'@'your_IP_address_of_host';
To quickly test a migration job, you can use the root account of the TiDB Cloud cluster.
Set up network connection
Before creating a migration job, set up the network connection according to your connection method. See Connect to Your TiDB Dedicated Cluster.
If you use public IP (that is, standard connection) for network connection, make sure that the upstream database can be connected through the public network.
If you use AWS PrivateLink, set it up according to Connect to TiDB Dedicated via Private Endpoint with AWS.
If you use Google Cloud Private Service Connect, set it up according to Connect to TiDB Dedicated via Private Endpoint with Google Cloud.
If you use AWS VPC Peering or Google Cloud VPC Network Peering, see the following instructions to configure the network.
Set up AWS VPC Peering
If your MySQL service is in an AWS VPC, take the following steps:
Set up a VPC peering connection between the VPC of the MySQL service and your TiDB cluster.
Modify the inbound rules of the security group that the MySQL service is associated with.
You must add the CIDR of the region where your TiDB Cloud cluster is located to the inbound rules. Doing so allows the traffic to flow from your TiDB cluster to the MySQL instance.
If the MySQL URL contains a DNS hostname, you need to allow TiDB Cloud to resolve the hostname of the MySQL service.
- Follow the steps in Enable DNS resolution for a VPC peering connection.
- Enable the Accepter DNS resolution option.
Set up Google Cloud VPC Network Peering
If your MySQL service is in a Google Cloud VPC, take the following steps:
If your MySQL service is self-hosted, you can skip this step and proceed to the next step. If your MySQL service is Google Cloud SQL, you must expose a MySQL endpoint in the associated VPC of the Google Cloud SQL instance. You might need to use the Cloud SQL Auth proxy developed by Google.
Set up a VPC peering connection between the VPC of your MySQL service and your TiDB cluster.
Modify the ingress firewall rules of the VPC where MySQL is located.
You must add the CIDR of the region where your TiDB Cloud cluster is located to the ingress firewall rules. This allows the traffic to flow from your TiDB cluster to the MySQL endpoint.
Enable binary logs
To perform incremental data migration, make sure you have enabled binary logs of the upstream database, and the binary logs have been kept for more than 24 hours.
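As a minimal sketch of how to check these two conditions on a self-hosted MySQL source, the statements below inspect whether binary logging is enabled and how long logs are retained. Which retention variable applies depends on the MySQL version, and on managed services such as Amazon RDS or Aurora the retention period is typically controlled through provider-specific settings instead, so treat this as a starting point and consult your provider's documentation.

```sql
-- Binary logging is enabled when the reported value is ON.
SHOW VARIABLES LIKE 'log_bin';

-- Retention setting; only one of these is normally relevant.
-- A value of 0 means that logs are not purged automatically.
SHOW VARIABLES LIKE 'expire_logs_days';            -- MySQL 5.6 and 5.7, in days
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';  -- MySQL 8.0, in seconds

-- List the binary log files that are still available on the source.
SHOW BINARY LOGS;
```

A retention value that corresponds to more than 24 hours (for example, at least 2 days or more than 86400 seconds) is in line with the requirement above.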
Step 1: Go to the Data Migration page
Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
Click the name of your target cluster to go to its overview page, and then click Data Migration in the left navigation pane.
On the Data Migration page, click Create Migration Job in the upper-right corner. The Create Migration Job page is displayed.
Step 2: Configure the source and target connection
On the Create Migration Job page, configure the source and target connection.
Enter a job name, which must start with a letter and must be less than 60 characters. Letters (A-Z, a-z), numbers (0-9), underscores (_), and hyphens (-) are acceptable.
Fill in the source connection profile.
- Data source: the data source type.
- Region: the region of the data source, which is required for cloud databases only.
- Connectivity method: the connection method for the data source. Currently, you can choose public IP, VPC Peering, or Private Link according to your connection method.
- Hostname or IP address (for public IP and VPC Peering): the hostname or IP address of the data source.
- Service Name (for Private Link): the endpoint service name.
- Port: the port of the data source.
- Username: the username of the data source.
- Password: the password of the username.
- SSL/TLS: if you enable SSL/TLS, you need to upload the certificates of the data source, including any of the following:
    - only the CA certificate
    - the client certificate and client key
    - the CA certificate, client certificate, and client key
Fill in the target connection profile.
- Username: enter the username of the target cluster in TiDB Cloud.
- Password: enter the password of the TiDB Cloud username.
Click Validate Connection and Next to validate the information you have entered.
Take action according to the message you see:
- If you use Public IP or VPC Peering, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any).
- If you use AWS Private Link, you are prompted to accept the endpoint request. Go to the AWS VPC console, and click Endpoint services to accept the endpoint request.
Step 3: Choose migration job type
In the Choose the objects to be migrated step, you can choose existing data migration, incremental data migration, or both.
Migrate existing data and incremental data
To migrate data to TiDB Cloud once and for all, choose both Existing data migration and Incremental data migration, which ensures data consistency between the source and target databases.
Migrate only existing data
To migrate only existing data of the source database to TiDB Cloud, choose Existing data migration.
Migrate only incremental data
To migrate only the incremental data of the source database to TiDB Cloud, choose Incremental data migration. In this case, the migration job does not migrate the existing data of the source database to TiDB Cloud, but only migrates the ongoing changes of the source database that are explicitly specified by the migration job.
For detailed instructions about incremental data migration, see Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration.
Step 4: Choose the objects to be migrated
On the Choose Objects to Migrate page, select the objects to be migrated. You can click All to select all objects, or click Customize and then click the checkbox next to the object name to select the object.
If you click All, the migration job migrates the existing data from the whole source database instance to TiDB Cloud and migrates ongoing changes after the full migration. Note that this happens only if you have selected both the Existing data migration and Incremental data migration checkboxes in the previous step.
If you click Customize and select some databases, the migration job migrates the existing data and ongoing changes of the selected databases to TiDB Cloud. Note that this happens only if you have selected both the Existing data migration and Incremental data migration checkboxes in the previous step.
If you click Customize and select some tables under a dataset name, the migration job migrates only the existing data and ongoing changes of the selected tables. Tables created afterwards in the same database will not be migrated.
Click Next.
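When deciding which objects to select, it can help to see which databases exist on the source and roughly how large they are. The following is a minimal sketch that you could run on the source database. The sizes come from information_schema statistics and are approximations only, and the excluded schema names are the standard MySQL system schemas.

```sql
-- Approximate table count and size per user database on the source.
SELECT table_schema,
       COUNT(*) AS table_count,
       ROUND(SUM(data_length + index_length) / POW(1024, 3), 2) AS approx_size_gib
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
GROUP BY table_schema
ORDER BY approx_size_gib DESC;
```

The totals also give a rough indication of whether the selected objects stay under the 1 TiB dataset size recommended in the limitations.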
Step 5: Precheck
On the Precheck page, you can view the precheck results. If the precheck fails, resolve the issues described in the Failed or Warning details, and then click Check again to recheck.
If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically go on to the next step.
For more information about errors and solutions, see Precheck errors and solutions.
For more information about precheck items, see Migration Task Precheck.
If all check items show Pass, click Next.
Step 6: Choose a spec and start migration
On the Choose a Spec and Start Migration page, select an appropriate migration specification according to your performance requirements. For more information about the specifications, see Specifications for Data Migration.
After selecting the spec, click Create Job and Start to start the migration.
Step 7: View the migration progress
After the migration job is created, you can view the migration progress on the Migration Job Details page. The migration progress is displayed in the Stage and Status area.
You can pause or delete a migration job when it is running.
If a migration job has failed, you can resume it after solving the problem.
You can delete a migration job in any status.
If you encounter any problems during the migration, see Migration errors and solutions.
Scale a migration job specification
TiDB Cloud supports scaling up or down a migration job specification to meet your performance and cost requirements in different scenarios.
Different specifications deliver different migration performance. Your performance requirements might also vary at different stages. For example, during the existing data migration, you want the migration to be as fast as possible, so you choose a large specification, such as 8 RCU. Once the existing data migration is completed, the incremental migration does not require such high performance, so you can scale down the job specification, for example, from 8 RCU to 2 RCU, to save cost.
When scaling a migration job specification, note the following:
- It takes about 5 to 10 minutes to scale a migration job specification.
- If the scaling fails, the job specification remains the same as it was before the scaling.
Limitations
- You can only scale a migration job specification when the job is in the Running or Paused status.
- TiDB Cloud does not support scaling a migration job specification during the existing data export stage.
- Scaling a migration job specification will restart the job. If a source table of the job does not have a primary key, duplicate data might be inserted.
- During scaling, either avoid purging the binary log of the source database or temporarily increase expire_logs_days of the upstream database. Otherwise, the job might fail because it cannot get the continuous binary log position.
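Two of the limitations above can be checked or prepared for on the source database before scaling. The following is a minimal sketch for a self-hosted MySQL 5.6 or 5.7 source. The retention value of 7 days is an example only, MySQL 8.0 uses binlog_expire_logs_seconds instead, and managed services usually expose retention through their own settings.

```sql
-- 1. Find tables without a PRIMARY KEY or UNIQUE constraint.
--    These tables are the ones at risk of duplicate rows after a restart.
--    Note: this does not check whether unique index columns are NOT NULL.
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON  c.table_schema = t.table_schema
  AND c.table_name   = t.table_name
  AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_name IS NULL;

-- 2. Check the current retention, then raise it temporarily if needed.
--    A current value of 0 already means "never purge automatically",
--    so only run SET GLOBAL when the current value is non-zero and too low.
SHOW VARIABLES LIKE 'expire_logs_days';
SET GLOBAL expire_logs_days = 7;  -- example value, an assumption
```

Remember to restore the original retention value after the scaling has finished.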
Scaling procedure
Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
Click the name of your target cluster to go to its overview page, and then click Data Migration in the left navigation pane.
On the Data Migration page, locate the migration job you want to scale. In the Action column, click ... > Scale Up/Down.
In the Scale Up/Down window, select the new specification you want to use, and then click Submit. You can view the new price of the specification at the bottom of the window.