Data Sync Guide

Last updated: 2024-07-08 19:02:56
    DTS allows you to sync both full and incremental data from the source database to CKafka, so that you can quickly obtain and use business change data. This document describes how to use DTS to sync data from TDSQL for MySQL to CKafka.
    Currently, TDSQL for MySQL is the only supported source database type.

    Prerequisites

    The source and target databases must meet the requirements for the sync feature and version as instructed in Databases Supported by Data Sync.
    Source database permissions required for the sync task account:
    GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT, REPLICATION SLAVE, SELECT ON *.* TO 'migration account'@'%' IDENTIFIED BY 'migration password';
    FLUSH PRIVILEGES;
    You need to modify the message retention period and message size limit in the target CKafka instance.
    We recommend setting the message retention period to 3 days. Data beyond the retention period is cleared, so be sure to consume the data within the period you set. The message size limit is the maximum size of a single message that CKafka can receive. You must set it to be greater than the maximum size of a single row of data in the source database tables so that data can be delivered to CKafka normally.
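    Both settings are normally adjusted in the CKafka console. Purely as an illustration of the Kafka-level configs they correspond to (topic-level retention.ms and max.message.bytes), here is a sketch using the open-source kafka-python admin client; the endpoint and topic name are placeholders, and it assumes your instance accepts config changes over the Kafka protocol.

    from kafka.admin import KafkaAdminClient, ConfigResource, ConfigResourceType

    # Placeholder endpoint and topic; adjust to your environment.
    admin = KafkaAdminClient(bootstrap_servers="ckafka-host:9092")
    admin.alter_configs([
        ConfigResource(
            ConfigResourceType.TOPIC,
            "dts-sync-topic",
            configs={
                "retention.ms": str(3 * 24 * 60 * 60 * 1000),  # 3-day retention
                "max.message.bytes": str(8 * 1024 * 1024),     # > largest source row
            },
        )
    ])
    admin.close()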

    Directions

    1. Log in to the data sync task purchase page, select appropriate configuration items, and click Buy Now.
    Billing Mode: Monthly subscription and pay-as-you-go billing modes are supported.
    Source Instance Type: Select TDSQL for MySQL, which cannot be changed after purchase.
    Source Instance Region: Select the source instance region, which cannot be changed after purchase.
    Target Instance Type: Select Kafka, which cannot be changed after purchase.
    Target Instance Region: Select the target instance region, which cannot be changed after purchase.
    Specification: Select a specification based on your business needs. The higher the specification, the higher the performance. For more information, see Billing Overview.
    2. After making the purchase, return to the data sync task list to view the task you just created. Then, click Configure in the Operation column to enter the Configure Sync Task page.
    3. On the Configure Sync Task page, configure Instance ID, Account, and Password for the source instance, configure Instance ID for the target instance, test connectivity, and click Next.
    Task Configuration
    Task Name: DTS will automatically generate a task name, which is customizable.
    Running Mode: Immediate execution and scheduled execution are supported.
    Source Instance Settings
    Source Instance Type: The source database type selected during purchase, which cannot be changed.
    Source Instance Region: The source instance region selected during purchase, which cannot be changed.
    Access Type: Select a type based on your scenario. In this scenario, you can only select Database.
    Account/Password: Enter the source database account and password.
    Target Instance Settings
    Target Instance Type: The target instance type selected during purchase, which cannot be changed.
    Target Instance Region: The target instance region selected during purchase, which cannot be changed.
    Access Type: Select a type based on your scenario. In this scenario, select CKafka instance.
    Instance ID: Select the instance ID of the target instance.
    4. On the Set sync options and objects page, set the following items: Data Initialization Option, Policy for Syncing Data to Kafka, Data Sync Option, and Sync Object Option. Then click Save and Go Next.
    Data Initialization Option
    Initialization Type: Both of the following options are selected by default, and you can deselect them as needed.
    Structure initialization: Table structures in the source instance will be initialized into the target instance before the sync task runs.
    Full data initialization: Data in the source instance will be initialized into the target instance before the sync task runs. If you select Full data initialization only, you need to create the table structures in the target database in advance.
    Format of Data Delivered to Kafka: Avro adopts a binary format with higher consumption efficiency, while JSON adopts an easier-to-use lightweight text format.
    Policy for Syncing Data to Kafka
    Topic Sync Policy:
    Deliver to custom topic: Customize the topic name for delivery. The target Kafka will automatically create a topic with the custom name, and the synced data is delivered randomly to different partitions under that topic. If the target Kafka fails to create the topic, the task will report an error.
    Deliver to a single topic: Select an existing topic on the target side, and then deliver data based on one of several partitioning policies. Data can be delivered to a single partition of the specified topic, or distributed across partitions by table name or by table name + primary key.
    Rules for delivering to custom topic: If you add multiple rules, the database and table rules are matched one by one from top to bottom. If no rule is matched, data is delivered to the topic corresponding to the last rule. If multiple rules are matched, data is delivered to the topics of all matched rules (see the sketch after the examples below).
    Example 1: There are tables named "Student" and "Teacher" in a database named "Users" on database instance X. To deliver the data in the "Users" database to a topic named "Topic_A", configure the rules as follows:
    Enter Topic_A for Topic Name, ^Users$ for Database Name Match, and .* for Table Name Match.
    Enter Topic_default for Topic Name, Databases that don't match the above rules for Database Name Match, and Tables that don't match the above rules for Table Name Match.
    Example 2: With the same instance, to deliver the data in the "Student" and "Teacher" tables to the topics "Topic_A" and "Topic_default" respectively, configure the rules as follows:
    Enter Topic_A for Topic Name, ^Users$ for Database Name Match, and ^Student$ for Table Name Match.
    Enter Topic_default for Topic Name, Databases that don't match the above rules for Database Name Match, and Tables that don't match the above rules for Table Name Match.
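    To make the matching behavior concrete, here is a minimal sketch (not DTS's actual implementation) of the semantics described above: rules are checked top to bottom, every matching rule receives the data, and the console's "don't match the above rules" entry acts as a fallback. The function and variable names are hypothetical.

    import re

    # Ordered rules as configured in Example 2. The console's final
    # "Databases/Tables that don't match the above rules" entry is modeled
    # as a fallback topic used only when no explicit rule matches.
    RULES = [
        {"topic": "Topic_A", "db": r"^Users$", "table": r"^Student$"},
    ]
    FALLBACK_TOPIC = "Topic_default"

    def topics_for(db_name, table_name):
        """Return the topics of every matching rule, top to bottom;
        if nothing matches, fall back to the default rule."""
        matched = [r["topic"] for r in RULES
                   if re.match(r["db"], db_name) and re.match(r["table"], table_name)]
        return matched or [FALLBACK_TOPIC]

    print(topics_for("Users", "Student"))  # ['Topic_A']
    print(topics_for("Users", "Teacher"))  # ['Topic_default']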
    Rules for delivering to a single topic: After you select a specified topic, the system partitions data based on the chosen policy as follows (see the code sketch after this setting list).
    Deliver all data to partition 0: Deliver all the synced data of the source database to the first partition.
    By table name: Partition the synced data from the source database by table name. After this is set, data with the same table name is written to the same partition.
    By table name + primary key: Partition the synced data from the source database by table name and primary key. This policy is suitable for frequently accessed data: rows are distributed from hot tables across different partitions by table name and primary key, which improves concurrent consumption efficiency.
    Topic for DDL Storage: (Optional) To deliver DDL operations of the source database to a separate topic, select one here. If set, DDL statements are delivered to partition 0 of the selected topic by default; if not set, they are delivered based on the topic rules selected above.
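    For intuition, the three partitioning policies can be sketched as follows. This is an illustration only: DTS's real hash function and the topic's partition count are not documented here, so zlib.crc32 and NUM_PARTITIONS are assumptions.

    import zlib

    NUM_PARTITIONS = 8  # assumed partition count of the chosen topic

    def partition_for(table, primary_key=None, policy="table"):
        """Pick a partition in the spirit of the three policies above."""
        if policy == "partition0":
            return 0  # deliver everything to the first partition
        if policy == "table":
            key = table  # same table -> same partition
        else:  # "table_pk": spread a hot table's rows across partitions
            key = f"{table}:{primary_key}"
        # A stable hash keeps identical keys on the same partition across runs.
        return zlib.crc32(key.encode()) % NUM_PARTITIONS

    print(partition_for("Student", policy="table"))
    print(partition_for("Student", primary_key=1001, policy="table_pk"))
    print(partition_for("Student", primary_key=1002, policy="table_pk"))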
    Data Sync Option
    SQL Type: The following operations are supported: INSERT, DELETE, UPDATE, and DDL.
    Sync Object Option
    Database and Table Objects of Source Instance: Only database and table objects can be synced.
    5. On the task verification page, complete the verification. After all check items pass, click Start Task. If the verification fails, fix the problem as instructed in Check Item Overview and initiate the verification again.
    Failed: A check item has failed and the task is blocked. You need to fix the problem and run the verification task again.
    Alarm: A check item doesn't completely meet the requirements. The task can continue, but the business may be affected; based on the alarm message, assess whether to ignore the alarm or fix the problem before continuing.
    6. Return to the data sync task list, and you can see that the task has entered the Running status.
    Note
    You can click More > Stop in the Operation column to stop a sync task. Before doing so, ensure that data sync has been completed.
    7. (Optional) You can click a task name to enter the task details page and view the task initialization status and monitoring data.

    Subsequent Operations

    After the data is synced to the target Kafka, it can be consumed. We provide you with a consumption demo for reference.
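    As a starting point, the sketch below consumes JSON-formatted change events from the sync topic with the open-source kafka-python client. The endpoint, topic, and consumer group are placeholders, and it assumes you chose JSON as the delivery format in step 4.

    import json
    from kafka import KafkaConsumer

    # Placeholder endpoint, topic, and consumer group; adjust to your environment.
    consumer = KafkaConsumer(
        "dts-sync-topic",
        bootstrap_servers="ckafka-host:9092",
        group_id="dts-demo-consumer",
        auto_offset_reset="earliest",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        # Each record carries one change event synced from the source database.
        print(message.topic, message.partition, message.offset, message.value)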