Parameter name | Parameter description |
Source Data Source Name | ${datasource_name_di_src} |
Source Database Name | ${db_name_di_src} |
Source Table Name | ${table_name_di_src} |
Source Schema Name | ${schema_name_di_src} Note: Applicable only to data source types with a schema attribute, such as PostgreSQL and Oracle. |
Source Topic Name | ${topic_di_src} Note: Applicable only to Kafka type. |
Data Fields | ${key} Note: Applicable only to Kafka type; replace key with the actual field name. |
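To illustrate how the built-in `${...}` variables in the table above might be used, here is a minimal sketch of placeholder substitution. This is not the product's actual resolver, and the sample variable values are hypothetical; only the variable names come from the table.

```python
import re

# Hypothetical runtime values for the built-in variables listed above.
context = {
    "datasource_name_di_src": "mysql_prod",
    "db_name_di_src": "orders_db",
    "table_name_di_src": "orders",
}

def substitute(template: str, ctx: dict) -> str:
    """Replace ${var} placeholders with values from ctx; unknown variables are kept as-is."""
    return re.sub(r"\$\{(\w+)\}", lambda m: ctx.get(m.group(1), m.group(0)), template)

print(substitute("ods_${db_name_di_src}_${table_name_di_src}", context))
# ods_orders_db_orders
```

Leaving unknown placeholders untouched (rather than raising) mirrors how template variables are typically surfaced for later diagnosis.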
Serial Number | Key Feature | Feature Overview | Feature Example |
1 | Target Database/Table Name | The system generates target database/table names according to the configured matching rules and checks whether each database/table already exists in the target data source: Match failed: no database/table object matching the rules exists in the target data source; the list highlights failed matches by default. Match succeeded: a database/table object matching the rules exists in the target data source. | |
2 | Target Table Creation Method | For database/table objects in the target data source, the system offers several table creation strategies: Match failed: supports the Create New Table and Do Not Create methods. Create New Table: the target table is created automatically in this batch, using the DDL generated from the conversion. Do Not Create: the table is skipped in this operation. Match succeeded: supports the Use Existing Table and Delete Existing Table and Create a New One methods. Use Existing Table: no table is created in this operation; the existing table is used. Delete Existing Table and Create a New One: the existing table is dropped and a table of the same name is recreated with the DDL generated from the conversion. | |
3 | Preview/Edit Table Statement | For the chosen table creation method, the system automatically generates a DDL example, which you can view and edit: 1. The target table name is generated from the source table name and the target table matching strategy. 2. Target table fields default to matching the source table fields. 3. For target data sources with multiple data models (e.g., Doris), the system selects a model according to the default policy. You can manually modify the DDL statement to fit your business needs. | |
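The default matching behavior described in the table above can be sketched as a small planning step. This is an assumed illustration of the logic, not the product's implementation; the function and strategy names are hypothetical.

```python
# Derive each target table name from the source name via a matching rule, then
# pick a default creation strategy based on whether the target already exists.
def plan_table_creation(source_tables, existing_targets, rule=lambda t: t):
    plan = {}
    for src in source_tables:
        target = rule(src)
        if target in existing_targets:
            # Match succeeded: default to using the existing table.
            plan[target] = "use_existing"
        else:
            # Match failed: default to creating a new table from generated DDL.
            plan[target] = "create_new"
    return plan

plan = plan_table_creation(
    ["orders", "users"],
    existing_targets={"ods_orders"},
    rule=lambda t: f"ods_{t}",
)
print(plan)
# {'ods_orders': 'use_existing', 'ods_users': 'create_new'}
```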
Parameter | Description |
DDL Message | This parameter sets how DDL change messages captured from the source during synchronization are passed downstream, and how the target responds to them. The target provides the following response strategies: 1. Automatic Change: the target automatically follows the source's structural changes, including automatic table creation, automatic column addition, etc. 2. Ignore Changes: the target ignores DDL change messages and neither applies nor reports them. 3. Log Alerts: the target ignores DDL change messages, but the logs record their details. 4. Task Error: if a DDL change occurs at the source, the task fails and restarts continuously, reporting errors. Note: Sources and targets differ in the DDL types and message handling they support. Refer to the configuration strategies supported by each link. |
Metadata Writing | When the DDL strategy for new tables is set to Automatic Table Creation, you can choose whether to write metadata. Selecting metadata fields creates those fields in the target table and writes the corresponding system metadata during synchronization. |
Write Exceptions | This parameter sets how the task handles data that fails to write, for reasons such as table structure mismatch or field type mismatch, and whether the data flow is interrupted. The write exception strategies are: 1. Partial Stop: if some tables have write exceptions, only those tables stop writing; other tables synchronize normally. Stopped tables cannot resume writing in this task run. 2. Abnormal Restart: if some tables have write exceptions, all tables stop writing. The task restarts continuously until all tables synchronize normally, which may cause duplicate writes for some tables during the restart period. 3. Ignore Exception: exception data that cannot be written is ignored and marked as dirty data. Other data in that table, and other tables in the task, synchronize normally. Dirty data offers two schemes: COS Archiving and Do Not Archive. |
Dirty Data | When Write Exceptions is set to Ignore Exception, you can choose whether to archive the ignored data: 1. COS Archiving: exception data is archived to a COS file. This prevents loss of the data and facilitates later analysis of the write failures and data recovery. 2. Do Not Archive: the task ignores and discards the exception data entirely. |
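The four DDL response strategies above can be sketched as a simple dispatcher. The function, strategy identifiers, and error type below are hypothetical names for illustration, not the product's API.

```python
class DDLChangeError(RuntimeError):
    """Raised under the Task Error strategy when a source DDL change is detected."""
    pass

def on_ddl_message(event: str, strategy: str, log: list):
    """Dispatch a captured DDL change message per the configured strategy (a sketch)."""
    if strategy == "automatic_change":
        return "apply"                       # follow the source's structural change
    if strategy == "ignore_changes":
        return "skip"                        # drop silently, no response or notice
    if strategy == "log_alert":
        log.append(f"ignored DDL: {event}")  # skip, but record details in the log
        return "skip"
    if strategy == "task_error":
        raise DDLChangeError(event)          # the task errors out and restarts
    raise ValueError(f"unknown strategy: {strategy}")

log = []
print(on_ddl_message("ALTER TABLE t ADD COLUMN c INT", "log_alert", log))
print(log)
# skip
# ['ignored DDL: ALTER TABLE t ADD COLUMN c INT']
```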
Parameter | Description |
Checkpoint Interval | The maximum checkpoint interval for the current task submission. |
Maximum Restart Attempts | Sets the maximum restart threshold for the task when a fault occurs during execution. If the number of restarts exceeds this threshold, the task status is set to Failed. The valid range is [-1, 100]: a threshold of 0 means no restarts; -1 means no limit on the number of restarts.
Parameter | Sets task-level runtime parameters. The supported task-level parameters vary by source and destination.
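The restart threshold semantics above can be sketched as follows; the function name is an illustration, not a product identifier.

```python
# Range is [-1, 100]: 0 means never restart, -1 means unlimited restarts.
def should_restart(restart_count: int, max_restarts: int) -> bool:
    if not (-1 <= max_restarts <= 100):
        raise ValueError("max_restarts must be in [-1, 100]")
    if max_restarts == -1:
        return True                          # no limit on restarts
    return restart_count < max_restarts      # 0 => never restart

print(should_restart(5, -1), should_restart(0, 0), should_restart(3, 3))
# True False False
```

Once `should_restart` returns False, the task status would be set to Failed per the description above.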
Serial number | Parameter | Description |
1 | Submit | Submits the current task to the production environment. At submission, different running strategies are available depending on whether the task already has a production version: If the task has no effective online version, either because this is the first submission or the online task is in a Failed state, it can be submitted directly. If an online version is in a Running or Suspended state, a strategy must be chosen: stopping the online job discards the previous runtime position and starts consuming data from the beginning, while keeping the job state continues from the last consumed position after restart. Note: Click Start Now to run the task immediately after submission; otherwise it must be started manually. |
2 | Lock/Unlock | By default, the creator is the first lock holder, allowing only the lock holder to edit task configurations and run tasks. If the lock holder does not make an edit operation within 5 minutes, others can click the icon to grab the lock, and successful lock grabbing allows for editing operations. |
3 | Go to Operations | Quickly jump to the Task Operation and Maintenance Page based on the current task name. |
4 | Save | After completing the preview, click Save to save the whole-database task configuration. Saving alone does not submit the task to the Operations Center. |
Steps | Step Instructions | |
Task Configuration Detection | This step checks the read end, write end, and resources of the task: Detection passed: the configuration is correct. Detection failed: configuration issues exist and must be fixed before continuing. Detection with alert: the check provides system-recommended modifications. After modifying, click Retry to re-run the check, or click Ignore Exception to proceed to the next step without blocking subsequent configuration. The currently supported check items are listed in the table below. | |
Submission Strategy Selection | In this step, choose the submission strategy for the task: First submission: the initial submission supports synchronizing from a default or specified position. Start immediately, syncing from the default position: if the source is configured for full + incremental reading, it first synchronizes existing data (full phase) and then consumes the binlog for changed data (incremental phase); if the source is set to incremental-only reading, it starts from the latest binlog position by default. Start immediately, syncing from a specified point in time: the task synchronizes according to the configured time and time zone. If the specified position is not found, the task defaults to synchronizing from the earliest binlog position; if the source's read mode is full + incremental, the task skips the full phase and starts from the specified time in the incremental phase. Not starting now: the task does not start after submission and can be started manually later from the Operations and Maintenance list. Not the first submission: supports starting or continuing tasks in a running state. Continue running: after the new version is submitted, the task continues from the last synchronization position. Restart from a specified position: you specify the read start position; the task ignores the old version and reads from that position. If the specified position is not found, the task defaults to synchronizing from the earliest binlog position. Restart from the default position: the system starts reading from the default position according to the source configuration: with full + incremental reading, it first synchronizes existing data (full phase) and then consumes the binlog for changed data (incremental phase); with incremental-only reading, it starts from the latest binlog position. The submission and execution strategies supported vary by task status; see the table below. Each submission also generates a new real-time task version, and you can add a version description in the dialog. | |
Submitting the job | After a successful submission, you can click Go to Operations and Maintenance to check the task execution status. | |
Detection Classification | Check Items | Description |
Task Configuration Detection | Source Configuration | Check whether mandatory items in the source configuration are missing |
| Destination Configuration | Check whether mandatory items in the destination configuration are missing |
| Mapping Relationship Configuration | Check whether field mapping has been configured |
| Resource Group Configuration | Check whether the resource group is configured |
Data Source Detection | Source Connectivity Detection | Check whether the source data source and the task configuration resource group have network connectivity. If the detection fails, you can view the diagnostic information. After resolving the network issue, you can recheck. Otherwise, the task is likely to fail. |
| Destination Connectivity Detection | Check whether the destination data source and the task configuration resource group have network connectivity. If the detection fails, you can view the diagnostic information. After resolving the network issue, you can recheck. Otherwise, the task is likely to fail. |
Resource Detection | Resource Status Detection | Check whether the resource group is in an available status. If the resource status is unavailable, please replace the task configuration resource group. Otherwise, the task is likely to fail. |
| Resource Margin Detection | Check whether the current remaining resources in the resource group meet the task configuration resource requirements. If the detection fails, please appropriately reduce the task resource configuration or expand the resource group. |
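The detection flow above, with blocking failures versus ignorable alerts, could be modeled as follows. The check names and return shape are assumptions for illustration, not the product's internals.

```python
def run_checks(checks):
    """checks: iterable of (name, fn); fn returns (ok, blocking, message).
    A failed blocking check blocks submission; a failed non-blocking one only alerts."""
    results, can_proceed = [], True
    for name, fn in checks:
        ok, blocking, msg = fn()
        if ok:
            status = "passed"
        elif blocking:
            status, can_proceed = "failed", False
        else:
            status = "alert"   # can be skipped via "Ignore Exception"
        results.append((name, status, msg))
    return results, can_proceed

results, ok = run_checks([
    ("Source Connectivity", lambda: (True, True, "")),
    ("Resource Margin", lambda: (False, False, "insufficient headroom")),
])
print(ok, results)
# True [('Source Connectivity', 'passed', ''), ('Resource Margin', 'alert', 'insufficient headroom')]
```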
Task Status | Submission Strategies | Description |
1. First Submission 2. Stopped/Detect Anomalies/Initialization (Not the First Submission) | Start Now, Synchronize from Default Position | Under this policy, reading will start from the default position based on the source configuration. If the source is configured to "full + incremental" reading, it will by default first synchronize the existing data (full phase), and after completion, consume binlog to obtain changed data (incremental phase); if the source is set to "incremental only", it will by default start reading from the latest position of binlog. |
| Start Now, Synchronize from Designated Time Point | Under this strategy, a specific start time must be selected to match the position by time: 1. The task reads from the designated time point; if no matching position is found, it defaults to synchronizing from the earliest binlog position. 2. If the source read mode is Full + Incremental, this strategy skips the full phase and starts synchronizing from the designated time point in the incremental phase. |
| Do Not Start Yet, Manually Start the Task Later in Real-Time Task Operation and Maintenance | Under this strategy, the task will only be submitted to real-time operation and maintenance without starting the task. Subsequent batch task starts can be performed from the real-time operation and maintenance page. |
Running (not the first submission) | Continue Running, Retain Job Status Data, Continue Running from Last Synchronization Position | Under this strategy, the new version of the task, once submitted, will continue running from the last synchronization position. |
| Restart from a specified time point and continue running | Under this strategy, you can specify the read start location. The task will ignore the old version and start reading from the specified location. If the designated time location is not found, the task will default to synchronizing from the earliest binlog location. |
| Restart and stop the running task, discarding its state, and start running from the default location | This policy will stop the currently running task and discard the task state, then start reading from the default point according to the source end configuration. If the source end is configured for "Full + Increment" reading mode, it will first synchronize the existing data (full stage). Once completed, it will consume binlog to get changed data (increment stage). If the source end is configured for "Increment Only" reading, it will start reading from the latest binlog point by default. |
Paused (not the first submission) | Continue Running, Retain Job Status Data, Continue Running from Last Synchronization Position | Under this strategy, the new version of the task, once submitted, continues running from the last synchronization position. Note: A pause operation creates a snapshot, so the resubmitted task can continue from the last checkpoint. A forced pause does not create a snapshot; the resubmitted task continues from the last snapshot taken during normal operation, which may cause partial data replay. If the target write mode is Append, duplicate data will appear; if it is Upsert, there is no duplication. |
| Restart from a specified time point and continue running | Under this strategy, you can specify the read start location. The task will ignore the old version and start reading from the specified location. If the designated time location is not found, the task will default to synchronizing from the earliest binlog location. |
| Restart and stop the running task, discarding its state, and start running from the default location | This policy will stop the currently running task and discard the task state, then start reading from the default point according to the source end configuration. If the source end is configured for "Full + Increment" reading mode, it will first synchronize the existing data (full stage). Once completed, it will consume binlog to get changed data (increment stage). If the source end is configured for "Increment Only" reading, it will start reading from the latest binlog point by default. |
Failed (not the first submission) | Resume operation from the last failed checkpoint | This policy continues running from the point where the task last failed. |
| Restart and run from the default point according to the task reading configuration | This policy will read from the default point according to the source end configuration. If the source end is configured for "Full + Increment" reading mode, it will first synchronize the existing data (full stage). Once completed, it will consume binlog to get changed data (increment stage). If the source end is configured for "Increment Only" reading, it will start reading from the latest binlog point by default. |
In Progress (not the first submission) | Not supported | When there is an online task with the same name and its status is in progress, resubmitting the task is not supported |
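The status/strategy table above can be condensed into a start-position resolver. This is a sketch; the strategy names and return values are shorthand for illustration, not product identifiers.

```python
def resolve_start_position(read_mode, strategy, specified_time=None, last_checkpoint=None):
    """Pick where the task starts reading, mirroring the table above."""
    if strategy == "continue":                  # keep state, resume from last sync position
        return ("checkpoint", last_checkpoint)
    if strategy == "from_time":                 # match binlog position by time;
        return ("binlog_time", specified_time)  # falls back to earliest if not found
    # "default" strategy: the position depends on the source read mode
    if read_mode == "full_incremental":
        return ("full_then_binlog", None)       # full phase first, then incremental
    return ("binlog_latest", None)              # incremental-only: latest binlog position

print(resolve_start_position("full_incremental", "default"))
print(resolve_start_position("incremental_only", "continue", last_checkpoint="ckpt-42"))
# ('full_then_binlog', None)
# ('checkpoint', 'ckpt-42')
```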