| Data Source Type | Parameter Level | Read/Write | Applicable Scenario | Configuration | Description |
| --- | --- | --- | --- | --- | --- |
| MySQL / TDSQL-C MySQL | Node Level | Read | Single Table + Whole Database | scan.newly-added-table.enabled=true | Parameter Description: Enables detection of newly added tables after the task is paused and resumed; the default is false. 1. In full + incremental synchronization, a newly added table first reads its existing data and then its incremental data. 2. In incremental-only synchronization, a newly added table reads only incremental data. See the MySQL source sketch after this table. |
| | | Read | Single Table + Whole Database | scan.incremental.snapshot.chunk.size=20000 | Parameter Description: For tables with evenly distributed data, this parameter is the approximate number of rows per chunk, so the chunk count can be estimated as the total row count divided by the chunk size. The chunk count affects whether the JobManager runs out of memory (OOM); at 2 CU it can currently handle more than 100,000 chunks. If the data volume is very large, increase the chunk size to reduce the chunk count. Notes: For large tasks (e.g., more than 100 million rows in total, single records larger than 0.1 MB), 20,000 is generally recommended. |
| | | Read | Single Table + Whole Database | split-key.even-distribution.factor.upper-bound=10.0d | Parameter Description: During the MySQL existing-data (snapshot) phase, if the data is sparse and the maximum primary key value is extremely large, set this parameter to use uneven splitting. This reduces the excessive chunk count caused by the oversized primary key value and thus prevents JobManager OOM. Notes: The default is 10.0d and normally does not need to be changed. |
| | | Read | Single Table + Whole Database | debezium.query.fetch.size=0 | Parameter Description: Number of records fetched from the database per round trip. The default of 0 uses the JDBC default fetch size. Notes: 1. For large tasks (e.g., more than 100 million rows in total, single records larger than 0.1 MB) with a single read instance, 1024 is recommended. 2. With multiple read instances, lower this value to reduce memory consumption; 512 is suggested. |
| | | Read | Single Table + Whole Database | debezium.max.queue.size=8192 | Parameter Description: Sets the maximum number of events held in Debezium's internal queue. Once the limit is reached, Debezium pauses reading new events until the pending events are processed and committed, which prevents events from piling up in the queue and exhausting memory or degrading performance. The default is 8192. Notes: 1. For large tasks (e.g., more than 100 million rows in total, single records larger than 0.1 MB) with a single read instance, 4096 is recommended. 2. With multiple read instances, lower this value to reduce memory consumption; 1024 is suggested. |
| | Job Level | - | - | taskmanager.memory.managed.fraction=0.1 | Parameter Description: Adjusts the fraction of Flink TaskManager memory reserved as managed memory. See the job-level sketch after this table. |
| | | - | - | table.exec.sink.upsert-materialize=NONE | Parameter Description: In a distributed system, shuffling can reorder changelog events, so the data an upsert sink receives may be globally out of order. A materialization operator can therefore be added before the upsert sink; it consumes the upstream changelog and produces an upsert view for the downstream. This parameter controls whether that operator is added. Notes: 1. By default the operator is added when a unique key may be affected by distributed disorder; you can also disable materialization (NONE) or force it (FORCE). 2. Valid values: NONE, AUTO, FORCE. |
| | | - | - | table.exec.sink.not-null-enforcer=DROP | Parameter Description: Determines how the task handles a null value written to a NOT NULL field. Suggested values and behavior: 1. ERROR: throw a runtime exception when a NOT NULL field receives a null value. 2. DROP: discard the record when a NOT NULL field receives a null value. |
| DLC | Node Level | Write | Single Table + Whole Database | write.distribution-mode=hash | Parameter Description: DLC concurrent writing supports the values none, hash (the default), and range: 1. none: if a primary key exists, write concurrently by primary key; otherwise write single-threaded. 2. hash: if a partition field exists, write concurrently by partition field; otherwise fall back to the none strategy. 3. range: not supported yet; behaves the same as none. |
| Doris | Node Level | Write | Single Table Only | sink.properties.*=xxx | Parameter Description: Stream Load import parameters, e.g., 'sink.properties.column_separator' = ', '. For detailed configuration, see Flink Doris Connector. See also the Doris sink sketch after this table. |
| | Node Level | Write | Single Table Only | sink.properties.columns=xxx | Parameter Description: Configures the function mapping of columns, e.g., 'sink.properties.columns' = 'dt,page,user_id,user_id=to_bitmap(user_id)'. Reference: Doris Stream Load. |
| | Node Level | Write | Single Table + Whole Database | sink.batch.size=100000; sink.batch.bytes=83886080; sink.batch.interval=10s | Parameter Description: Improves write throughput to Doris. Notes: It is recommended to set the TaskManager size to 2 CU to avoid TaskManager OOM. |
| Oracle | Node Level | Read | Single Table + Whole Database | 'debezium.log.mining.strategy' = 'online_catalog' 'debezium.log.mining.continuous.mine' = 'true' | Parameter Description: Enabling these parameters reduces synchronization latency and redo log storage. Suitable for single-table synchronization and whole-database synchronization of specified tables. See the Oracle source sketch after this table. Notes: 1. Once set, newly added tables cannot be detected; if the task is configured to synchronize all tables or specified databases, data from new tables will not be read. 2. Oracle 19 versions do not support log.mining.continuous.mine; on those versions it must be set to false, otherwise the task fails. |
| | Node Level | Read | Single Table + Whole Database | debezium.lob.enabled=false | Parameter Description: Whether to synchronize BLOB-type data; the default is false. Notes: 1. Setting it to true may affect synchronization performance. 2. For Oracle, the recommended value is the default false. |
| MongoDB | Node Level | Read | Single Table Only | scan.incremental.snapshot.enabled=true | Parameter Description: Enables concurrent reads; the default is false. See the MongoDB source sketch after this table. Notes: Supported only on MongoDB 4.0 and above. |
| | Node Level | Read | Single Table Only | copy.existing=false | Parameter Description: Whether to copy existing data from the source collection: 1. true (default): read the full existing data first, then the increment. 2. false: read only incremental changes. |
| | Node Level | Read | Single Table Only | poll.await.time.ms | Parameter Description: Interval between change-event pulls; the default is 1500 ms. Notes: 1. For frequently changing collections, reduce the interval to improve processing timeliness. 2. For slowly changing collections, increase the interval to reduce database load. |
| | Node Level | Read | Single Table Only | poll.max.batch.size | Parameter Description: Maximum number of change events pulled per batch; the default is 1000. Notes: Increasing this value speeds up pulling change events from the cursor but increases memory overhead. |
| | Node Level | Read | Single Table Only | scan.incremental.snapshot.chunk.size.mb | Parameter Description: Chunk size for incremental snapshots, in MB; the default is 64 MB. |
| | Node Level | Read | Single Table Only | changelog.normalize.enabled | Parameter Description: Enables the ChangelogNormalize operator; the default is true (enabled). Notes: MongoDB change streams lack the -U (update-before) message; this operator supplements it at some performance cost. Disabling the operator improves transfer speed, but delete operations can then no longer be synchronized downstream; other operations are unaffected. |
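
The node-level MySQL read parameters above are passed as connector options on the source node. Below is a minimal Flink SQL sketch, assuming the mysql-cdc connector; the hostname, credentials, database, table, and column names are hypothetical placeholders.

```sql
-- Minimal sketch, assuming the Flink SQL mysql-cdc connector.
CREATE TABLE orders_source (
    id BIGINT,
    order_no STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'connector' = 'mysql-cdc',
    'hostname' = 'mysql.example.internal',            -- hypothetical
    'port' = '3306',
    'username' = 'sync_user',                         -- hypothetical
    'password' = '******',
    'database-name' = 'app_db',                       -- hypothetical
    'table-name' = 'orders',                          -- hypothetical
    -- Node-level read parameters from the table above:
    'scan.newly-added-table.enabled' = 'true',        -- detect tables added after pause/resume
    'scan.incremental.snapshot.chunk.size' = '20000', -- ~20,000 rows per chunk
    'debezium.query.fetch.size' = '1024',             -- single read instance
    'debezium.max.queue.size' = '4096'                -- single read instance
);
```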
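Job-level parameters apply to the whole job rather than a single node; in a Flink SQL job they are typically supplied in the job's Flink configuration, which the SQL client expresses as SET statements. A sketch using the values suggested above:

```sql
-- Job-level parameters, applied once per job rather than per node.
SET 'taskmanager.memory.managed.fraction' = '0.1';  -- TaskManager managed-memory fraction
SET 'table.exec.sink.upsert-materialize' = 'NONE';  -- NONE / AUTO / FORCE
SET 'table.exec.sink.not-null-enforcer' = 'DROP';   -- DROP or ERROR
```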
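The Doris write parameters can likewise be attached to the sink node. A minimal sketch, assuming the Flink Doris connector; the FE address, credentials, and table identifier are hypothetical placeholders.

```sql
-- Minimal sketch, assuming the Flink Doris connector.
CREATE TABLE orders_sink (
    dt STRING,
    page STRING,
    user_id BIGINT
) WITH (
    'connector' = 'doris',
    'fenodes' = 'doris-fe.example.internal:8030',  -- hypothetical
    'table.identifier' = 'app_db.orders',          -- hypothetical
    'username' = 'sync_user',                      -- hypothetical
    'password' = '******',
    -- Stream Load import parameters from the table above:
    'sink.properties.column_separator' = ',',
    'sink.properties.columns' = 'dt,page,user_id,user_id=to_bitmap(user_id)',
    -- Batch tuning from the table above:
    'sink.batch.size' = '100000',
    'sink.batch.bytes' = '83886080',
    'sink.batch.interval' = '10s'
);
```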
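For Oracle, the Debezium options above are passed through the source node. A minimal sketch, assuming the Flink SQL oracle-cdc connector; the hostname, credentials, database, schema, and table names are hypothetical placeholders (per the notes above, set log.mining.continuous.mine to false on Oracle 19).

```sql
-- Minimal sketch, assuming the Flink SQL oracle-cdc connector.
CREATE TABLE orders_source_oracle (
    ID BIGINT,
    ORDER_NO STRING,
    PRIMARY KEY (ID) NOT ENFORCED
) WITH (
    'connector' = 'oracle-cdc',
    'hostname' = 'oracle.example.internal',             -- hypothetical
    'port' = '1521',
    'username' = 'sync_user',                           -- hypothetical
    'password' = '******',
    'database-name' = 'ORCL',                           -- hypothetical
    'schema-name' = 'APP',                              -- hypothetical
    'table-name' = 'ORDERS',                            -- hypothetical
    -- Parameters from the table above (set to 'false' on Oracle 19):
    'debezium.log.mining.strategy' = 'online_catalog',
    'debezium.log.mining.continuous.mine' = 'true',
    'debezium.lob.enabled' = 'false'
);
```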
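The MongoDB read parameters follow the same pattern on the source node. A minimal sketch, assuming the Flink SQL mongodb-cdc connector; the hosts, credentials, database, and collection names are hypothetical placeholders, and the tuned poll values are illustrative.

```sql
-- Minimal sketch, assuming the Flink SQL mongodb-cdc connector.
CREATE TABLE events_source (
    _id STRING,
    payload STRING,
    PRIMARY KEY (_id) NOT ENFORCED
) WITH (
    'connector' = 'mongodb-cdc',
    'hosts' = 'mongo.example.internal:27017',      -- hypothetical
    'username' = 'sync_user',                      -- hypothetical
    'password' = '******',
    'database' = 'app_db',                         -- hypothetical
    'collection' = 'events',                       -- hypothetical
    -- Node-level read parameters from the table above:
    'scan.incremental.snapshot.enabled' = 'true',  -- concurrent reads (MongoDB >= 4.0)
    'copy.existing' = 'false',                     -- read incremental changes only
    'poll.await.time.ms' = '1000',                 -- pull interval, tuned down from 1500 ms
    'poll.max.batch.size' = '2000'                 -- events per batch, tuned up from 1000
);
```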