
Advanced Parameters for Offline Node
Last updated: 2024-11-01 17:33:23

Parameter Description

| Offline Type | Read/Write | Configuration Contents | Applicable Scenario | Description |
| --- | --- | --- | --- | --- |
| MySQL | Read | splitFactor=5 | Single table | - |
| TDSQL MySQL | Read | splitFactor=5 | Single table | - |
| Doris | Read | query_timeout=604800 | Single table | Query timeout, in seconds |
| Doris | Read | exec_mem_limit=4294967296 | Single table | Execution memory limit, restricting memory usage during query execution (in bytes; 4294967296 = 4 GB) |
| Doris | Read | parallel_fragment_exec_instance_num=8 | Single table | Number of instances used to execute parallel fragments |
| Hive | Read | mapreduce.job.queuename=root.default | Single table | Specifies the queue the job is submitted to |
| Hive | Read | hive.execution.engine=mr | Single table | Hive configuration parameter specifying the execution engine for Hive queries. The default is currently mr and does not need to be changed. Note: when setting advanced parameters, mapreduce.job.queuename=root.default and hive.execution.engine=mr must be used together; setting only one of them does not take effect (see the sketch after this table). |
| DLC | Read | fs.cosn.trsf.fs.ofs.data.transfer.thread.count=8 | Single table | DLC concurrent writing; supports the values none, hash, and range:<br>1. none: if a primary key exists, write concurrently by primary key; otherwise write single-threaded.<br>2. hash: if a partition field exists, write concurrently by partition field; otherwise fall back to the none strategy.<br>3. range: not supported yet; behaves the same as none. |
| DLC | Read | fs.cosn.trsf.fs.ofs.prev.read.block.count=4 | Single table | Enables small-file merging in DLC (can also be enabled in the DLC console). Defaults to false. Whole-database sync provides a switch for this in its interface; single-table sync requires setting this parameter manually. |
| MongoDB | Read | batchSize=1000 | Single table | Number of records read per batch |
| COS | Write | splitFileSize=134217728 | Single table | Split size of a single file (in bytes; 134217728 = 128 MB). Not effective for Hive on COS. Supports the text, orc, and parquet file types. |
| COS | Write | hadoopConfig={} | Single table | Supports adding configurations to hadoopConfig |
| HDFS | Write | splitFileSize=134217728 | Single table | Split size of a single file (in bytes; 134217728 = 128 MB). Not effective for Hive on HDFS. Supports the text, orc, and parquet file types. |
| Hive | Write | compress=none/snappy/lz4/bzip2/gzip/deflate | Single table | Defaults to none. Valid only for the textfile format, not for orc/parquet (orc/parquet compression must be specified in the CREATE TABLE statement). |
| Hive | Write | format=orc/parquet | Single table | Format of the temporary HDFS files; defaults to orc. Unrelated to the format of the final Hive table. |
| Hive | Write | partition=static | Single table | Static partition mode. Suitable for writing to a single partition; uses less memory. |
| Doris | Write | sameNameWildcardColumn=true | Single table | In MySQL-to-Doris configuration, the wildcard column (*) supports mapping fields by matching names |
| Doris | Write | `loadProps={"format":"csv","column_separator":"\\x01","row_delimiter":"\\x03"}` | Single table | Writes in CSV format, which performs better than the default JSON format. Must be used together with the row delimiter `\\x03`. |
| DLC | Write | fs.cosn.trsf.fs.ofs.data.transfer.thread.count=8 | Single table | DLC concurrent writing; supports the values none, hash, and range:<br>1. none: if a primary key exists, write concurrently by primary key; otherwise write single-threaded.<br>2. hash: if a partition field exists, write concurrently by partition field; otherwise fall back to the none strategy.<br>3. range: not supported yet; behaves the same as none. |
| DLC | Write | fs.cosn.trsf.fs.ofs.prev.read.block.count=4 | Single table | Enables small-file merging in DLC (can also be enabled in the DLC console). Defaults to false. Whole-database sync provides a switch for this in its interface; single-table sync requires setting this parameter manually. |
| MongoDB | Write | replaceKey=id | Single table | When the write mode is Overwrite, this field is used as the business primary key for the update |
| MongoDB | Write | batchSize=2000 | Single table | Number of records written per batch; defaults to 1000 if not set |
| Elasticsearch | Write | compression=true | Single table | Whether to enable compression for HTTP requests |
| Elasticsearch | Write | multiThread=true | Single table | Whether to use multi-threading for HTTP requests |
| Elasticsearch | Write | ignoreWriterError | Single table | Ignores write errors without retrying and continues writing |
| Elasticsearch | Write | ignoreParseError=false | Single table | Ignores data format parsing errors and continues writing |
| Elasticsearch | Write | alias | Single table | An Elasticsearch alias is similar to a database view: after creating an alias my_index_alias for the index my_index, operations on my_index_alias behave the same as operations on my_index. Configuring alias creates an alias for the specified index after the data import completes. |
| Elasticsearch | Write | aliasMode=append | Single table | Alias mode applied after the data import completes: append (add mode) adds the current index to the alias mapping (one alias maps to multiple indices); exclusive (keep only this one) deletes the alias first and then adds the current index to it (one alias maps to one index). The alias is subsequently resolved to the actual index name. Aliases can be used for index migration and unified queries across multiple indices, and can implement a view-like feature. |
| Elasticsearch | Write | nullToDate=null | Single table | Converts null values for date-type fields, filling them with null |
| Kafka | Write | kafkaConfig={} | Single table | Supports Kafka producer configuration options (see the sketch after the Metadata Field table below) |
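The note on the Hive reader above states that the two Hive read parameters only take effect when set together. A minimal sketch of that pairing as it might be entered in the advanced-parameter configuration (plain key=value lines; root.default is simply the example queue name from the table, not a required value):

```
# Both lines must be present; setting only one of them has no effect
mapreduce.job.queuename=root.default
hive.execution.engine=mr
```
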
| Metadata Field | Read/Write | Configuration Contents |
| --- | --- | --- |
| Kafka | Read | `__key__`: the message key<br>`__value__`: the complete content of the message<br>`__partition__`: the partition of the current message<br>`__headers__`: the headers of the current message<br>`__offset__`: the offset of the current message<br>`__timestamp__`: the timestamp of the current message |
| Elasticsearch | Read | `_id`: supports obtaining the `_id` information |
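
For the Kafka writer's kafkaConfig parameter listed in the parameter table above, the value is a JSON object of Kafka producer options. A minimal, illustrative sketch, assuming standard producer settings are accepted (acks, compression.type, and linger.ms are ordinary Kafka producer options chosen here only as examples, not values mandated by this page):

```
kafkaConfig={"acks":"all","compression.type":"lz4","linger.ms":"5"}
```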

Configuration Method




