streaming_load_max_mb.
streaming_load_max_mb to 16000.

Parameter | Description |
Data Source | Available Doris data source to be synchronized. |
Database | Supports selecting or manually entering the name of the database to read from. The database bound to the data source is used by default; other databases must be entered manually. If the data source is not reachable over the network and the database information cannot be fetched directly, enter the database name manually. Data synchronization can still run as long as the Data Integration network is connected. |
Table | Supports selecting or manually entering the table name to be read. |
Split Key | Specifies the field used to shard the data. Once it is specified, concurrent tasks are launched to synchronize the data. You can use any column of the source table as the split key; the primary key or an indexed column is recommended. |
Filter Conditions (Optional) | In practice, it is common to synchronize only the current day's data by setting the WHERE condition to gmt_create>$bizdate. A WHERE condition is an effective way to perform incremental synchronization. If the WHERE condition is left empty (including when its key or value is missing), the task synchronizes the full data set. |
Advanced Settings (Optional) | You can configure parameters according to business needs. |
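To illustrate how a split key allows concurrent reads, the following minimal sketch divides a numeric key range into contiguous, non-overlapping sub-ranges, each of which could be read by a separate task. The function names `split_ranges` and `shard_where` and the even-split strategy are assumptions made for illustration; the actual sharding algorithm used by the synchronization task is not described in this document.

```python
def split_ranges(min_key: int, max_key: int, shards: int) -> list[tuple[int, int]]:
    """Split the inclusive range [min_key, max_key] into at most `shards` inclusive sub-ranges."""
    if shards < 1:
        raise ValueError("shards must be >= 1")
    if min_key > max_key:
        return []
    total = max_key - min_key + 1
    shards = min(shards, total)          # never create empty shards
    base, extra = divmod(total, shards)  # the first `extra` shards get one more key
    ranges = []
    start = min_key
    for i in range(shards):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def shard_where(column: str, lo: int, hi: int) -> str:
    """Build a hypothetical WHERE fragment that one concurrent task would use."""
    return f"{column} >= {lo} AND {column} <= {hi}"


if __name__ == "__main__":
    for lo, hi in split_ranges(1, 10, 3):
        print(shard_where("id", lo, hi))
```

An evenly distributed primary key or indexed column gives shards of similar size, which is why such columns are the recommended choice.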
Parameter | Description |
Data Destination | Doris data source to be written into. |
Database | Supports selecting or manually entering the name of the database to write to. The database bound to the data source is used by default; other databases must be entered manually. If the data source is not reachable over the network and the database information cannot be fetched directly, enter the database name manually. Data synchronization can still run as long as the Data Integration network is connected. |
Table | Supports selecting or manually entering the name of the table to write to. If the data source is not reachable over the network and the table information cannot be fetched directly, enter the table name manually. Data synchronization can still run as long as the Data Integration network is connected. |
Table Overwriting | When enabled, the target table is overwritten atomically at the table level. Before data is written, a new table with the same structure is created with a CREATE TABLE LIKE statement. The new data is imported into the new table, which is then atomically swapped with the old table to complete the overwrite. |
Maximum Number of Rows to Submit Each Time | Maximum number of records in a single batch submission. |
Maximum Bytes per Submission | Maximum data volume, in bytes, of a single batch submission. |
Line Separator (Optional) | The line delimiter used for Doris write operations; the default is '\n'. Manual input is supported. It must be consistent with the delimiter of the created Doris table; otherwise, the written data cannot be found in the Doris table. |
Pre-Executed SQL | The SQL statement executed before the synchronization task. Enter SQL that is valid for the data source type, for example clearing old data from the table before the task runs (truncate table tablename). |
Post-Executed SQL | The SQL statement executed after the synchronization task. Enter SQL that is valid for the data source type, for example adding a timestamp column (alter table tablename add colname timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP). |
Advanced Settings | You can configure parameters according to business needs. |
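The two batch limits above (maximum rows and maximum bytes per submission) can be thought of as working together: a batch is submitted as soon as either limit would be exceeded. The sketch below illustrates that idea under stated assumptions; `batch_rows` is a hypothetical helper, rows are modeled as already-serialized strings, and the writer's real flushing logic is not documented here.

```python
from typing import Iterable, Iterator


def batch_rows(rows: Iterable[str], max_rows: int, max_bytes: int) -> Iterator[list[str]]:
    """Group rows into batches, flushing before a row would exceed either limit.

    A single row larger than max_bytes is still emitted, in a batch of its own.
    """
    batch: list[str] = []
    batch_bytes = 0
    for row in rows:
        size = len(row.encode("utf-8"))  # size is measured in UTF-8 bytes
        if batch and (len(batch) >= max_rows or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(row)
        batch_bytes += size
    if batch:
        yield batch  # flush the final partial batch


if __name__ == "__main__":
    for b in batch_rows(["aa", "bb", "cc", "dd", "ee"], max_rows=2, max_bytes=100):
        print(b)
```

Larger limits mean fewer submissions but more memory per batch, so the byte limit also has to fit within whatever the Doris side accepts for a single load.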
Doris data type | Internal Types |
TINYINT,SMALLINT,INT,BIGINT | Long |
FLOAT,DOUBLE,DECIMAL | Double |
VARCHAR,CHAR,ARRAY,STRUCT,STRING | String |
DATE,DATETIME | Date |
BOOLEAN | Boolean |
Internal Types | Doris data type |
Long | TINYINT,SMALLINT,INT,BIGINT |
Double | DOUBLE,FLOAT,DECIMAL |
String | STRING,VARCHAR,CHAR,ARRAY,STRUCT |
Date | DATETIME,DATE |
Boolean | BOOLEAN |
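The type mapping tables above can be expressed as a simple lookup. The sketch below is only an illustration of how a declared Doris column type resolves to an internal type; the dictionary name `DORIS_TO_INTERNAL` and the function `internal_type` are hypothetical, and the handling of length or element suffixes such as `VARCHAR(200)` or `ARRAY<INT>` is an assumption.

```python
# Mapping taken from the type conversion tables above.
DORIS_TO_INTERNAL = {
    "TINYINT": "Long", "SMALLINT": "Long", "INT": "Long", "BIGINT": "Long",
    "FLOAT": "Double", "DOUBLE": "Double", "DECIMAL": "Double",
    "VARCHAR": "String", "CHAR": "String", "ARRAY": "String",
    "STRUCT": "String", "STRING": "String",
    "DATE": "Date", "DATETIME": "Date",
    "BOOLEAN": "Boolean",
}


def internal_type(doris_type: str) -> str:
    """Return the internal type for a Doris column type such as 'VARCHAR(200)'."""
    # Strip precision/length "(...)" and element type "<...>" suffixes (assumed notation).
    base = doris_type.strip().upper().split("(", 1)[0].split("<", 1)[0].strip()
    try:
        return DORIS_TO_INTERNAL[base]
    except KeyError:
        raise ValueError(f"unsupported Doris type: {doris_type!r}") from None


if __name__ == "__main__":
    for t in ("BIGINT", "DECIMAL(10, 2)", "varchar(200)", "ARRAY<INT>", "DATETIME"):
        print(t, "->", internal_type(t))
```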
The table 'tbl1' has a partition column 'k1' of type DATE. Create a dynamic partition rule that partitions by day, keeps only the partitions of the last 7 days, and pre-creates partitions for the next 3 days.

CREATE TABLE tbl1
(
    k1 DATE,
    ...
)
PARTITION BY RANGE(k1) ()
DISTRIBUTED BY HASH(k1)
PROPERTIES
(
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start" = "-7",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "32"
);

Assuming the current date is 2020-05-29, according to the above rule, 'tbl1' will have the following partitions:

p20200529: ["2020-05-29", "2020-05-30")
p20200530: ["2020-05-30", "2020-05-31")
p20200531: ["2020-05-31", "2020-06-01")
p20200601: ["2020-06-01", "2020-06-02")

On the next day, 2020-05-30, a new partition 'p20200602' is created: ["2020-06-02", "2020-06-03").

On 2020-06-06, because dynamic_partition.start is set to -7, partitions older than 7 days are deleted, i.e., partition 'p20200529' is deleted.
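The partition arithmetic in this example can be checked with a short sketch. It reproduces only the date calculations of the example (daily partitions, `start = -7`, `end = 3`, prefix `p`); the helper names are hypothetical, and this is not how the Doris scheduler itself is implemented.

```python
from datetime import date, timedelta


def partition_name(day: date, prefix: str = "p") -> str:
    """Name of the daily partition that starts on `day`, e.g. p20200529."""
    return f"{prefix}{day:%Y%m%d}"


def partitions_to_create(today: date, end: int, prefix: str = "p") -> list[tuple[str, str, str]]:
    """Partitions covering today through today + `end` days as (name, lower, upper)."""
    result = []
    for offset in range(end + 1):
        day = today + timedelta(days=offset)
        upper = day + timedelta(days=1)  # range is [lower, upper)
        result.append((partition_name(day, prefix), day.isoformat(), upper.isoformat()))
    return result


def is_expired(partition_day: date, today: date, start: int) -> bool:
    """True if the partition starts before today + `start` days (start is negative)."""
    return partition_day < today + timedelta(days=start)


if __name__ == "__main__":
    for name, lower, upper in partitions_to_create(date(2020, 5, 29), end=3):
        print(f'{name}: ["{lower}", "{upper}")')
    print(is_expired(date(2020, 5, 29), date(2020, 6, 6), start=-7))
```

With the current date set to 2020-05-29 this yields the four partitions listed above, and on 2020-06-06 the partition starting 2020-05-29 falls before the cutoff of 2020-05-30.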
show tablet 28750963;
SHOW PROC '/dbs/40637/16934967/partitions/28750944/16934968/28750963';
Reason: column_name[uuid], the length of input is too long than schema. first 32 bytes of input str: [0000000000000BB4E595527BE******] schema length: 200; actual length: 232; . src line [];
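The error above reports a schema length of 200 and an actual length of 232. One way to catch such rows before loading is to compare each value's encoded size with the column length. The sketch below assumes the limit is counted in UTF-8 bytes (as the "first 32 bytes" wording suggests), so multi-byte characters count more than once; `oversized_values` is a hypothetical helper, not part of any Doris tooling.

```python
def oversized_values(values: list[str], max_bytes: int) -> list[tuple[str, int]]:
    """Return (value, byte_length) for every value whose UTF-8 size exceeds max_bytes."""
    result = []
    for value in values:
        size = len(value.encode("utf-8"))  # byte length, not character count
        if size > max_bytes:
            result.append((value, size))
    return result


if __name__ == "__main__":
    samples = ["a" * 200, "a" * 232, "中" * 70]  # 200, 232 and 210 bytes
    for value, size in oversized_values(samples, max_bytes=200):
        print(f"{value[:16]}... is {size} bytes, limit is 200")
```

Values flagged this way either need to be truncated at the source or the target column needs a larger declared length.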