Parameters | Description |
Data Source | Select an available Elasticsearch data source from the current project. |
Index | Supports multiple index names or index name patterns. Patterns use the wildcard (*), e.g., index_*. |
ES Version | The Elasticsearch (ES) version, determined by the data source and index. |
Split Key | The field used to shard the data. Once it is specified, concurrent tasks are launched to synchronize the data. Any column of the source data can serve as the split key; the primary key or an indexed column is recommended. |
Search Condition (Optional) | A search condition written in JSON format. |
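As an illustration of the Search Condition field, the sketch below builds a filter and serializes it as JSON. It assumes the field accepts a standard Elasticsearch Query DSL body with a top-level `query` key, which this page does not state explicitly. The field names `is_deleted` and `create_time` are borrowed from the template example on this page, and the helper `build_search_condition` is hypothetical.

```python
import json


def build_search_condition(since: str) -> str:
    """Return a JSON search condition (hypothetical helper).

    Selects documents that are not deleted and were created at or after
    `since`. The Query DSL shape is an assumption about what the field accepts.
    """
    condition = {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"is_deleted": 0}},
                    {"range": {"create_time": {"gte": since}}},
                ]
            }
        }
    }
    return json.dumps(condition, indent=2)


search_condition = build_search_condition("2024-01-01 00:00:00")
print(search_condition)
```

Serializing with `json.dumps` rather than writing the string by hand keeps the condition valid JSON before it is pasted into the task configuration.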
Parameters | Description |
Data Destination | Select an available Elasticsearch data source from the current project. |
Index | Name of the target index in Elasticsearch. |
Dynamic Mapping | Whether the synchronization task uses Elasticsearch's dynamic mapping mechanism to add a mapping when an unknown field is found in a document. Enable: Keep Elasticsearch's automatic mappings. Disable (default): Generate and update the Elasticsearch mappings from the columns configured in the synchronization task. In Elasticsearch 7.x the default type is _doc, so when using Elasticsearch's automatic mappings, configure the type as _doc and set esVersion to 7. |
Clear original index data | Whether to clear the original index data before the import. No: Keep the existing data in the index and then import the new data. Yes: Delete the original index and create a new index with the same name before importing the new data. This removes all data in that index. |
Write mode | Supports two write modes. Insert: All data is inserted directly. Update: If a document with the same primary key exists, it is updated; otherwise the data is inserted. |
Primary Key Value Acquisition Method | Supports three methods. Source table primary key: Use the primary key of the source table as the document's _id. Composite primary key: Use multiple columns from the source table to determine the document's _id. No primary key: A default _id value is generated. |
Batch Submission Size | Number of records submitted in a single batch. A larger value can significantly reduce the number of network interactions between the data synchronization system and Elasticsearch and improve overall throughput, but a value that is too high may cause out-of-memory (OOM) exceptions in the data synchronization process. |
Advanced Settings (Optional) | You can configure parameters according to business needs. |
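To make the interaction between write mode, primary key method, and batch size concrete, the following sketch assembles request bodies in the format of the Elasticsearch Bulk API. It is not the synchronization tool's actual implementation: the function names, the `_` separator for composite keys, and the choice of the `index` action for insert mode and `update` with `doc_as_upsert` for update mode are all assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator, List, Optional, Sequence


def make_doc_id(row: dict, key_columns: Sequence[str]) -> Optional[str]:
    """Build a document id from one or more key columns.

    Returns None when no key columns are given, so Elasticsearch
    generates the _id itself. The "_" separator is an assumption.
    """
    if not key_columns:
        return None
    return "_".join(str(row[col]) for col in key_columns)


def bulk_lines(index: str, row: dict, key_columns: Sequence[str],
               write_mode: str) -> List[str]:
    """Return the two NDJSON lines (action, payload) for one record."""
    doc_id = make_doc_id(row, key_columns)
    if write_mode == "update":
        if doc_id is None:
            raise ValueError("update mode needs a primary key")
        action = {"update": {"_index": index, "_id": doc_id}}
        # doc_as_upsert inserts the document when the id does not exist yet.
        payload = {"doc": row, "doc_as_upsert": True}
    else:
        meta = {"_index": index}
        if doc_id is not None:
            meta["_id"] = doc_id
        action = {"index": meta}
        payload = row
    return [json.dumps(action), json.dumps(payload)]


def batches(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size rows."""
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_bulk_bodies(index: str, rows: Iterable[dict],
                      key_columns: Sequence[str], write_mode: str,
                      batch_size: int) -> List[str]:
    """Return one newline-terminated Bulk API body per batch."""
    bodies = []
    for batch in batches(rows, batch_size):
        lines: List[str] = []
        for row in batch:
            lines.extend(bulk_lines(index, row, key_columns, write_mode))
        bodies.append("\n".join(lines) + "\n")
    return bodies


# Hypothetical sample data: three rows, composite key (id, lang), batches of 2.
rows = [
    {"id": 1, "lang": "en", "word": "alpha"},
    {"id": 2, "lang": "en", "word": "beta"},
    {"id": 3, "lang": "de", "word": "gamma"},
]
bodies = build_bulk_bodies("my_index", rows, ["id", "lang"], "update", 2)
print(len(bodies))
```

Each body holds every record of its batch in memory at once, which is why a very large batch size trades fewer requests for a higher memory footprint.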
Elasticsearch data types | Internal Types |
byte, short, integer, long, unsigned_long | Long |
float, double, half_float | Double |
string, text, keyword, integer_range, long_range, float_range, double_range, date_range, array, object, nested, flattened, geo_point, geo_shape | String |
date | Date |
binary | Bytes |
boolean | Boolean |
Internal Types | Elasticsearch data types |
Long | byte, short, integer, long |
Double | float, double |
String | string, text, keyword, object, nested, geo_point, geo_shape, ip, binary, completion |
Date | date |
Boolean | boolean |
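The two conversion tables above are not symmetric: some Elasticsearch types can be read but have no entry on the write side, and vice versa. The sketch below encodes both tables as plain dictionaries and computes the difference. The dictionary and function names are hypothetical; the type lists are copied from the tables.

```python
# Elasticsearch type -> internal type when reading (first table).
READ_MAPPING = {
    "Long": ["byte", "short", "integer", "long", "unsigned_long"],
    "Double": ["float", "double", "half_float"],
    "String": ["string", "text", "keyword", "integer_range", "long_range",
               "float_range", "double_range", "date_range", "array", "object",
               "nested", "flattened", "geo_point", "geo_shape"],
    "Date": ["date"],
    "Bytes": ["binary"],
    "Boolean": ["boolean"],
}

# Internal type -> Elasticsearch types when writing (second table).
WRITE_MAPPING = {
    "Long": ["byte", "short", "integer", "long"],
    "Double": ["float", "double"],
    "String": ["string", "text", "keyword", "object", "nested", "geo_point",
               "geo_shape", "ip", "binary", "completion"],
    "Date": ["date"],
    "Boolean": ["boolean"],
}


def internal_type_for_read(es_type: str) -> str:
    """Look up the internal type an Elasticsearch field is read as."""
    for internal, es_types in READ_MAPPING.items():
        if es_type in es_types:
            return internal
    raise KeyError(f"no read mapping listed for {es_type!r}")


def writable_es_types(internal_type: str) -> list:
    """List the Elasticsearch types an internal type can be written to."""
    return WRITE_MAPPING.get(internal_type, [])


read_types = {t for types in READ_MAPPING.values() for t in types}
write_types = {t for types in WRITE_MAPPING.values() for t in types}

# Types listed on only one side of the conversion.
read_only = sorted(read_types - write_types)
write_only = sorted(write_types - read_types)
print("read only:", read_only)
print("write only:", write_only)
```

Note that `binary` appears in both tables but under different internal types: it is read as Bytes and written from String.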
PUT _template/merlion_suggest_words
{
  "template": "merlion_suggest_words_*",
  "order": 1,
  "mappings": {
    "properties": {
      "biz_owner": {
        "type": "keyword"
      },
      "create_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||date_hour_minute_second||strict_date_optional_time||epoch_millis"
      },
      "is_deleted": {
        "type": "short"
      },
      "sug_words": {
        "type": "keyword"
      },
      "language": {
        "type": "keyword"
      }
    }
  },
  "settings": {
    "refresh_interval": "5s",
    "number_of_replicas": 1,
    "number_of_shards": 1
  }
}
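As a rough companion to the template above, this sketch checks which index names the pattern `merlion_suggest_words_*` would cover and builds a sample document whose `create_time` uses the first listed date format, `yyyy-MM-dd HH:mm:ss`. The helper names and sample values are hypothetical. Python's `fnmatchcase` is used only as an approximation of Elasticsearch's wildcard matching; it also interprets `?` and `[]`, which does not matter for this particular pattern.

```python
from datetime import datetime
from fnmatch import fnmatchcase

TEMPLATE_PATTERN = "merlion_suggest_words_*"


def template_applies(index_name: str) -> bool:
    """Approximate check of whether the template pattern covers an index name."""
    return fnmatchcase(index_name, TEMPLATE_PATTERN)


def make_document(biz_owner: str, sug_words: str, language: str,
                  created: datetime, is_deleted: bool = False) -> dict:
    """Build a document with the fields declared in the template mapping."""
    return {
        "biz_owner": biz_owner,
        # Matches the "yyyy-MM-dd HH:mm:ss" format in the mapping.
        "create_time": created.strftime("%Y-%m-%d %H:%M:%S"),
        # The mapping declares is_deleted as short, so use 0 or 1.
        "is_deleted": 1 if is_deleted else 0,
        "sug_words": sug_words,
        "language": language,
    }


doc = make_document("team_a", "hello", "en", datetime(2024, 1, 2, 3, 4, 5))
print(template_applies("merlion_suggest_words_2024"), doc)
```

An index named exactly `merlion_suggest_words` is not covered, because the pattern requires the trailing underscore before the wildcard.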