WeData lets users define custom metadata collection tasks. Administrators must collect metadata from a data source before it can be managed visually. Collection granularity goes down to the database level, and only one collection task can be created per database. Collection tasks run and update metadata on the configured schedule; they can also be run manually, edited, and otherwise managed.
Supported Data Source Types
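Metadata collection currently supports the following data source types:
| Category | Data Source Type |
| --- | --- |
| Big data | Hive |
| Big data | HBase |
| Big data | DLC |
| Big data | ClickHouse |
| Big data | TCHouse-C |
| Big data | Iceberg |
| Big data | Greenplum |
| Big data | Doris |
| Big data | StarRocks |
| Big data | TCHouse-D |
| Relational database | MySQL |
| Relational database | Tencent Cloud MySQL |
| Relational database | PostgreSQL |
| Relational database | Oracle |
| Relational database | SQL Server |
| Relational database | TCHouse-P |
| Relational database | TDSQL-PostgreSQL |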
Collection Task Settings
Create a collection task
1. Go to the Data Discovery > Metadata Collection page and click Create Collection Task.
2. On the Create Collection Task page, select Hive as the data source type.
3. On the Set Collection Object page, fill in the following parameters, then click Next.
Note:
Each collection task can be bound to only one data source in the WeData project, and a data source cannot be bound to more than one collection task.
| Parameter | Description |
| --- | --- |
| Task Name | Required. The name must start with a letter or a Chinese character and can contain letters, Chinese characters, digits, minus signs (-), and underscores (_). |
| Description | Optional. A description of the collection task. |
| Affiliated Project | The project associated with the data source. Management permission for the data source is bound to this project. |
| Data Source | The name of the data source for the collection task, which can be viewed in the project management module. |
| Database | The database to collect. A database can belong to only one collection task, so databases that are already being collected cannot be selected. |
| Data Table | The data tables to collect. |
| Designated Table Owner | The user who holds management permission for the collected data tables. |
| Task Owner | The user responsible for the task, who can view, stop, start, and rerun it, view its logs and details, and modify task information. |
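The naming rule and the one-task-per-database rule above can be checked before a task is submitted. The following is a minimal sketch, not WeData's implementation: `is_valid_task_name` and `CollectionRegistry` are hypothetical names, and it assumes that "letters" means ASCII letters and that "Chinese characters" means the range U+4E00 to U+9FFF.

```python
import re

# Assumption: ASCII letters and the CJK Unified Ideographs range U+4E00..U+9FFF.
# The product may accept a wider set of characters.
_TASK_NAME = re.compile(r"[A-Za-z\u4e00-\u9fff][A-Za-z0-9\u4e00-\u9fff_-]*")


def is_valid_task_name(name):
    """Return True if the name is non-empty and follows the naming rule."""
    return _TASK_NAME.fullmatch(name) is not None


class CollectionRegistry:
    """Hypothetical in-memory record of which task collects which database."""

    def __init__(self):
        # Maps (data source, database) to the name of the task that collects it.
        self._task_by_database = {}

    def register(self, task_name, data_source, database):
        """Record a new task, rejecting bad names and already-collected databases."""
        if not is_valid_task_name(task_name):
            raise ValueError(f"invalid task name: {task_name!r}")
        key = (data_source, database)
        if key in self._task_by_database:
            owner = self._task_by_database[key]
            raise ValueError(f"database {database!r} is already collected by {owner!r}")
        self._task_by_database[key] = task_name
```

For example, `is_valid_task_name("hive_daily-01")` is accepted, while an empty name or a name that starts with a digit is rejected.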
Configure Collection Plan
Configure the cycle, date, and time at which the metadata collection task runs.
Collection Cycle: The current version supports hourly, daily, weekly, monthly, and one-time collection tasks.
Collection Date: For weekly and monthly tasks, you can specify one or more collection dates, and the task runs as planned on those days. For example, if you set the collection dates to the 1st, 5th, and 31st, the task runs on the 1st, 5th, and 31st of each month.
Execution Time: The specific time when the task is executed.
Run Now: When enabled, a collection is triggered immediately after the task configuration is completed.
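To make the monthly plan concrete, the sketch below lists the run times for one month from a set of collection dates and an execution time. `monthly_run_times` is a hypothetical helper, not a WeData API. It assumes that a configured day that does not exist in a given month (for example, the 31st in April) is skipped; the documentation does not state how WeData handles that case.

```python
import calendar
from datetime import date, datetime, time


def monthly_run_times(year, month, days, at):
    """Return the run times in one month for the configured collection dates.

    Assumption: days that do not exist in the month are skipped.
    """
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return [
        datetime.combine(date(year, month, day), at)
        for day in sorted(set(days))
        if 1 <= day <= last_day
    ]


# Collection dates 1, 5, and 31 with an execution time of 02:30.
april_runs = monthly_run_times(2024, 4, [1, 5, 31], time(2, 30))
```

Under the stated assumption, `april_runs` contains two entries because April has no 31st, whereas the same plan yields three entries for January.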
Collection task list
The collection task list shows all collection tasks that belong to the current user, including the task name, collection object, type, project, and creator. From the list you can view task details and logs, and edit, delete, or transfer a task.
| Column | Description |
| --- | --- |
| Task Name | Name of the collection task |
| Type | Data source type of the collection task |
| Data Source | Data source of the collection task |
| Capture Database | Database collected by the collection task |
| Task Owner | Account name of the current task owner |
| Created by | Account name that created the collection task |
| Creation Time | Time when the collection task was created |
| Collection Plan | Run cycle of the collection task |
| Operating status | Running status of the collection task |
| Recent execution time | Date (YYYY-MM-DD) and time of the most recent run of the collection task |
| Operation | View task details and logs, and edit, delete, or transfer the task. |
Run a collection task
Both one-time and periodic tasks can be run manually, provided the task is not currently running.
Edit a collection task
When a task is not running, you can edit its project, data source, update method, deletion method, and collection plan. The data source can be changed to another data source of the same type, and the task then collects from the most recently bound collection object.
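The run and edit rules above share one precondition: the task must not be running. The sketch below illustrates that guard. `CollectionTask` and its fields are hypothetical and model only a running flag and the collection cycle, not WeData's actual task states.

```python
from dataclasses import dataclass


@dataclass
class CollectionTask:
    """Hypothetical model of a collection task with a single running flag."""

    name: str
    collection_cycle: str  # for example "daily" or "weekly"
    running: bool = False

    def run_now(self):
        """Start a manual run; allowed only when the task is not running."""
        if self.running:
            raise RuntimeError("task is already running")
        self.running = True

    def finish(self):
        """Mark the current run as finished."""
        self.running = False

    def edit_cycle(self, new_cycle):
        """Change the collection cycle; allowed only when the task is not running."""
        if self.running:
            raise RuntimeError("a running task cannot be edited")
        self.collection_cycle = new_cycle
```

A caller would catch `RuntimeError` and retry the run or edit once the current run has finished.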
Delete a collection task
After a collection task is deleted, collection for its data source stops. In the current version, tables that have already been collected into WeData are not deleted.
Transfer a collection task
In the collection task list, a collection task can be transferred to another owner. After the transfer, the original owner no longer has management permission for that task.
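The effect of a transfer on permissions can be sketched as follows. `OwnedTask` is a hypothetical model that assumes management permission follows the task owner only; it ignores any administrator or project-level permissions that WeData may also apply.

```python
from dataclasses import dataclass


@dataclass
class OwnedTask:
    """Hypothetical model: only the current owner can manage the task."""

    name: str
    owner: str

    def can_manage(self, user):
        """Return True if the user currently holds management permission."""
        return user == self.owner

    def transfer(self, new_owner):
        """Hand the task to a new owner; the previous owner loses permission."""
        self.owner = new_owner


task = OwnedTask(name="hive_daily", owner="alice")
task.transfer("bob")
```

After the transfer, `task.can_manage("bob")` is true and `task.can_manage("alice")` is false, which matches the rule described above.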