Data Processing
Data processing filters, cleans, desensitizes, enriches, and distributes log data to a target log topic. It can be understood as ETL (Extract-Transform-Load) for logs.
Source Log Topic: The input of a data processing task.
Target Log Topic: The output of a data processing task.
Target Name: A custom name (alias) for the target topic. It makes the target topic easier to identify by business attribute, and it is the argument passed to log_output("alias") when logs are output to a specific target topic. A data processing task must have an output target topic; otherwise, the task cannot be created.
DSL Processing Functions: The DSL (Domain Specific Language) is a set of log processing functions that CLS developed for log ETL. The functions are simple to use and offer high processing performance. The underlying layer is implemented on Flink, so logs are processed in real time.
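The stages named above (filter, clean, desensitize, enrich, distribute) can be sketched in plain Python. This is only an illustration of the flow, not the CLS DSL: the field names, the masking rule, the target names "errors" and "access", and the topic IDs are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical mapping from target name (alias) to target log topic ID.
TARGETS = {"errors": "topic-errors-001", "access": "topic-access-001"}


def process(log):
    """Return (target_name, processed_log), or None if the log is filtered out."""
    # Filter: drop debug-level logs.
    if log.get("level") == "DEBUG":
        return None
    out = dict(log)
    # Clean: trim surrounding whitespace from the message.
    out["message"] = out.get("message", "").strip()
    # Desensitize: mask the middle four digits of 11-digit numbers.
    out["message"] = re.sub(r"\b(\d{3})\d{4}(\d{4})\b", r"\1****\2", out["message"])
    # Enrich: add a derived field.
    out["is_error"] = out.get("level") == "ERROR"
    # Distribute: pick the target name, similar in spirit to log_output("alias").
    target = "errors" if out["is_error"] else "access"
    return target, out


def run(logs):
    """Process every log and group the results by target log topic ID."""
    outputs = defaultdict(list)
    for log in logs:
        result = process(log)
        if result is not None:
            target, processed = result
            outputs[TARGETS[target]].append(processed)
    return dict(outputs)
```

Calling run() on a list of dictionaries returns one list of processed logs per target topic ID, which mirrors the idea of one source topic feeding several named targets.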
Timed SQL Analysis
Timed SQL Analysis periodically queries log data (both search and SQL statements are supported) over a specified time window and saves the query results to the target log topic.
Source Log Topic: The input of a timed SQL task.
Target Log Topic: The output of a timed SQL task.
Scheduling Range: The time range of the logs to be queried, for example, log data from 00:00:00 on January 1, 2023 to 00:00:00 on March 31, 2023.
Scheduling Period: The interval at which the query runs, from 1 to 1,440 minutes. To generate a daily report, set it to 1,440 minutes.
SQL Time Window: The time window covered by each query statement. Combined with the scheduling period, it produces either a tumbling window or a sliding window.
Tumbling Window: Non-overlapping query windows. For example, the scheduling period is 60 minutes and the SQL time window is 60 minutes. Typical scenario: hourly reports.
Sliding Window: Overlapping query windows. For example, the scheduling period is 1 minute and the SQL time window is 60 minutes. Typical scenario: a time series chart of active users within the last hour, with a time-axis granularity of 1 minute.
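The relationship between the scheduling period and the SQL time window can be sketched as follows. This is a minimal illustration, assuming that the first window starts at the beginning of the scheduling range and that each later window is shifted by one scheduling period; the function names are hypothetical and CLS may align windows differently.

```python
from datetime import datetime, timedelta


def query_windows(range_start, range_end, period_min, window_min):
    """List the (window_start, window_end) pairs queried within the range."""
    if not 1 <= period_min <= 1440:
        raise ValueError("scheduling period must be 1 to 1,440 minutes")
    period = timedelta(minutes=period_min)
    window = timedelta(minutes=window_min)
    windows = []
    end = range_start + window  # assumed alignment of the first window
    while end <= range_end:
        windows.append((end - window, end))
        end += period
    return windows


def window_kind(period_min, window_min):
    """Classify the combination of scheduling period and SQL time window."""
    if window_min == period_min:
        return "tumbling"  # windows touch but never overlap
    if window_min > period_min:
        return "sliding"   # consecutive windows overlap
    return "neither"       # window shorter than period; not covered in the text


start = datetime(2023, 1, 1, 0, 0, 0)
hourly = query_windows(start, start + timedelta(hours=3), 60, 60)
per_minute = query_windows(start, start + timedelta(minutes=63), 1, 60)
```

With a 60-minute period and a 60-minute window, each window begins where the previous one ends. With a 1-minute period and a 60-minute window, consecutive windows share 59 minutes of data.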
Delayed Execution: The delay applied before a query runs, set in Advanced Settings in the console, with a value from 60 to 120 seconds. Log indexes usually take some time to generate, and logs cannot be queried until they are indexed. A 60-second delay therefore ensures that the index has been generated by the time the query runs (99.9% of index data is generated within 5 seconds).
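The effect of the delay can be shown with a small calculation. This is a sketch that assumes the query for a window runs at the window end plus the configured delay; the function name and the 60-second default are assumptions for illustration.

```python
from datetime import datetime, timedelta


def execution_time(window_end, delay_seconds=60):
    """Return when the query for a window ending at window_end would run."""
    if not 60 <= delay_seconds <= 120:
        raise ValueError("delay must be 60 to 120 seconds")
    # Waiting gives the index time to be generated for the newest logs.
    return window_end + timedelta(seconds=delay_seconds)


window_end = datetime(2023, 1, 1, 1, 0, 0)
run_at = execution_time(window_end)  # 2023-01-01 01:01:00
```

The window boundaries themselves do not change; only the moment at which the query is issued moves later.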