View system built-in templates

Last updated: 2024-11-01 15:58:21

The system provides 56 built-in rule templates that can be used directly. Make sure you understand the use case of each template before applying it.
View template list
On the Rule Template Management page, you can view the list of system templates.
You can filter and search by template name, description keyword, type, dimension, and applicable engine. Custom templates can be created and managed in bulk on the custom template page.
Field	Details
Template Type	Two template types are currently supported: table-level and field-level. Supports filtering.
Template Name	Name of the template.
Template Description	Detailed description of the execution logic and formulas of the template rule.
Dimension	Accuracy, Uniqueness, Integrity, Consistency, Timeliness, or Validity. Supports filtering.
Applicable Engine	Engine types the template applies to: currently Hive, Spark, DLC, TCHouse-D, and Doris. Supports filtering.
Reference Count	Number of rules currently associated with the template. Supports filtering.
Template distribution
Monitored Object	Rule Dimension	Compute Item	Calculation Sub-item	Description	Numeric: Fixed Value	Numeric: Value Range	Volatility: Previous Cycle	Volatility: 1 day ago	Volatility: 7 days ago	Volatility: 30 days ago	Standard Score: 7 days	Standard Score: 30 days	Other: Empty/Unique/Duplicate	Other: Format Matching	Other: Enumerated range	Other: Value size
Table-level	Accuracy	Number of table rows	-	Calculates the number of data rows	✅	-	✅	✅	✅	✅	✅	✅	-	-	-	-
Table-level	Accuracy	Table size (bytes)	-	Calculates the size of the data table (Hive tables only)	✅	-	-	✅	✅	-	-	-	-	-	-	-
Table-level	Timeliness	Timeliness of data output	-	Calculates the number of data rows; if the number of rows is 0, no data is considered to have been produced	✅ (= 0)	-	-	-	-	-	-	-	-	-	-	-
Field-level	Accuracy	Field value	Average value	Calculates the average value of numeric data	✅	-	-	✅	✅	✅	✅	✅	-	-	-	-
Field-level	Accuracy	Field value	Total value	Calculates the total value of numeric data	✅	-	-	✅	✅	✅	✅	✅	-	-	-	-
Field-level	Accuracy	Field value	Median	Calculates the median of numeric data	✅	-	-	✅	✅	✅	✅	✅	-	-	-	-
Field-level	Accuracy	Field value	Minimum value	Calculates the minimum value of numeric data	✅	-	-	✅	✅	✅	✅	✅	-	-	-	-
Field-level	Accuracy	Field value	Maximum value	Calculates the maximum value of numeric data	✅	-	-	✅	✅	✅	✅	✅	-	-	-	-
Field-level	Uniqueness	Field unique values	Number of unique values	Verifies unique values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Uniqueness	Field unique values	Number of unique values/Total rows	Verifies unique values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Uniqueness	Field duplicate values	Number of duplicate values	Verifies duplicate values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Uniqueness	Field duplicate values	Number of duplicate values/Total rows	Verifies duplicate values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Integrity	Field null values	Number of null values	Verifies null values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Integrity	Field null values	Number of null values/Total rows	Verifies null values	-	-	-	-	-	-	-	-	✅	-	-	-
Field-level	Validity	Mobile number format	Number of invalid entries	Regular expression validation against the Mainland China mobile phone number format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Validity	Mobile number format	Number of invalid entries/Total rows	Regular expression validation against the Mainland China mobile phone number format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Validity	Email format	Number of invalid entries	Regular expression validation against the email format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Validity	Email format	Number of invalid entries/Total rows	Regular expression validation against the email format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Validity	ID card format	Number of invalid entries	Regular expression validation against the Chinese Mainland ID card format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Validity	ID card format	Number of invalid entries/Total rows	Regular expression validation against the Chinese Mainland ID card format	-	-	-	-	-	-	-	-	-	✅	-	-
Field-level	Consistency	Field Data Range	Value Range	Checks whether the value is within the numeric range	-	✅	-	-	-	-	-	-	-	-	-	-
Field-level	Consistency	Field Data Range	Enumerated range	Checks whether the character value is within the enumerated values	-	-	-	-	-	-	-	-	-	-	✅	-
Field-level	Consistency	Field Data Correlation	-	Compares a field against a field in another database table	-	-	-	-	-	-	-	-	-	-	-	✅
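The Validity templates listed above count entries that fail a regular-expression check, either as a raw count or as a share of total rows. The following is a minimal sketch of that idea, not the expressions the built-in templates actually use: the two patterns, the function name `invalid_stats`, and the choice to count null entries as invalid are all assumptions made for illustration.

```python
import re

# Illustrative patterns only (assumptions); the expressions used by the
# built-in templates are not documented on this page.
MOBILE_CN = re.compile(r"1[3-9][0-9]{9}")          # 11 digits, starts with 1
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")    # very loose email shape


def invalid_stats(values, pattern):
    """Return (number of invalid entries, invalid entries / total rows).

    Assumption: a null (None) entry is counted as invalid.
    """
    total = len(values)
    invalid = sum(1 for v in values if v is None or not pattern.fullmatch(v))
    ratio = invalid / total if total else 0.0
    return invalid, ratio


phones = ["13812345678", "12345", "19900001111"]
count, ratio = invalid_stats(phones, MOBILE_CN)
print(count)  # 1
```

The count corresponds to the "Number of invalid entries" sub-item and the ratio to "Number of invalid entries/Total rows".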
Use Instructions
Terminology and explanations:
Monitored Object
Table-level: When the monitored object is table-level, you can monitor the number of table rows, the table size, and the timeliness of data output (which is judged from the number of table rows).
Field-level: When the monitored object is field-level, you can monitor field values (average, maximum, minimum, median, and total value), field value formats (mobile number, email, ID card number), and whether the field is null.
Rule Dimension
Rule dimensions are used to calculate the quality score and to reflect the share that each type of rule contributes to overall quality.
The system has six built-in rule dimensions: Accuracy, Uniqueness, Integrity, Consistency, Timeliness, and Validity.
Validation Method
Numeric Type
Mainly covers comparison against a fixed value and comparison against a value range.
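The two numeric checks can be sketched as follows. This is only an illustration under stated assumptions: the operator names in `_OPS`, the function names, and the inclusive range bounds are hypothetical, since this page does not list the comparison operators the product offers or say whether range bounds are inclusive.

```python
import operator

# Hypothetical operator set (assumption); not taken from the product.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def passes_fixed_value(result, op, expected):
    """Compare a scan result against a single fixed value."""
    return _OPS[op](result, expected)


def passes_value_range(result, low, high):
    """Check a scan result against a range (inclusive bounds assumed)."""
    return low <= result <= high


print(passes_fixed_value(120, ">", 100))   # True
print(passes_value_range(5, 10, 20))       # False
```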
﻿
Volatility Type
Term explanation:
The volatility type reflects how much a value fluctuates, that is, how far it has risen or fallen relative to a given point in time.
Calculation formula:
Volatility = (Current scan result - Scan result at the baseline time point) / Scan result at the baseline time point * 100%.
Note:
The volatility result is a percentage. A partition must be specified when using a volatility template.
Example 1: 7-day cyclical volatility
With a partition specified and the data from 7 days ago as the baseline, a result of 100% means that the current partition's data has doubled compared with the partition data from 7 days ago.
Example 2: Previous-period volatility
With a partition specified, the previous run as the baseline, and the rule associated with a production scheduling task (for example, an offline development task), a result of 100% means that the statistic computed after the current task run has doubled compared with the statistic computed after the previous run.
Example 3: Cyclical volatility with a default period
When a quality rule uses the cyclical volatility template with a default period such as 7 days ago, and the rule is not associated with a production scheduling task, a result of 100% means that the current partition's data has doubled compared with the partition data from 7 days ago. In other words, the current data is compared with the data from 7 days ago.
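The arithmetic behind these examples can be sketched as below. The sketch assumes that volatility is the relative change against the baseline, which is the reading under which a result of 100% means the value has doubled. The function name `volatility_pct` and the zero-baseline handling are hypothetical, not product behavior.

```python
def volatility_pct(current, baseline):
    """Relative change of `current` against `baseline`, as a percentage.

    Assumption: a baseline of 0 is treated as an error, because the
    ratio is undefined in that case.
    """
    if baseline == 0:
        raise ValueError("baseline scan result is 0; volatility is undefined")
    return (current - baseline) / baseline * 100.0


# Row count of the current partition vs. the partition from 7 days ago.
print(volatility_pct(2000, 1000))  # 100.0 (doubled)
print(volatility_pct(500, 1000))   # -50.0 (halved)
```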
﻿
Standard Score Type (Variance Fluctuation)
Term explanation:
The standard score is a statistical measure of whether a value lies within a credible range.
A result that is very large or very small most likely indicates an abnormal value.
Calculation formula:
Standard score = (Current value - Mean) / Standard deviation.
Note:
The standard score is a unitless decimal that indicates how unusual a value is within the dataset.
Generally, a value whose standard score has an absolute value greater than 3 is considered abnormal, because under a normal distribution such a value occurs with a probability of only about 0.28%.
Within [-1, 1]: probability 68.26%
Within [-2, 2]: probability 95.44%
Within [-3, 3]: probability 99.72%
Outside [-3, 3]: probability 0.28%
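A standard score check can be sketched with Python's `statistics` module. This is an illustration only: the function name `standard_score`, the use of the population standard deviation, and the sample history are assumptions, and the page does not say how the product builds its comparison window.

```python
from statistics import mean, pstdev


def standard_score(current, history):
    """Standard score of `current` relative to the values in `history`.

    Assumptions: `history` holds earlier scan results (for example, the
    last 7 days), and the population standard deviation is used.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        raise ValueError("history has zero variance; standard score is undefined")
    return (current - mu) / sigma


history = [100, 102, 98, 101, 99, 100, 100]   # hypothetical daily row counts
z = standard_score(130, history)
print(abs(z) > 3)  # True: 130 is far outside the usual range
```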
﻿
Other
There is no restriction on the type of field being validated.
Null/Unique/Duplicate: the count or proportion of null, unique, or duplicate values.
Format Matching: the count or proportion of values that do not match the format.
Enumerated Range: the count of values that are not within the enumerated range.
Note:
Enter the expected values here. An alarm is triggered when a field value falls outside the range.
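The counts and proportions described above can be sketched as follows. The definitions here are assumptions made for illustration: "unique" is read as the number of distinct non-null values, "duplicate" as the non-null rows beyond the first occurrence of each value, proportions are taken over total rows, and null values are not counted as falling outside the enumerated range. The function names are hypothetical.

```python
def null_unique_duplicate_stats(values):
    """Return (counts, ratios) for null, unique, and duplicate values."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    counts = {
        "null": total - len(non_null),
        "unique": distinct,                      # distinct non-null values
        "duplicate": len(non_null) - distinct,   # repeats beyond the first
    }
    ratios = {k: (c / total if total else 0.0) for k, c in counts.items()}
    return counts, ratios


def count_outside_enum(values, allowed):
    """Count non-null values that are not in the enumerated set."""
    allowed = set(allowed)
    return sum(1 for v in values if v is not None and v not in allowed)


counts, ratios = null_unique_duplicate_stats(["a", "b", "a", None])
print(counts)   # {'null': 1, 'unique': 2, 'duplicate': 1}
print(count_outside_enum(["M", "F", "X", None], {"M", "F"}))  # 1
```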
Field Data Correlation: checks whether a field's value matches the value of a field in another database table.
Comparison relationship: greater than, less than, or equal to.
Target data: database table, field, and filter criteria.
Association conditions: the fields used to join the two tables.
Note:
Rows in the comparison table must correspond one-to-one with rows in the table being checked.
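A field correlation check can be sketched over in-memory rows as below. This is a simplified illustration, not how the engines run the comparison: the function name `count_mismatches`, the relation names, the sample tables, and the choice to count a missing counterpart row as a mismatch are all assumptions. The sketch rejects a comparison table with repeated join keys, in line with the one-to-one note above.

```python
import operator

# Hypothetical names for the three comparison relationships.
_RELATIONS = {
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "equal_to": operator.eq,
}


def count_mismatches(source_rows, target_rows, join_key,
                     source_field, target_field, relation="equal_to"):
    """Count source rows whose field fails the comparison with its target row."""
    compare = _RELATIONS[relation]
    target_by_key = {}
    for row in target_rows:
        key = row[join_key]
        if key in target_by_key:
            raise ValueError(f"comparison table has more than one row for key {key!r}")
        target_by_key[key] = row

    mismatches = 0
    for row in source_rows:
        match = target_by_key.get(row[join_key])
        # Assumption: a source row with no counterpart counts as a mismatch.
        if match is None or not compare(row[source_field], match[target_field]):
            mismatches += 1
    return mismatches


orders = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 3, "amount": 7}]
ledger = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 7}]
print(count_mismatches(orders, ledger, "id", "amount", "amount"))  # 1
```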
﻿
