The core features of Tencent Cloud WeData include the following:
Project Management
Implements project isolation at the system/tenant level, allowing administrators to control permissions for Tencent Cloud WeData users (members), configure the underlying computation engines, and manage execution resources.
Data Planning
Provides capabilities for overall data planning and design, including data warehouse layering and categorization, logical model design, metric and dimension definition, and data standards, helping enterprises unify warehouse design standards and definitions and automating the transition from the design phase to the development phase.
Data Warehouse Standards: Plans and standardizes business objects from a global perspective, manages models by warehouse layer, classifies and manages different domains by business theme, and forms hierarchical business tags.
Model Design: Covers logical model definition and entity-relationship design, with capabilities for creating, copying, modifying, deleting, importing/exporting, and versioning models. Logical models can be associated with physical models and mapped to metrics and dimensions, so that models are automatically synchronized from the design phase to the development phase.
Standard Management: Includes standard content management and benchmarking task management. Through the design of standard rules and task configuration, standardization of data values, libraries, table structures, table names, and metric dimension tags is achieved.
Business Definition: The metric/dimension dictionary manages base and derived metrics and dimensions (normal dimensions, business limits, time cycles, and degenerate dimensions) throughout their lifecycle, and associates them with models so that metric production code can be generated automatically.
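The step from a metric definition to production code can be pictured with the minimal sketch below. It is illustrative Python only, not WeData's actual generator; the class names, fields, and the ${ds} date placeholder are assumptions.

```python
from dataclasses import dataclass

@dataclass
class BaseMetric:
    name: str            # e.g. "order_amount"
    aggregation: str     # e.g. "SUM"
    source_table: str    # physical table bound to the logical model
    source_column: str

@dataclass
class DerivedMetric:
    base: BaseMetric
    business_limit: str  # filter condition, e.g. "channel = 'app'"
    time_cycle: str      # partition window, e.g. "dt = '${ds}'"
    dimensions: list     # dimensions to group by

    def to_sql(self) -> str:
        """Generate illustrative production SQL for this derived metric."""
        dims = ", ".join(self.dimensions)
        return (
            f"SELECT {dims}, "
            f"{self.base.aggregation}({self.base.source_column}) AS {self.base.name}\n"
            f"FROM {self.base.source_table}\n"
            f"WHERE {self.time_cycle} AND {self.business_limit}\n"
            f"GROUP BY {dims}"
        )

# Example: daily app-channel order amount by city (all names are made up)
metric = DerivedMetric(
    base=BaseMetric("order_amount", "SUM", "dwd_order_detail", "amount"),
    business_limit="channel = 'app'",
    time_cycle="dt = '${ds}'",
    dimensions=["city"],
)
print(metric.to_sql())
```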
Data Integration
Lightweight operations, visualized processes, and open integration capabilities support fast, stable synchronization of massive data between a wide range of heterogeneous data sources in complex network environments.
All-Scenario Synchronization: Includes real-time and offline synchronization.
Multiple Types of Heterogeneous Data Sources: Supports 30+ data sources, with star-structured read-write combinations so that any supported source can be paired with any supported target.
T-Transformation
Data Level: Content transformations during synchronization, such as data filtering, Join, etc.
Field Level: Offers single-field transformations, including custom data fields, format conversion, time format conversion, etc.
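The field-level transforms listed above amount to per-record mappings. The sketch below is generic Python, not the WeData transform engine, and the field names are made up.

```python
from datetime import datetime

def transform_record(record: dict) -> dict:
    """Illustrative field-level transforms: add a custom field,
    normalize a string, and convert a time format."""
    out = dict(record)
    # Custom data field derived from existing columns (assumed names)
    out["full_name"] = f"{record['first_name']} {record['last_name']}"
    # Format conversion: trim and uppercase a country code
    out["country"] = record["country"].strip().upper()
    # Time format conversion: "2024/01/31 08:00:00" -> "2024-01-31 08:00:00"
    ts = datetime.strptime(record["created_at"], "%Y/%m/%d %H:%M:%S")
    out["created_at"] = ts.strftime("%Y-%m-%d %H:%M:%S")
    return out

print(transform_record({
    "first_name": "Li", "last_name": "Wei",
    "country": " cn ", "created_at": "2024/01/31 08:00:00",
}))
```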
Task and Data Monitoring
Read and Write Metrics: Supports real-time statistics on task read and write metrics, including total read and write volume, speed, throughput, dirty data, etc.
Monitoring and Alarms: Supports task and resource monitoring, covering SMS, email, HTTP, and other multi-channel alarms.
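As a rough picture of the read/write metrics and alarms described above, the following sketch computes throughput and flags a task when dirty data exceeds a threshold. The counter names and threshold are assumptions, and a real setup would deliver the alarm over SMS, email, or HTTP rather than printing it.

```python
from dataclasses import dataclass

@dataclass
class SyncStats:
    """Counters of the kind reported for a sync task (names are illustrative)."""
    records_read: int
    records_written: int
    bytes_written: int
    dirty_records: int
    elapsed_seconds: float

    @property
    def throughput_rows_per_s(self) -> float:
        return self.records_written / self.elapsed_seconds

def check_and_alarm(stats: SyncStats, dirty_threshold: int = 100) -> None:
    """Report metrics and emit an alarm when dirty data exceeds the threshold."""
    if stats.dirty_records > dirty_threshold:
        print(f"ALARM: {stats.dirty_records} dirty records "
              f"(threshold {dirty_threshold})")
    print(f"read={stats.records_read} written={stats.records_written} "
          f"throughput={stats.throughput_rows_per_s:.1f} rows/s")

check_and_alarm(SyncStats(10_000, 9_850, 52_428_800, 150, 12.5))
```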
Data Development
Through strict CI/CD process standards and enhanced automated testing, release, and operations capabilities, data development shortens the path from raw data processing to business-ready data, improving efficiency while ensuring data quality.
Online Development: Supports online code development, easy drag-and-drop orchestration of task workflows, and visual presentation of large-scale task orchestration.
Code Development: Supports online code development, debugging, and version management for tasks including HiveSQL, SparkSQL, JDBCSQL, Spark, Shell, MapReduce, PySpark, Python, TBase, DLC SQL, DLCSpark, CDW PostgreSQL, Impala, etc.
Task Testing: Supports testing and version management for tasks and workflows.
Development Assistance: Offers parameter configuration at three granularity levels: project, workflow, and task; supports time parameter calculation and function parameters.
Version Management: Supports version management for events, functions, tasks, and parameters.
Code Management: Provides unified management, import and export for code.
Orchestration Scheduling: Manages task process orchestration and submission scheduling.
Scheduling Method: Supports cyclic, one-time, and event-triggered scheduling, with crontab configuration for cyclic scheduling (see the sketch after this list).
Dependency Policy: Supports task self-dependency and workflow self-dependency.
Cross-cycle Dependency Configuration: Provides cross-cycle and custom dependency configuration; the range of upstream and downstream dependency instances can be selected as needed.
Batch Orchestration: Provides the capability to create tasks and dependencies in bulk via Excel, speeding up task dependency orchestration.
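For cyclic scheduling, the cycle is expressed as a crontab expression (seconds, minutes, hours, day of month, month, day of week in the Quartz style). The configuration objects below are hypothetical and only illustrate how cyclic, one-time, and event-triggered schedules, self-dependency, and upstream dependencies might be described; none of the keys are WeData's actual API.

```python
# Hypothetical task-scheduling configs; keys are illustrative, not WeData's API.
cyclic_task = {
    "task_name": "dwd_order_detail_daily",
    "schedule_type": "cyclic",
    "crontab": "0 30 2 * * ?",      # run every day at 02:30
    "self_dependent": True,          # wait for the previous cycle's instance
    "upstream": ["ods_order_sync"],  # upstream task dependency
}

one_time_task = {
    "task_name": "history_backfill",
    "schedule_type": "one_time",
    "run_at": "2024-02-01 00:00:00",
}

event_triggered_task = {
    "task_name": "on_partition_ready",
    "schedule_type": "event",
    "event_name": "ods_order_partition_ready",
}
```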
Release and Operations: Allows for the deployment of developed tasks to the production environment as needed, and provides unified monitoring and operations of tasks.
Task Release: Supports the release and deployment of development outcomes.
Monitoring and Operations: Provides unified monitoring and operations for tasks deployed to the production environment.
Analytical Exploration: An intelligent, user-friendly development mode improves the efficiency of collaborative task development, lets users view the task processing procedure, and makes ad-hoc data exploration more efficient.
Online Editing: Provides a visual interactive analysis IDE.
Run: Offers visualization of execution information.
Development Assistance: Provides efficiency tools for development assistance.
Data Quality
Through flexible rule configuration, comprehensive task management, and multidimensional quality assessment, it provides data quality audit capabilities at every stage of the data lifecycle, from access and integration through processing to consumption.
Multi-source Data Monitoring: Supports monitoring data sources and engine types including EMR Hive, Spark, and DLC (public cloud) and CDW-PG, TBDS, and Gbase (private cloud), offering full validation capabilities for multi-source data.
Rich Rule Templates: Offers 56 built-in table-level and field-level rule templates covering common industry standards across six dimensions, enabling out-of-the-box use, significantly improving the efficiency of quality control workflows, and helping users detect data changes and problematic data produced during ETL from multiple dimensions.
Flexible Quality Control Configuration: Supports three rule creation modes: system rule templates, custom templates, and custom SQL (a custom-SQL example is sketched after this list). Parameters can be adjusted and task execution policies configured to easily achieve end-to-end quality validation.
Full-Link Guarantee: Supports execution either linked to production scheduling or as offline periodic detection, providing operational guarantee capabilities before, during, and after production across the full data link, with timely alarms and blocking that prevent dirty data from spreading downstream.
Multi-dimensional Governance Visibility: Quality overview and quality report modules give users a global view of quality task status, alarm and blocking trends, and quality scores across dimensions, so they can quickly identify and locate problems and track quality improvement.
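To make the custom SQL mode concrete: a rule usually reduces to a query whose result is compared against a threshold, with a failed check raising an alarm and, if blocking is enabled, stopping downstream tasks. The sketch below is generic and does not reflect WeData's rule configuration format; the table, column, and ${ds} placeholder are assumptions.

```python
# Illustrative custom-SQL quality rule: the null rate of a key column must
# stay below a threshold. The SQL and the execution backend are placeholders.
NULL_RATE_SQL = """
SELECT COUNT(CASE WHEN user_id IS NULL THEN 1 END) / COUNT(*) AS null_rate
FROM dwd_order_detail
WHERE dt = '${ds}'
"""

def evaluate_rule(null_rate: float, threshold: float = 0.01) -> bool:
    """Return True when the check passes; a failure would normally raise an
    alarm and, if blocking is enabled, stop downstream tasks."""
    passed = null_rate <= threshold
    if not passed:
        print(f"Quality rule failed: null_rate={null_rate:.4f} > {threshold}")
    return passed

# In production the query result would come from the bound engine (Hive/DLC/...);
# here we feed in a sample value.
evaluate_rule(0.003)
```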
Data Security
Provides centralized data security management and control and collaboration mechanisms, ensuring the secure and effective circulation of data.
Unified Data Security Management and Control: Integrates security policies deeply with the bound storage and computation engines, unifying data access and simplifying the data usage process.
Permission Approval: Integrates with the Ranger permission policy system to assign responsibility to individuals and control permissions down to table-level granularity. Provides permission application and approval channels so that data access can be opened up safely.
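Table-granularity permission control through Ranger can be pictured as creating a policy object. The sketch below follows the general shape of Apache Ranger's public REST API for a Hive-type service, but the host, service name, credentials, and exact field layout should be treated as assumptions and checked against the Ranger version bound to your engine.

```python
import requests

# Sketch: grant SELECT on one table to one user via Ranger's public REST API.
# Host, service name, user, and credentials are placeholders.
RANGER_URL = "http://ranger-admin:6080/service/public/v2/api/policy"

policy = {
    "service": "hive_service",   # Ranger service bound to the compute engine
    "name": "grant_select_dwd_order_detail_to_alice",
    "resources": {
        "database": {"values": ["dwd"], "isExcludes": False, "isRecursive": False},
        "table":    {"values": ["order_detail"], "isExcludes": False, "isRecursive": False},
        "column":   {"values": ["*"], "isExcludes": False, "isRecursive": False},
    },
    "policyItems": [
        {
            "users": ["alice"],
            "groups": [],
            "accesses": [{"type": "select", "isAllowed": True}],
        }
    ],
}

resp = requests.post(RANGER_URL, json=policy, auth=("admin", "admin_password"))
resp.raise_for_status()
print("created policy id:", resp.json().get("id"))
```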
Data Operations
Based on powerful underlying metadata capabilities, it offers data asset services such as data catalog, lineage analysis, popularity analysis, asset rating, business classification, and tag management, effectively improving how users understand, control, and collaborate on massive enterprise data.
Data Discovery: Unified metadata collection and management.
Data Overview: Provides statistics on data assets, including projects, tables, storage volume, and data type coverage, as well as data panorama and popularity ranking features.
Data Catalog: Supports quick global search and locating at the table and field level; table details provide complete technical and business metadata along with features such as data lineage, temperature, quality, output and change history, and preview.
Database Table Management: Supports management of global database tables.
Business Classification: Supports creating and managing thematic categories, data warehouse layering, and business tags based on business needs, and conducting batch categorization and hierarchical operations on database tables.
Data Services
Provides capabilities covering the full lifecycle of APIs, including API production, API management, and API marketplace, helping enterprises to unify the management of internal and external API services and build a unified data service bus.
Quick API production.
API management and operations.
Secure API invocation.
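Consuming a published data API typically comes down to an authenticated HTTP call. The sketch below is hypothetical: the gateway URL, path, parameters, and token header are placeholders rather than the WeData data service protocol.

```python
import requests

# Hypothetical call to a data API published through the data service bus.
# URL, path, parameters, and the token header are placeholders.
GATEWAY = "https://data-gateway.example.com"

resp = requests.get(
    f"{GATEWAY}/api/v1/order_summary",
    params={"dt": "2024-01-31", "city": "Shenzhen"},
    headers={"Authorization": "Bearer <token>"},  # secure invocation
    timeout=10,
)
resp.raise_for_status()
for row in resp.json().get("rows", []):
    print(row)
```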