tencent cloud

Feedback

Materialized View

Last updated: 2024-07-31 18:03:31
    Note:
    Currently, DLC materialized views only support the SparkSQL and Presto engines.
    A Materialized View is a special object in a database, which is a pre-calculated and stored query result set. Materialized views can provide fast query performance when dealing with large amounts of data and complex queries.
    While materialized views improve query performance, they also introduce storage and compute costs. We recommend using materialized views in the following scenarios:
    Source table changes are infrequent
    Compared to the source table, the fields and result quantity in the materialized view table are significantly reduced
    DLC supports both regular materialized views and mapped materialized views. Below is an introduction and a complete set of usage examples. The supported syntax list can be found in Materialized View Syntax.

    Regular Materialized View

    The basic usage process of a regular materialized view includes creation, refresh, and use.
    The following is a complete process example based on the Presto Engine.

    Data Preparations

    Execute SQL to create a database table and insert data. The following statement creates a table named student.
    CREATE DATABASE IF NOT EXISTS mv_test3;
    create table student(id int, name string, score int);
    insert into student values (1,'zhangsan', 90);
    insert into student values (2,'lisi', 100);
    insert into student values (3,'wangwu', 80);
    insert into student values (4,'zhaoliu', 30);
    select * from student order by id;

    Creating a Regular Materialized View

    Use the CREATE MATERIALIZED VIEW statement to create a materialized view. Specify the name of the materialized view and the query statement, and optionally specify the source table and condition of the query.
    In the following example, a simple SELECT statement is used to select all scores from the table student and perform a summation operation on them. Then, this summation result serves as the content of the materialized view mv_student_sum.
    CREATE MATERIALIZED VIEW mv_student_sum AS (
    select sum(score) from student
    );

    Viewing Materialized View Details

    Use the DESCRIBE MATERIALIZED VIEW statement to view the detailed information of the materialized view, including the name, query statement, and refresh status, among others.
    DESCRIBE MATERIALIZED VIEW mv_student_sum;
    
    
    

    Manual Refresh of Materialized View

    Use the REFRESH MATERIALIZED VIEW statement to manually refresh the data of the materialized view.
    This demonstration is just for illustrative purposes; in most cases, you do not need to manually refresh the materialized view. The view will automatically refresh if the SQL hits a source table that has changed.
    REFRESH MATERIALIZED VIEW mv_student_sum;

    Viewing the Execute Task List of Materialized View

    Use the SHOW MATERIALIZED VIEW JOBS statement to view the execute task list of the materialized view, enabling you to understand the refresh history and status of the materialized view.
    SHOW MATERIALIZED VIEW JOBS IN mv_student_sum;
    
    
    

    SQL Rewrite Execution

    When querying data with the SELECT statement, it is expected to be automatically rewritten and hit the materialized view. You can check if it was automatically rewritten to the materialized view through the statistics in the query results.
    select sum(score) from student;
    

    Dropping Materialized View

    DROP MATERIALIZED VIEW mv_student_sum;

    Mapped Materialized View

    Mapped Materialized View is a special type of materialized view that is associated with an existing table through mapping. By using a mapped materialized view, you can associate the query results of the materialized view with the data of an existing table, thereby optimizing the query performance of the existing table.

    Limit

    Materialized views have the following restrictions compared to ordinary materialized views:
    Mapped Materialized Views do not support refresh operations, meaning it's not possible to refresh the data of the materialized view through the REFRESH MATERIALIZED VIEW statement. As a result, the data of the materialized view can only remain consistent with that of the mapped table and cannot be automatically updated.
    Mapped Materialized Views do not perform automatic SQL rewriting, meaning query statements are not automatically converted to use the materialized view. It is necessary to manually specify the query statements that use the materialized view.
    When a Mapped Materialized View is deleted, only the association relationship with the mapped table is removed, not the mapped table itself. The mapped table still exists and can continue to be used.

    Recommended scenarios

    It is recommended to use Mapped Materialized Views in the following scenarios:
    When there is already a large table with low query performance, query performance can be optimized through the use of Mapped Materialized Views.
    When it is necessary to keep the data of the materialized view consistent with that of an existing table, and there is no need for automatic refresh of the materialized view, Mapped Materialized Views can be used.

    Iceberg type of source table

    When an Iceberg table is the source table, a complete example is as follows:

    Creating a Mapped Materialized View based on CTAS

    The mapped materialized view needs to maintain consistent naming with the table to be mapped. The following example first creates a table based on CTAS for the creation of the Mapping MV. The data preparation can refer to the section on data preparation in the complete example of an ordinary materialized view.
    CREATE TABLE link_mv_student AS (
    select sum(score) from student
    );
    --Create Mapped Materialized View: Use the CREATE MATERIALIZED VIEW statement to create a mapped materialized view.
    --When creating the materialized view, use the WITH META LINK clause and specify the name of the mapping table as the association.
    CREATE MATERIALIZED VIEW link_mv_student WITH META LINK AS (
    select sum(score) from student
    );

    View Mapping Materialized View

    Using the DESCRIBE MATERIALIZED VIEW statement, you can view the detailed information of the mapped materialized view, including the name, query statement, and refresh status, among others.
    DESCRIBE MATERIALIZED VIEW link_mv_student;
    SHOW MATERIALIZED VIEW JOBS IN link_mv_student;

    Mapped Materialized Views do not support refresh operations

    Mapped Materialized Views do not support REFRESH operations, meaning it's not possible to refresh the data of the materialized view through the REFRESH MATERIALIZED VIEW statement. As a result, the data of the materialized view can only remain consistent with that of the mapping table and cannot be automatically updated.

    SQL Rewriting

    Mapped Materialized Views do not automatically perform SQL rewriting of query statements.
    Executing select sum(score) from student; will not hit the mapped materialized view.
    SQL rewriting based on the mapped materialized view can be specified by using the Hint or TaskConf parameters.
    --Manually Specify the SQL to be rewritten
    select /*+ OPTIONS('eos.sql.materializedView.enableRewrite'='true') */
    sum(score) from student;

    Delete mapped materialized view

    Use the DROP MATERIALIZED VIEW statement to delete the mapped materialized view. After deleting the mapped materialized view, only the association with the mapping table will be deleted, the mapping table itself will still exist.
    DROP MATERIALIZED VIEW link_mv_student;
    DESCRIBE link_mv_student; --The source table still exists

    Hive type source table

    When the Hive table is the source table, a complete example is as follows:

    Prepare to initialize data

    First, you need to prepare the initial data and create a Hive base table. Use the CREATE EXTERNAL TABLE statement to create a Hive base table, and manually insert data via the INSERT statement.
    CREATE EXTERNAL TABLE student_2(id int, name string, score int)
    LOCATION 'cosn://guangzhou-test-1305424723/mv_test4/student_2';
    insert into student_2 values (1,'zhangsan', 90);
    insert into student_2 values (2,'lisi', 100);
    insert into student_2 values (3,'wangwu', 80);
    insert into student_2 values (4,'zhaoliu', 30);
    select * from student_2;

    Create a mapped Hive external table

    Use the CREATE EXTERNAL TABLE statement to create a mapped Hive external table.
    CREATE EXTERNAL TABLE link_mv_student_hive (
    sum_score BIGINT
    ) LOCATION 'cosn://guangzhou-test-1305424723/mv_test4/link_mv_student_hive';
    
    To insert data into the mapping table, use the INSERT OVERWRITE statement to insert the query results into the mapping table, ensuring the data in the mapping table is consistent with the data in the Hive base table.
    --Insert data into the mapping table
    INSERT OVERWRITE link_mv_student_hive 
    select sum(score) from student;

    Create a mapped materialized view based on the Hive external table

    Use the CREATE MATERIALIZED VIEW statement to create a mapped materialized view. When creating a materialized view, use the WITH META LINK clause and specify the name of the above Hive external table for association.
    CREATE MATERIALIZED VIEW link_mv_student_hive WITH META LINK AS (
    select sum(score) from student_2
    );
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support