Testing Scheme Introduction

Last updated: 2024-12-25 17:55:17
    This document describes how to use TPC-DS to test the performance of TCHouse-D. The following sections provide a reference testing scheme that uses a 100 GB data set to evaluate TPC-DS query performance.

    About TPC-DS Performance Testing

    TPC-DS is a decision-support benchmark designed to evaluate the performance of data warehouses and analytical systems. Developed by the TPC (Transaction Processing Performance Council), it is used to compare how well different systems handle complex queries and large-scale data analysis.
    TPC-DS is designed to simulate the complex decision-support workloads found in real-world scenarios. It tests system performance through a series of complex queries and data operations, including joins, aggregations, sorts, filters, and subqueries. These query patterns cover both simple and complex scenarios, such as report generation, data mining, and OLAP (online analytical processing).

    Testing Scheme Introduction

    Test Environment Preparation

    Hardware Environment

    In the reference scheme provided in this document, the tested cluster consists of 3 FEs and 3 BEs, with the FE and BE node processes deployed separately. The specifications are shown below. Note that the actual test does not consume all of these hardware resources.
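    +-----------------+----------+--------+----------------------+
    | Node Type       | CPU      | Memory | Hard Disk            |
    +-----------------+----------+--------+----------------------+
    | 3 FEs, standard | 16 cores | 64 GB  | Enhanced SSD 200 GB  |
    | 3 BEs, standard | 16 cores | 64 GB  | Enhanced SSD 1000 GB |
    +-----------------+----------+--------+----------------------+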

    Software Version

    Tencent Cloud TCHouse-D 2.1.7

    Test Script Preparation

    Download the TPC-DS toolkit from Toolkit Address and compile it.

    TPC-DS 100 GB Data Testing

    Generate a 100 GB data set:

    sh bin/gen-tpcds-data.sh -s 100
    The data generated after execution is as follows:
    # du -sh bin/tpcds-data/
    96G bin/tpcds-data/
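    +------------------------+-------------------------+-------------------+-------------------+----------------+
    | Table Name             | Original Text File Size | Size After Import | Number of Buckets | Number of Rows |
    +------------------------+-------------------------+-------------------+-------------------+----------------+
    | call_center            | 9.2 KiB                 | 13.784 KB         | 1                 | 30             |
    | catalog_page           | 2.8 MiB                 | 1.216 MB          | 3                 | 20400          |
    | catalog_returns        | 2.2 GiB                 | 736.137 MB        | 32                | 14404374       |
    | catalog_sales          | 29 GiB                  | 9.225 GB          | 960               | 143997065      |
    | customer               | 256 MiB                 | 111.185 MB        | 12                | 2000000        |
    | customer_address       | 106 MiB                 | 21.386 MB         | 12                | 1000000        |
    | customer_demographics  | 76 MiB                  | 6.468 MB          | 12                | 1920800        |
    | date_dim               | 9.8 MiB                 | 1.823 MB          | 12                | 73049          |
    | dbgen_version          | 111 B                   | 1.184 KB          | 1                 | 1              |
    | household_demographics | 142 KiB                 | 20.372 KB         | 3                 | 7200           |
    | income_band            | 308 B                   | 724.000 B         | 1                 | 20             |
    | inventory              | 7.7 GiB                 | 871.378 MB        | 32                | 399330000      |
    | item                   | 56 MiB                  | 25.314 MB         | 12                | 204000         |
    | promotion              | 122 KiB                 | 73.989 KB         | 1                 | 1000           |
    | reason                 | 1.9 KiB                 | 7.748 KB          | 1                 | 55             |
    | ship_mode              | 1.1 KiB                 | 3.251 KB          | 1                 | 20             |
    | store                  | 104 KiB                 | 54.449 KB         | 1                 | 402            |
    | store_returns          | 3.3 GiB                 | 1.090 GB          | 32                | 28795080       |
    | store_sales            | 38 GiB                  | 12.529 GB         | 960               | 287997024      |
    | time_dim               | 4.8 MiB                 | 1.087 MB          | 12                | 86400          |
    | warehouse              | 1.8 KiB                 | 4.999 KB          | 1                 | 15             |
    | web_page               | 193 KiB                 | 38.753 KB         | 1                 | 2040           |
    | web_returns            | 998 MiB                 | 350.227 MB        | 32                | 7197670        |
    | web_sales              | 15 GiB                  | 4.645 GB          | 960               | 72001237       |
    | web_site               | 6.7 KiB                 | 11.185 KB         | 1                 | 24             |
    | Total                  | 96 G                    | 29.566 GB         | 3096              | 959037906      |
    +------------------------+-------------------------+-------------------+-------------------+----------------+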
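The totals in the table give a rough idea of how much the data shrinks on import. The sketch below works out that ratio from the two "Total" figures; the function name `compression_ratio` is introduced here for illustration only, and the calculation assumes both sizes can be treated as the same unit.

```shell
#!/bin/sh
# Figures taken from the "Total" row of the table above.
raw_size=96          # original text files, as reported by du -sh
imported_size=29.566 # size after import

# Print raw size divided by imported size, rounded to two decimals.
# Assumption: both arguments are expressed in the same unit.
compression_ratio() {
  awk -v raw="$1" -v imp="$2" 'BEGIN { printf "%.2f\n", raw / imp }'
}

compression_ratio "$raw_size" "$imported_size"
```

With these inputs, the imported data occupies roughly a third of the original text size.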

    Creating a Table

    Modify the doris-cluster.conf configuration file.
    Modify the configuration: FE_HOST, PASSWORD, and DB.
    # cat doris-cluster.conf
    
    # Any of FE host
    export FE_HOST='127.0.0.1'
    # http_port in fe.conf
    export FE_HTTP_PORT=8030
    # query_port in fe.conf
    export FE_QUERY_PORT=9030
    # Doris username
    export USER='root'
    # Doris password
    export PASSWORD=''
    # The database where TPC-DS tables located
    export DB='tpcds_100g'
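Before creating tables, it can help to confirm that the values in doris-cluster.conf actually reach the FE. The following is a minimal sketch, assuming the standard `mysql` command-line client is installed; the helper name `mysql_args` is hypothetical and is not part of the toolkit.

```shell
#!/bin/sh
# Build mysql client arguments from the variables set in doris-cluster.conf.
# An empty PASSWORD omits -p entirely, so the client does not prompt.
mysql_args() {
  set -- -h"$FE_HOST" -P"$FE_QUERY_PORT" -u"$USER"
  if [ -n "$PASSWORD" ]; then
    set -- "$@" -p"$PASSWORD"
  fi
  printf '%s\n' "$@"
}

# Usage (requires a reachable FE; values must not contain spaces):
#   . ./doris-cluster.conf
#   mysql $(mysql_args) -e "SHOW DATABASES;"
```

If the connection succeeds, the list of databases is printed; a failure at this point usually means FE_HOST, FE_QUERY_PORT, or the credentials need to be corrected.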
    Run the script to create the tables:
    sh bin/create-tpcds-tables.sh -s 100

    Importing the Data

    sh bin/load-tpcds-data.sh
    
    Start time: Thu Oct 31 21:03:55 CST 2024
    End time: Thu Oct 31 21:14:44 CST 2024
    Finish load tpcds data, Time taken: 649 seconds
    ============================================
    analyze database tpcds_100g
    analyze database tpcds_100g with full with sync;
    analyze database tpcds_100g with full with sync total time: 67 s
    
    MySQL [tpcds_100g]> show data;
    +------------------------+-------------+--------------+------------+
    | TableName              | Size        | ReplicaCount | RemoteSize |
    +------------------------+-------------+--------------+------------+
    | call_center            | 13.784 KB   | 1            | 0.000      |
    | catalog_page           | 1.216 MB    | 3            | 0.000      |
    | catalog_returns        | 736.137 MB  | 32           | 0.000      |
    | catalog_sales          | 9.225 GB    | 960          | 0.000      |
    | customer               | 111.185 MB  | 12           | 0.000      |
    | customer_address       | 21.386 MB   | 12           | 0.000      |
    | customer_demographics  | 6.468 MB    | 12           | 0.000      |
    | date_dim               | 1.823 MB    | 12           | 0.000      |
    | dbgen_version          | 1.184 KB    | 1            | 0.000      |
    | household_demographics | 20.372 KB   | 3            | 0.000      |
    | income_band            | 724.000 B   | 1            | 0.000      |
    | inventory              | 871.378 MB  | 32           | 0.000      |
    | item                   | 25.314 MB   | 12           | 0.000      |
    | promotion              | 73.989 KB   | 1            | 0.000      |
    | reason                 | 7.748 KB    | 1            | 0.000      |
    | ship_mode              | 3.251 KB    | 1            | 0.000      |
    | store                  | 54.449 KB   | 1            | 0.000      |
    | store_returns          | 1.090 GB    | 32           | 0.000      |
    | store_sales            | 11.713 GB   | 960          | 0.000      |
    | time_dim               | 1.087 MB    | 12           | 0.000      |
    | warehouse              | 4.999 KB    | 1            | 0.000      |
    | web_page               | 38.753 KB   | 1            | 0.000      |
    | web_returns            | 350.227 MB  | 32           | 0.000      |
    | web_sales              | 4.645 GB    | 960          | 0.000      |
    | web_site               | 11.185 KB   | 1            | 0.000      |
    | Total                  | 28.750 GB   | 3096         | 0.000      |
    | Quota                  | 1024.000 TB | 1073741824   |            |
    | Left                   | 1023.972 TB | 1073738728   |            |
    | Transaction Quota      | 1000        | 1000         |            |
    +------------------------+-------------+--------------+------------+
    29 rows in set (0.02 sec)
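The load log reports 649 seconds for the import, which matches the gap between the start time (21:03:55) and the end time (21:14:44). As a rough sketch, the average import throughput can be derived from that figure together with the raw data size (96 G) and the total row count (959037906) reported earlier. The function name `load_throughput` is introduced here for illustration, and the calculation assumes 1 GB = 1024 MB.

```shell
#!/bin/sh
# Average import throughput.
#   $1 = raw data size in GB
#   $2 = number of rows loaded
#   $3 = elapsed time in seconds
load_throughput() {
  awk -v gb="$1" -v rows="$2" -v secs="$3" 'BEGIN {
    printf "%.1f MB/s, %.2f million rows/s\n", gb * 1024 / secs, rows / secs / 1000000
  }'
}

load_throughput 96 959037906 649
```

These are averages over the whole load; they do not account for differences between large fact tables and small dimension tables.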

    Querying

    # bash bin/run-tpcds-queries.sh -s 100
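Once per-query timings are available, a short summary makes it easier to spot the slowest queries. The sketch below assumes the timings have been collected into lines of the form `<query name> <milliseconds>`; this input format and the function name `summarize_timings` are hypothetical, and the actual output of the query script may differ.

```shell
#!/bin/sh
# Summarize per-query timings read from stdin.
# Assumed input (hypothetical format): "<query name> <milliseconds>" per line.
summarize_timings() {
  awk '
    NF == 2 {
      total += $2
      if ($2 > max) { max = $2; slowest = $1 }
      n++
    }
    END { printf "queries=%d total_ms=%d slowest=%s (%d ms)\n", n, total, slowest, max }
  '
}

# Example with made-up numbers, for illustration only:
printf 'query1 120\nquery2 340\nquery3 95\n' | summarize_timings
```

Lines that do not have exactly two fields are ignored, so headers or blank lines in the input do not affect the totals.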
    This completes the TPC-DS workflow for the 100 GB data set: data generation, table creation, data import, and querying.
    