Testing Scheme Introduction
Last updated: 2024-12-25 17:55:17
This document describes how to use TPC-DS to test the performance of TCHouse-D. The following sections provide a reference test scheme that uses a 100 GB data set to evaluate TPC-DS query performance.

About TPC-DS Performance Testing

TPC-DS is a decision-support benchmark designed to evaluate the performance of data warehouses and analytical systems. Developed by the TPC (Transaction Processing Performance Council), it is used to compare how well different systems handle complex queries and large-scale data analysis.
TPC-DS is designed to simulate the complex decision-support workloads found in real-world scenarios. It tests system performance through a series of complex queries and data operations, including joins, aggregations, sorts, filters, and subqueries. These query patterns cover both simple and complex scenarios, such as report generation, data mining, and OLAP (online analytical processing).

Testing Scheme Introduction

Test Environment Preparation

Hardware Environment

In the reference scheme provided in this document, the tested cluster consists of 3 FEs and 3 BEs, with the FE and BE processes deployed on separate nodes. The specifications are shown below. Note that an actual test does not consume this much hardware.
+-----------------+---------------------------------------------------------------+
| Node Type       | Node Specifications                                           |
+-----------------+---------------------------------------------------------------+
| 3 FEs, standard | CPU: 16 cores; Memory: 64 GB; Hard disk: Enhanced SSD 200 GB  |
| 3 BEs, standard | CPU: 16 cores; Memory: 64 GB; Hard disk: Enhanced SSD 1000 GB |
+-----------------+---------------------------------------------------------------+

Software Version

Tencent Cloud TCHouse-D 2.1.7

Test Script Preparation

Download the TPC-DS toolkit from Toolkit Address and compile it.

TPC-DS 100 GB Data Testing

Generate a 100 GB data set.

sh bin/gen-tpcds-data.sh -s 100
The data generated after execution is as follows:
# du -sh bin/tpcds-data/
96G bin/tpcds-data/
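+------------------------+-------------------------+-------------------+-------------------+----------------+
| Table Name             | Original Text File Size | Size After Import | Number of Buckets | Number of Rows |
+------------------------+-------------------------+-------------------+-------------------+----------------+
| call_center            | 9.2 KiB                 | 13.784 KB         | 1                 | 30             |
| catalog_page           | 2.8 MiB                 | 1.216 MB          | 3                 | 20400          |
| catalog_returns        | 2.2 GiB                 | 736.137 MB        | 32                | 14404374       |
| catalog_sales          | 29 GiB                  | 9.225 GB          | 960               | 143997065      |
| customer               | 256 MiB                 | 111.185 MB        | 12                | 2000000        |
| customer_address       | 106 MiB                 | 21.386 MB         | 12                | 1000000        |
| customer_demographics  | 76 MiB                  | 6.468 MB          | 12                | 1920800        |
| date_dim               | 9.8 MiB                 | 1.823 MB          | 12                | 73049          |
| dbgen_version          | 111 B                   | 1.184 KB          | 1                 | 1              |
| household_demographics | 142 KiB                 | 20.372 KB         | 3                 | 7200           |
| income_band            | 308 B                   | 724.000 B         | 1                 | 20             |
| inventory              | 7.7 GiB                 | 871.378 MB        | 32                | 399330000      |
| item                   | 56 MiB                  | 25.314 MB         | 12                | 204000         |
| promotion              | 122 KiB                 | 73.989 KB         | 1                 | 1000           |
| reason                 | 1.9 KiB                 | 7.748 KB          | 1                 | 55             |
| ship_mode              | 1.1 KiB                 | 3.251 KB          | 1                 | 20             |
| store                  | 104 KiB                 | 54.449 KB         | 1                 | 402            |
| store_returns          | 3.3 GiB                 | 1.090 GB          | 32                | 28795080       |
| store_sales            | 38 GiB                  | 12.529 GB         | 960               | 287997024      |
| time_dim               | 4.8 MiB                 | 1.087 MB          | 12                | 86400          |
| warehouse              | 1.8 KiB                 | 4.999 KB          | 1                 | 15             |
| web_page               | 193 KiB                 | 38.753 KB         | 1                 | 2040           |
| web_returns            | 998 MiB                 | 350.227 MB        | 32                | 7197670        |
| web_sales              | 15 GiB                  | 4.645 GB          | 960               | 72001237       |
| web_site               | 6.7 KiB                 | 11.185 KB         | 1                 | 24             |
+------------------------+-------------------------+-------------------+-------------------+----------------+
| Total                  | 96 G                    | 29.566 GB         | 3096              | 959037906      |
+------------------------+-------------------------+-------------------+-------------------+----------------+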
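
The Total row gives a rough idea of how much the data shrinks once it is imported. The following is a minimal sketch of that arithmetic, using the two totals from the table. It is not part of the toolkit. Note that du reports binary units while the imported size is whatever the cluster reports, so the result should be read as an approximation.

```shell
# Totals copied from the table: raw text size and size after import.
raw_gb=96
stored_gb=29.566

# POSIX shell arithmetic is integer-only, so awk does the division.
ratio=$(awk -v raw="$raw_gb" -v stored="$stored_gb" 'BEGIN { printf "%.2f", raw / stored }')
echo "compression ratio: ${ratio}x"
```

With these figures the script prints a ratio of about 3.25, meaning the imported data takes roughly a third of the space of the original text files.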

Creating a Table

Edit the doris-cluster.conf configuration file and set FE_HOST, PASSWORD, and DB to match your cluster.
# cat doris-cluster.conf

# Any of FE host
export FE_HOST='127.0.0.1'
# http_port in fe.conf
export FE_HTTP_PORT=8030
# query_port in fe.conf
export FE_QUERY_PORT=9030
# Doris username
export USER='root'
# Doris password
export PASSWORD=''
# The database where the TPC-DS tables are located
export DB='tpcds_100g'
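
Before creating tables, it can help to confirm that the required settings are filled in. The sketch below is a hypothetical pre-flight check, not part of the toolkit: the check_conf function is invented for illustration, and the values are inlined here so the example is self-contained (in practice you would source doris-cluster.conf instead). DB is set to tpcds_100g, the database name that appears in the import log later in this document.

```shell
# Values as they would be exported by doris-cluster.conf (inlined for this sketch).
FE_HOST='127.0.0.1'
FE_QUERY_PORT=9030
USER='root'
PASSWORD=''
DB='tpcds_100g'

# Hypothetical helper: report the first required setting that is empty.
# PASSWORD is skipped because an empty password is valid for a default root account.
check_conf() {
  for name in FE_HOST FE_QUERY_PORT USER DB; do
    eval "value=\${$name}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      return 1
    fi
  done
  echo "ok: $USER@$FE_HOST:$FE_QUERY_PORT/$DB"
}

check_conf
```

If the check passes and a MySQL client is installed, a connection test such as `mysql -h"$FE_HOST" -P"$FE_QUERY_PORT" -u"$USER" -e 'SHOW DATABASES;'` should confirm that the FE is reachable on the query port.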
Create the tables:
sh bin/create-tpcds-tables.sh -s 100

Importing the Data

sh bin/load-tpcds-data.sh

Start time: Thu Oct 31 21:03:55 CST 2024
End time: Thu Oct 31 21:14:44 CST 2024
Finish load tpcds data, Time taken: 649 seconds
============================================
analyze database tpcds_100g
analyze database tpcds_100g with full with sync;
analyze database tpcds_100g with full with sync total time: 67 s

MySQL [tpcds_100g]> show data;
+------------------------+-------------+--------------+------------+
| TableName | Size | ReplicaCount | RemoteSize |
+------------------------+-------------+--------------+------------+
| call_center | 13.784 KB | 1 | 0.000 |
| catalog_page | 1.216 MB | 3 | 0.000 |
| catalog_returns | 736.137 MB | 32 | 0.000 |
| catalog_sales | 9.225 GB | 960 | 0.000 |
| customer | 111.185 MB | 12 | 0.000 |
| customer_address | 21.386 MB | 12 | 0.000 |
| customer_demographics | 6.468 MB | 12 | 0.000 |
| date_dim | 1.823 MB | 12 | 0.000 |
| dbgen_version | 1.184 KB | 1 | 0.000 |
| household_demographics | 20.372 KB | 3 | 0.000 |
| income_band | 724.000 B | 1 | 0.000 |
| inventory | 871.378 MB | 32 | 0.000 |
| item | 25.314 MB | 12 | 0.000 |
| promotion | 73.989 KB | 1 | 0.000 |
| reason | 7.748 KB | 1 | 0.000 |
| ship_mode | 3.251 KB | 1 | 0.000 |
| store | 54.449 KB | 1 | 0.000 |
| store_returns | 1.090 GB | 32 | 0.000 |
| store_sales | 11.713 GB | 960 | 0.000 |
| time_dim | 1.087 MB | 12 | 0.000 |
| warehouse | 4.999 KB | 1 | 0.000 |
| web_page | 38.753 KB | 1 | 0.000 |
| web_returns | 350.227 MB | 32 | 0.000 |
| web_sales | 4.645 GB | 960 | 0.000 |
| web_site | 11.185 KB | 1 | 0.000 |
| Total | 28.750 GB | 3096 | 0.000 |
| Quota | 1024.000 TB | 1073741824 | |
| Left | 1023.972 TB | 1073738728 | |
| Transaction Quota | 1000 | 1000 | |
+------------------------+-------------+--------------+------------+
29 rows in set (0.02 sec)
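
The import log can be turned into a throughput figure. The sketch below is illustrative only and is not part of the toolkit. It takes the start and end times from the log, the 96G raw size reported by du, and the total row count, and assumes both timestamps fall on the same day.

```shell
# Figures copied from the import log, the du output, and the row-count total.
start='21:03:55'
end='21:14:44'
raw_gib=96
rows=959037906

# Convert HH:MM:SS to seconds since midnight; awk reads "03" as decimal 3.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

elapsed=$(( $(to_seconds "$end") - $(to_seconds "$start") ))
echo "elapsed: ${elapsed} s"

# Raw text volume loaded per second, in MiB.
mib_per_s=$(awk -v g="$raw_gib" -v s="$elapsed" 'BEGIN { printf "%.1f", g * 1024 / s }')
rows_per_s=$(( rows / elapsed ))
echo "throughput: ${mib_per_s} MiB/s, ${rows_per_s} rows/s"
```

The elapsed time comes out to 649 seconds, matching the "Time taken" line in the log, which works out to roughly 151.5 MiB/s of raw text, or about 1.48 million rows per second.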

Querying

# bash bin/run-tpcds-queries.sh -s 100
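
To time a single statement outside the toolkit script, a small wrapper can be enough. The sketch below is a hypothetical helper, not something the toolkit provides. It assumes GNU date (for the %N nanosecond format) and times an arbitrary command, with sleep standing in for a real query.

```shell
# Hypothetical helper: run a command and print its wall-clock time in milliseconds.
# Requires GNU date, which supports %N (nanoseconds).
time_ms() {
  start_ns=$(date +%s%N)
  "$@" >/dev/null 2>&1
  end_ns=$(date +%s%N)
  echo $(( (end_ns - start_ns) / 1000000 ))
}

# Run the command three times and keep the fastest run, which reduces
# the influence of cold caches on the reported figure.
best_of_three() {
  best=''
  for i in 1 2 3; do
    t=$(time_ms "$@")
    if [ -z "$best" ] || [ "$t" -lt "$best" ]; then
      best=$t
    fi
  done
  echo "$best"
}

# Stand-in workload; replace with a real client invocation.
best_of_three sleep 0.1
```

Against a cluster, the stand-in could be replaced with a client call such as `best_of_three mysql -h"$FE_HOST" -P"$FE_QUERY_PORT" -u"$USER" -D"$DB" -e 'SELECT COUNT(*) FROM store_sales;'`, assuming the variables from doris-cluster.conf are set in the current shell.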
This completes the TPC-DS workflow for the 100 GB data set: data generation, table creation, data import, and querying.
