tencent cloud

Feedback

Join and Subquery

Last updated: 2024-01-06 17:34:55
    For distributed instances, the data is sharded horizontally across nodes. To improve the performance, we recommend that you optimize the table structure and SQL statement to avoid cross-node data storage.

    Multiple sharded tables with the same shardkey

    MySQL > select * from test1 join test2 where test1.a=test2.a;
    +---+------+---------+---+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+---+------+---------------+
    | 1 | 2 | record1 | 1 | 3 | test2_record1 |
    | 2 | 3 | record2 | 2 | 3 | test2_record2 |
    +---+------+---------+---+------+---------------+
    2 rows in set (0.00 sec)
    
    MySQL > select * from test1 left join test2 on test1.a<test2.a where test1.a=1;
    +---+------+---------+------+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+------+------+---------------+
    | 1 | 2 | record1 | 2 | 3 | test2_record2 |
    +---+------+---------+------+------+---------------+
    1 row in set (0.00 sec)
    
    MySQL> select * from test1 where test1.a in (select a from test2);
    +---+------+---------+
    | a | b | c |
    +---+------+---------+
    | 1 | 2 | record1 |
    | 2 | 3 | record2 |
    +---+------+---------+
    2 rows in set (0.00 sec)
    
    MySQL> select a, count(1) from test1 where exists (select * from test2 where test2.a=test1.a) group by a;
    +---+----------+
    | a | count(1) |
    +---+----------+
    | 1 | 1 |
    | 2 | 1 |
    +---+----------+
    2 rows in set (0.00 sec)
    
    MySQL> select distinct count(1) from test1 where exists (select * from test2 where test2.a=test1.a) group by a;
    +----------+
    | count(1) |
    +----------+
    | 1 |
    +----------+
    1 row in set (0.00 sec)
    
    MySQL> select count(distinct a) from test1 where exists (select * from test2 where test2.a=test1.a);
    +-------------------+
    | count(distinct a) |
    +-------------------+
    | 2 |
    +-------------------+
    1 row in set (0.00 sec)

    Non-sharded tables

    mysql> create table noshard_table ( a int, b int key);
    Query OK, 0 rows affected (0.02 sec)
    
    mysql> create table noshard_table_2 ( a int, b int key);
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> select * from noshard_table,noshard_table_2;
    Empty set (0.00 sec)
    
    mysql> insert into noshard_table (a,b) values(1,2),(3,4);
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Duplicates: 0 Warnings: 0
    
    mysql> insert into noshard_table_2 (a,b) values(10,20),(30,40);
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Duplicates: 0 Warnings: 0
    
    mysql> select * from noshard_table,noshard_table_2;
    +------+---+------+----+
    | a | b | a | b |
    +------+---+------+----+
    | 1 | 2 | 10 | 20 |
    | 3 | 4 | 10 | 20 |
    | 1 | 2 | 30 | 40 |
    | 3 | 4 | 30 | 40 |
    +------+---+------+----+
    4 rows in set (0.00 sec)

    Broadcast table

    MySQL> create table global_test(a int key, b int)shardkey=noshardkey_allset;
    Query OK, 0 rows affected (0.00 sec)
    
    MySQL> insert into global_test(a, b) values(1,1),(2,2);
    Query OK, 2 rows affected (0.00 sec)
    
    MySQL> select * from test1, global_test;
    +---+------+---------+---+------+
    | a | b | c | a | b |
    +---+------+---------+---+------+
    | 1 | 2 | record1 | 1 | 1 |
    | 2 | 3 | record2 | 1 | 1 |
    | 1 | 2 | record1 | 2 | 2 |
    | 2 | 3 | record2 | 2 | 2 |
    +---+------+---------+---+------+
    4 rows in set (0.00 sec)

    Derived table where the subquery contains the shardkey

    mysql> select a from (select * from test1 where a=1) as t;
    +---+
    | a |
    +---+
    | 1 |
    +---+
    1 row in set (0.00 sec)
    Note:
    The result can be returned even if the shardkey is not specified in the subquery.

    Complex SQL

    For SQL statements that cannot meet the requirements as recommended, cross-node data interaction is required, which will compromise the performance: This includes:
    Queries with subqueries
    JOIN queries for multiple tables where the partition fields (shardkey) of the participating tables are different or the tables are of different types (such as non-sharded tables and sharded tables).
    For such complex queries, the data that actually participates in the query is extracted from the backend database and stored in the local temp table through condition pushdown, and then the data in the temp table is computed.
    Therefore, you need to explicitly specify the conditions of the tables participating in the query to avoid performance degradation due to the extraction of large amounts of data.
    mysql> create table test1 ( a int key, b int, c char(20) ) shardkey=a;
    Query OK, 0 rows affected (1.56 sec)
    
    mysql> create table test2 ( a int key, d int, e char(20) ) shardkey=a;
    Query OK, 0 rows affected (1.46 sec)
    
    mysql> insert into test1 (a,b,c) values(1,2,"record1"),(2,3,"record2");
    Query OK, 2 rows affected (0.02 sec)
    
    mysql> insert into test2 (a,d,e) values(1,3,"test2_record1"),(2,3,"test2_record2");
    Query OK, 2 rows affected (0.02 sec)
    
    mysql> select * from test1 join test2 on test1.b=test2.d;
    +---+------+---------+---+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+---+------+---------------+
    | 2 | 3 | record2 | 1 | 3 | test2_record1 |
    | 2 | 3 | record2 | 2 | 3 | test2_record2 |
    +---+------+---------+---+------+---------------+
    2 rows in set (0.00 sec)
    
    MySQL> select * from test1 where exists (select * from test2 where test2.a=test1.b);
    +---+------+---------+
    | a | b | c |
    +---+------+---------+
    | 1 | 2 | record1 |
    +---+------+---------+
    1 row in set (0.00 sec)
    Distributed instances also support many complex UPDATE, DELETE, and INSERT operations.
    Note that such queries are implemented based on the corresponding SELECT statement, so you also need to load the data to the gateway's temp table. We recommend that you specify the query condition as precisely as possible to avoid the potential performance loss caused by loading high amounts of data. In addition, the gateway won't lock the loaded data by default, which is slightly different from what MySQL will do. To lock the data, you can modify the proxy configuration.
    MySQL [th]> update test1 set test1.c="record" where exists(select 1 from test2 where test1.b=test2.d);
    Query OK, 1 row affected (0.00 sec)
    
    MySQL [th]> update test1, test2 set test1.b=2 where test1.b=test2.d;
    Query OK, 1 row affected (0.00 sec)
    
    MySQL [th]> insert into test1 select cast(rand()*1024 as unsigned), d, e from test2;
    Query OK, 2 rows affected (0.00 sec)
    
    MySQL [th]> delete from test1 where b in (select b from test2);
    Query OK, 6 rows affected (0.00 sec)
    
    MySQL [th]> delete from test2.* using test1 right join test2 on test1.a=test2.a where test1.a is null;
    Query OK, 2 rows affected (0.00 sec)
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support