tencent cloud

文档反馈

JOIN 和子查询

最后更新时间:2024-01-06 12:31:07
    对于分布式实例,数据水平拆分在各个节点,为提高性能,建议优先优化表结构和 SQL,尽量使用不跨节点的方式。

    推荐方式

    多个分表,带有分表键相等的条件

    MySQL > select * from test1 join test2 where test1.a=test2.a;
    +---+------+---------+---+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+---+------+---------------+
    | 1 | 2 | record1 | 1 | 3 | test2_record1 |
    | 2 | 3 | record2 | 2 | 3 | test2_record2 |
    +---+------+---------+---+------+---------------+
    2 rows in set (0.00 sec)
    
    MySQL > select * from test1 left join test2 on test1.a<test2.a where test1.a=1;
    +---+------+---------+------+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+------+------+---------------+
    | 1 | 2 | record1 | 2 | 3 | test2_record2 |
    +---+------+---------+------+------+---------------+
    1 row in set (0.00 sec)
    
    MySQL> select * from test1 where test1.a in (select a from test2);
    +---+------+---------+
    | a | b | c |
    +---+------+---------+
    | 1 | 2 | record1 |
    | 2 | 3 | record2 |
    +---+------+---------+
    2 rows in set (0.00 sec)
    
    MySQL> select a, count(1) from test1 where exists (select * from test2 where test2.a=test1.a) group by a;
    +---+----------+
    | a | count(1) |
    +---+----------+
    | 1 | 1 |
    | 2 | 1 |
    +---+----------+
    2 rows in set (0.00 sec)
    
    MySQL> select distinct count(1) from test1 where exists (select * from test2 where test2.a=test1.a) group by a;
    +----------+
    | count(1) |
    +----------+
    | 1 |
    +----------+
    1 row in set (0.00 sec)
    
    MySQL> select count(distinct a) from test1 where exists (select * from test2 where test2.a=test1.a);
    +-------------------+
    | count(distinct a) |
    +-------------------+
    | 2 |
    +-------------------+
    1 row in set (0.00 sec)

    均为单表

    mysql> create table noshard_table ( a int, b int key);
    Query OK, 0 rows affected (0.02 sec)
    
    mysql> create table noshard_table_2 ( a int, b int key);
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> select * from noshard_table,noshard_table_2;
    Empty set (0.00 sec)
    
    mysql> insert into noshard_table (a,b) values(1,2),(3,4);
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Duplicates: 0 Warnings: 0
    
    mysql> insert into noshard_table_2 (a,b) values(10,20),(30,40);
    Query OK, 2 rows affected (0.00 sec)
    Records: 2 Duplicates: 0 Warnings: 0
    
    mysql> select * from noshard_table,noshard_table_2;
    +------+---+------+----+
    | a | b | a | b |
    +------+---+------+----+
    | 1 | 2 | 10 | 20 |
    | 3 | 4 | 10 | 20 |
    | 1 | 2 | 30 | 40 |
    | 3 | 4 | 30 | 40 |
    +------+---+------+----+
    4 rows in set (0.00 sec)

    广播表

    MySQL> create table global_test(a int key, b int)shardkey=noshardkey_allset;
    Query OK, 0 rows affected (0.00 sec)
    
    MySQL> insert into global_test(a, b) values(1,1),(2,2);
    Query OK, 2 rows affected (0.00 sec)
    
    MySQL> select * from test1, global_test;
    +---+------+---------+---+------+
    | a | b | c | a | b |
    +---+------+---------+---+------+
    | 1 | 2 | record1 | 1 | 1 |
    | 2 | 3 | record2 | 1 | 1 |
    | 1 | 2 | record1 | 2 | 2 |
    | 2 | 3 | record2 | 2 | 2 |
    +---+------+---------+---+------+
    4 rows in set (0.00 sec)

    子查询带有 shardkey 的 derived table

    mysql> select a from (select * from test1 where a=1) as t;
    +---+
    | a |
    +---+
    | 1 |
    +---+
    1 row in set (0.00 sec)
    说明:
    子查询时不指定 shardkey,也可查询结果。

    复杂 SQL

    对于不能满足推荐方式的 SQL,由于需要做跨节点的数据交互,所以性能会差一些。 包括:
    包含子查询的查询。
    多表的 join 查询,且参与查询的各表的分区字段(shardkey)不相等,或者同时涉及不同类型的表,例如单表和分表。
    对于这类复杂查询,通过条件下推,将真正参与查询的数据从后端数据库中抽取出来,存放在本地的临时表中,然后对临时表中的数据进行计算。
    因此用户需要明确指定参与查询的表的条件,避免因为抽取大量数据而性能受损。
    mysql> create table test1 ( a int key, b int, c char(20) ) shardkey=a;
    Query OK, 0 rows affected (1.56 sec)
    
    mysql> create table test2 ( a int key, d int, e char(20) ) shardkey=a;
    Query OK, 0 rows affected (1.46 sec)
    
    mysql> insert into test1 (a,b,c) values(1,2,"record1"),(2,3,"record2");
    Query OK, 2 rows affected (0.02 sec)
    
    mysql> insert into test2 (a,d,e) values(1,3,"test2_record1"),(2,3,"test2_record2");
    Query OK, 2 rows affected (0.02 sec)
    
    mysql> select * from test1 join test2 on test1.b=test2.d;
    +---+------+---------+---+------+---------------+
    | a | b | c | a | d | e |
    +---+------+---------+---+------+---------------+
    | 2 | 3 | record2 | 1 | 3 | test2_record1 |
    | 2 | 3 | record2 | 2 | 3 | test2_record2 |
    +---+------+---------+---+------+---------------+
    2 rows in set (0.00 sec)
    
    MySQL> select * from test1 where exists (select * from test2 where test2.a=test1.b);
    +---+------+---------+
    | a | b | c |
    +---+------+---------+
    | 1 | 2 | record1 |
    +---+------+---------+
    1 row in set (0.00 sec)
    分布式实例还支持丰富的复杂 update/delete/insert 操作。
    需要注意的是,这类查询是在与之对应的 select 基础上实现的,因此也需要将数据加载至网关临时表,建议用户尽量在查询中指定明确的查询条件,避免大量数据的加载带来性能损耗。 另外,网关在加载数据时默认不会对加载的数据进行上锁,这与官方的 MySQL 行为存在略微的差异;如需加锁可以通过修改 proxy 配置来实现。
    MySQL [th]> update test1 set test1.c="record" where exists(select 1 from test2 where test1.b=test2.d);
    Query OK, 1 row affected (0.00 sec)
    
    MySQL [th]> update test1, test2 set test1.b=2 where test1.b=test2.d;
    Query OK, 1 row affected (0.00 sec)
    
    MySQL [th]> insert into test1 select cast(rand()*1024 as unsigned), d, e from test2;
    Query OK, 2 rows affected (0.00 sec)
    
    MySQL [th]> delete from test1 where b in (select b from test2);
    Query OK, 6 rows affected (0.00 sec)
    
    MySQL [th]> delete from test2.* using test1 right join test2 on test1.a=test2.a where test1.a is null;
    Query OK, 2 rows affected (0.00 sec)
    联系我们

    联系我们,为您的业务提供专属服务。

    技术支持

    如果你想寻求进一步的帮助,通过工单与我们进行联络。我们提供7x24的工单服务。

    7x24 电话支持