VALUES LESS THAN (...). The system will use the upper bound of the previous partition as the lower bound of the partition, and generate a left-closed and right-open interval. It also supports specifying both the upper and lower bounds by VALUES [...), and generate a left-closed right-open interval.VALUES [...) . Here is an example to illustrate the change of partition range when using VALUES LESS THAN (...) statement for adding or deleting partition .-- Range PartitionCREATE TABLE IF NOT EXISTS example_db.expamle_range_tbl(`user_id` LARGEINT NOT NULL COMMENT "user id",`date` DATE NOT NULL COMMENT "data import date",`timestamp` DATETIME NOT NULL COMMENT "data import timestamp",`city` VARCHAR(20) COMMENT "user's city",`age` SMALLINT COMMENT "user's age",`sex` TINYINT COMMENT "user's gender",`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "last visit time of user",`cost` BIGINT SUM DEFAULT "0" COMMENT "total consumption of user",`max_dwell_time` INT MAX DEFAULT "0" COMMENT "user's maximum stay time",`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "user's minimum stay time")ENGINE=OLAPAGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)PARTITION BY RANGE(`date`)(PARTITION `p201701` VALUES LESS THAN ("2017-02-01"),PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),PARTITION `p201703` VALUES LESS THAN ("2017-04-01"))DISTRIBUTED BY HASH(`user_id`) BUCKETS 16PROPERTIES("replication_num" = "3");
expamle_range_tbl example, after the table is created, view the partition, and you will find the following three partitions are automatically generated:show partitions from expamle_range_tbl;
p201701: [MIN_VALUE, 2017-02-01)p201702: [2017-02-01, 2017-03-01)p201703: [2017-03-01, 2017-04-01)
alter table expamle_range_tbl add partition p201705 VALUES LESS THAN ("2017-06-01");show partitions from expamle_range_tbl;
p201701: [MIN_VALUE, 2017-02-01)p201702: [2017-02-01, 2017-03-01)p201703: [2017-03-01, 2017-04-01)p201705: [2017-04-01, 2017-06-01)
alter table expamle_range_tbl drop partition p201703;show partitions from expamle_range_tbl;
p201701: [MIN_VALUE, 2017-02-01)p201702: [2017-02-01, 2017-03-01)p201705: [2017-04-01, 2017-06-01)
alter table expamle_range_tbl drop partition p201702;show partitions from expamle_range_tbl;
p201701: [MIN_VALUE, 2017-02-01)p201705: [2017-04-01, 2017-06-01)
alter table expamle_range_tbl add partition p201702new VALUES LESS THAN ("2017-03-01");show partitions from expamle_range_tbl;
p201701: [MIN_VALUE, 2017-02-01)p201702new: [2017-02-01, 2017-03-01)p201705: [2017-04-01, 2017-06-01)
alter table expamle_range_tbl drop partition p201701;alter table expamle_range_tbl add partition p201612 VALUES LESS THAN ("2017-01-01");show partitions from expamle_range_tbl;
p201612: [MIN_VALUE, 2017-01-01)p201702new: [2017-02-01, 2017-03-01)p201705: [2017-04-01, 2017-06-01)
VALUES LESS THAN statement, the lower bound of one partition is the upper bound of its previous partition.-- Range PartitionCREATE TABLE IF NOT EXISTS example_db.expamle_range_multi_partiton_key_tbl(`user_id` LARGEINT NOT NULL COMMENT "user id",`date` DATE NOT NULL COMMENT "data import date",`timestamp` DATETIME NOT NULL COMMENT "data import timestamp",`city` VARCHAR(20) COMMENT "user's city",`age` SMALLINT COMMENT "user's age",`sex` TINYINT COMMENT "user's gender",`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "last visit time of user",`cost` BIGINT SUM DEFAULT "0" COMMENT "total consumption of user",`max_dwell_time` INT MAX DEFAULT "0" COMMENT "user's maximum stay time",`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "user's minimum stay time")ENGINE=OLAPAGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)PARTITION BY RANGE(`date`, `user_id`)(PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01"))DISTRIBUTED BY HASH(`user_id`) BUCKETS 16PROPERTIES("replication_num" = "3");
date (DATE type) and user_id (INT type) as partition columns. The final result of the sample partition is as follows:show partitions from expamle_range_multi_partiton_key_tbl;
* p201701_1000: [(MIN_VALUE, MIN_VALUE), ("2017-02-01", "1000") )* p201702_2000: [("2017-02-01", "1000"), ("2017-03-01", "2000") )* p201703_all: [("2017-03-01", "2000"), ("2017-04-01", MIN_VALUE))
date column by default, so the partition value of the user_id column will be automatically filled with MIN_VALUE. When data is imported, the system will compare them with the partition values in order, and put the data in their corresponding partitions.* Data --> Partition* 2017-01-01, 200 --> p201701_1000* 2017-01-01, 2000 --> p201701_1000* 2017-02-01, 100 --> p201701_1000* 2017-02-01, 2000 --> p201702_2000* 2017-02-15, 5000 --> p201702_2000* 2017-03-01, 2000 --> p201703_all* 2017-03-10, 1 --> p201703_all* 2017-04-01, 1000 --> Unable to import* 2017-05-01, 1000 --> Unable to import
insert into expamle_range_multi_partiton_key_tbl values (200, '2017-01-01', '2017-01-01 12:00:05', 'Beijing', 25, 1, '2017-01-01 12:00:05', 100, 30, 10);insert into expamle_range_multi_partiton_key_tbl values (2000, '2017-01-01', '2017-01-01 16:10:05', 'Shanghai', 33, 1, '2017-01-01 16:10:05', 800, 50, 1);insert into expamle_range_multi_partiton_key_tbl values (200, '2017-02-01', '2017-01-01 16:10:05', 'Guangzhou', 22, 0, '2017-02-01 16:10:05', 80, 200, 1);show partitions from expamle_range_multi_partiton_key_tbl\\G
BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, CHAR, and VARCHAR data types, and the partition values are an enumeration values. Partitions can be only hit if the data is one of the enumeration values in the target partition. List partition supports using VALUES IN (...) to specify the enumeration values contained in each partition.
The following example illustrates how partitions change when adding or deleting a partition.-- List PartitionCREATE TABLE IF NOT EXISTS example_db.expamle_list_tbl(`user_id` LARGEINT NOT NULL COMMENT "user id",`date` DATE NOT NULL COMMENT "data import date",`timestamp` DATETIME NOT NULL COMMENT "data import timestamp",`city` VARCHAR(20) NOT NULL COMMENT "User's city",`age` SMALLINT COMMENT "user's age",`sex` TINYINT COMMENT "user's gender",`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "last visit time of user",`cost` BIGINT SUM DEFAULT "0" COMMENT "total consumption of user",`max_dwell_time` INT MAX DEFAULT "0" COMMENT "user's maximum stay time",`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "user's minimum stay time")ENGINE=olapAGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)PARTITION BY LIST(`city`)(PARTITION `p_cn` VALUES IN ("Beijing", "Shanghai", "Hong Kong"),PARTITION `p_usa` VALUES IN ("New York", "San Francisco"),PARTITION `p_jp` VALUES IN ("Tokyo"))DISTRIBUTED BY HASH(`user_id`) BUCKETS 16PROPERTIES("replication_num" = "3");
example_list_tbl example, after the table is created, the following three partitions will be automatically generated:show partitions from expamle_list_tbl;
p_cn: ("Beijing", "Shanghai", "Hong Kong")p_usa: ("New York", "San Francisco")p_jp: ("Tokyo")
alter table expamle_list_tbl add partition p_uk VALUES IN ("London");show partitions from expamle_list_tbl;
p_cn: ("Beijing", "Shanghai", "Hong Kong")p_usa: ("New York", "San Francisco")p_jp: ("Tokyo")p_uk: ("London")
alter table expamle_list_tbl drop partition p_jp;show partitions from expamle_list_tbl;
p_cn: ("Beijing", "Shanghai", "Hong Kong")p_usa: ("New York", "San Francisco")p_uk: ("London")
-- List PartitionCREATE TABLE IF NOT EXISTS example_db.expamle_list_multi_partiton_key_tbl(`user_id` LARGEINT NOT NULL COMMENT "user id",`date` DATE NOT NULL COMMENT "data import date",`timestamp` DATETIME NOT NULL COMMENT "data import timestamp",`city` VARCHAR(20) NOT NULL COMMENT "User's city",`age` SMALLINT COMMENT "user's age",`sex` TINYINT COMMENT "user's gender",`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "last visit time of user",`cost` BIGINT SUM DEFAULT "0" COMMENT "total consumption of user",`max_dwell_time` INT MAX DEFAULT "0" COMMENT "user's maximum stay time",`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "user's minimum stay time")ENGINE=olapAGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)PARTITION BY LIST(`user_id`, `city`)(PARTITION `p1_city` VALUES IN (("1", "Beijing"), ("1", "Shanghai")),PARTITION `p2_city` VALUES IN (("2", "Beijing"), ("2", "Shanghai")),PARTITION `p3_city` VALUES IN (("3", "Beijing"), ("3", "Shanghai")))DISTRIBUTED BY HASH(`user_id`) BUCKETS 16PROPERTIES("replication_num" = "3");
user_id(INT type) and city(VARCHAR type) as partition columns. The final partition obtained in the above example is as follows:show partitions from expamle_list_multi_partiton_key_tbl;
* p1_city: [("1", "Beijing"), ("1", "Shanghai")]* p2_city: [("2", "Beijing"), ("2", "Shanghai")]* p3_city: [("3", "Beijing"), ("3", "Shanghai")]
* Data ---> partition* 1, Beijing ---> p1_city* 1, Shanghai ---> p1_city* 2, Shanghai ---> p2_city* 3, Beijing ---> p3_city* 1, Tianjin ---> Unable to import* 4, Beijing ---> Unable to import
insert into expamle_list_multi_partiton_key_tbl values (1, '2017-01-01', '2017-01-01 12:00:05', 'Beijing', 25, 1, '2017-01-01 12:00:05', 100, 30, 10);show partitions from expamle_list_multi_partiton_key_tbl\\G
DISTRIBUTED ... statement describes the division rules of data within each partition. If Partition is not used, it describes the division rules of the data in the entire table.ADD PARTITION), you can specify the bucket number of the new partition separately. This feature can help you to deal with data reduction or expansion conveniently.SHOW DATA command. Divide the returned result by the number of replicas to get the data volume of the table.Failed to create partition [xxx] . TimeoutFailed to create partition will also be reported, because as mentioned earlier, Doris will create an unmodifiable default partition for a table without specified partition.
Usually, this error occurs because the tablet creation goes wrong in BE. You can follows the steps below for troubleshooting:Failed to create partition log of the corresponding time point. In the log, find a number pair that looks like {10001-10010}. The first number of the pair is the Backend ID, and the second number is the Tablet ID. For example, the above-mentioned pair of numbers means that on Backend ID 10001, the creation of Tablet ID 10010 failed.Too many open files. The number of open file descriptors exceeds the Linux system limit. In this case, you need to modify the open file descriptor limit of the Linux system.tablet_create_timeout_second=xxx and max_create_table_timeout_second=xxx in fe.conf. The default value of tablet_create_timeout_second is 1 second, and the default value of max_create_table_timeout_second is 60 seconds. The overall timeout is min(tablet_create_timeout_second * replication_num, max_create_table_timeout_second), for specific parameter setting, please see FE Configuration Item .max_dynamic_partition_num and max_multi_partition_num on FE limit the number of autonomous partition creation and batch created partitions respectively.replication_allocation and buckets of the table would be used.dynamic_partition.replication_allocation and dynamic_partition.buckets parameters of the table.dynamic_partition.buckets, only the bucket number of the newly created partitions by this modification will be valid, and it will not be applicable to the already created partitions.Feedback