create external table
, or to a database through create external database
. If there were large amounts of databases or tables in the external data catalog, users would need to create mappings to them one by one, which could be a heavy workload.SHOW DATABASES
and USE DB
. Users can switch Catalogs using the SWITCH command. For example:SWITCH internal;SWITCH hive_catalog;
SHOW DATABASES
and USE DB
. Doris will automatically access the database and table in the catalog. You can view and access data in External Catalogs the same way as doing that in the Internal Catalog. Doris only supports read-only access to data in External Catalogs currently.CREATE CATALOG hive PROPERTIES ('type'='hms','hive.metastore.uris' = 'thrift://172.21.0.1:7004');
SHOW CATALOGS
command:mysql> SHOW CATALOGS;+-----------+-------------+----------+| CatalogId | CatalogName | Type |+-----------+-------------+----------+| 10024 | hive | hms || 0 | internal | internal |+-----------+-------------+----------+
SWITCH
command, and view the databases in it:mysql> SWITCH hive;Query OK, 0 rows affected (0.00 sec)mysql> SHOW DATABASES;+-----------+| Database |+-----------+| default || random || ssb100 || tpch1 || tpch100 || tpch1_orc |+-----------+
mysql> USE tpch100;Database changedmysql> SHOW TABLES;+-------------------+| Tables_in_tpch100 |+-------------------+| customer || lineitem || nation || orders || part || partsupp || region || supplier |+-------------------+
mysql> DESC lineitem;+-----------------+---------------+------+------+---------+-------+| Field | Type | Null | Key | Default | Extra |+-----------------+---------------+------+------+---------+-------+| l_shipdate | DATE | Yes | true | NULL | || l_orderkey | BIGINT | Yes | true | NULL | || l_linenumber | INT | Yes | true | NULL | || l_partkey | INT | Yes | true | NULL | || l_suppkey | INT | Yes | true | NULL | || l_quantity | DECIMAL(15,2) | Yes | true | NULL | || l_extendedprice | DECIMAL(15,2) | Yes | true | NULL | || l_discount | DECIMAL(15,2) | Yes | true | NULL | || l_tax | DECIMAL(15,2) | Yes | true | NULL | || l_returnflag | TEXT | Yes | true | NULL | || l_linestatus | TEXT | Yes | true | NULL | || l_commitdate | DATE | Yes | true | NULL | || l_receiptdate | DATE | Yes | true | NULL | || l_shipinstruct | TEXT | Yes | true | NULL | || l_shipmode | TEXT | Yes | true | NULL | || l_comment | TEXT | Yes | true | NULL | |+-----------------+---------------+------+------+---------+-------+
mysql> SELECT l_shipdate, l_orderkey, l_partkey FROM lineitem limit 10;+------------+------------+-----------+| l_shipdate | l_orderkey | l_partkey |+------------+------------+-----------+| 1998-01-21 | 66374304 | 270146 || 1997-11-17 | 66374304 | 340557 || 1997-06-17 | 66374400 | 6839498 || 1997-08-21 | 66374400 | 11436870 || 1997-08-07 | 66374400 | 19473325 || 1997-06-16 | 66374400 | 8157699 || 1998-09-21 | 66374496 | 19892278 || 1998-08-07 | 66374496 | 9509408 || 1998-10-27 | 66374496 | 4608731 || 1998-07-14 | 66374592 | 13555929 |+------------+------------+-----------+
mysql> SELECT l.l_shipdate FROM hive.tpch100.lineitem l WHERE l.l_partkey IN (SELECT p_partkey FROM internal.db1.part) LIMIT 10;+------------+| l_shipdate |+------------+| 1993-02-16 || 1995-06-26 || 1995-08-19 || 1992-07-23 || 1998-05-23 || 1997-07-12 || 1994-03-06 || 1996-02-07 || 1997-06-01 || 1996-08-23 |+------------+
catalog.database.table
. For example internal.db1.part
in the above snippet.catalog
and database
in the format can be omitted.mysql> SWITCH internal;Query OK, 0 rows affected (0.00 sec)mysql> USE db1;Database changedmysql> INSERT INTO part SELECT * FROM hive.tpch100.part limit 1000;Query OK, 1000 rows affected (0.28 sec){'label':'insert_212f67420c6444d5_9bfc184bf2e7edb8', 'status':'VISIBLE', 'txnId':'4'}
UNION
and INTERVAL
, etc., Doris maps them to an UNSUPPORTED type. Here are examples of queries in a table containing UNSUPPORTED types:k1 INT,k2 INT,k3 UNSUPPORTED,k4 INT
select * from table; // Error: Unsupported type 'UNSUPPORTED_TYPE' in '`k3`select * except(k3) from table; // Query OK.select k1, k3 from table; // Error: Unsupported type 'UNSUPPORTED_TYPE' in '`k3`select k1, k4 from table; // Query OK.
include_database_list
and exclude_database_list
in the Catalog configuration.include_database_list
: Only synchronize the specified databases, separated by ','. Default value is '', meaning to synchronize all databases. db name is case sensitive.exclude_database_list
: Specify databases that do not need to be synchronized, separated by ','. Default value is '', meaning no filters takes effect, and synchronizing all databases. db name is case sensitive.include_database_list
and exclude_database_list
configurations overlap, theexclude_database_list
will take effect first.only_specified_database
. For more information, see JDBC Catalog.Event | Event Behavior and Corresponding Operation |
CREATE DATABASE | Create a database in the corresponding catalog. |
DROP DATABASE | Delete a database in the corresponding catalog. |
ALTER DATABASE | The impact of such alterations mainly includes changes in properties, comments, or default storage location of the database. These changes do not affect Doris' queries in External Catalogs so they will not be synchronized. |
CREATE TABLE | Create a table in the corresponding database. |
DROP TABLE | Delete a table in the corresponding database, and invalidate the cache o that table. |
ALTER TABLE | If it is a renaming, delete the table of the old name, and then create a new table with the new name. Otherwise, invalidate the cache of that table. |
ADD PARTITION | Add a partition to the cached partition list of the corresponding table. |
DROP PARTITION | Delete a partition from the cached partition list of the corresponding table, and invalidate the cache of that partition. |
ALTER PARTITION | If it is a renaming, delete the partition of the old name, and then create a new partition with the new name. Otherwise, invalidate the cache of that partition. |
enable_hms_events_incremental_sync
: This specifies whether to enable the automatic incremental synchronization for metadata, which is disabled by default.hms_events_polling_interval_ms
: This specifies the interval between two reading events, which is set to 10000 by default. (Unit: millisecond).hms_events_batch_size_per_rpc
: This specifies the maximum number of events that are read at a time, which is set to 500 by default.<property><name>hive.metastore.event.db.notification.api.auth</name><value>false</value></property><property><name>hive.metastore.dml.events</name><value>true</value></property><property><name>hive.metastore.transactional.event.listeners</name><value>org.apache.hive.hcatalog.listener.DbNotificationListener</value></property>
enable_hms_events_incremental_sync
to true, and then restart the FE node. You do not need to manually update the metadata before or after the restart.metadata_refresh_interval_sec
in properties, in seconds. If this parameter is set when creating a Catalog, the master node of FE will refresh the Catalog regularly according to the parameter value. Currently, three types are supported:-- Set catalog refresh interval to 20 secondsCREATE CATALOG es PROPERTIES ("type"="es","hosts"="http://127.0.0.1:9200","metadata_refresh_interval_sec"="20");
Was this page helpful?