tencent cloud

Feedback

ODBC External Table

Last updated: 2024-07-04 10:15:36
    Note:
    The content showed in this document is only suitable for Tencent Cloud TCHouse-D v1.1 and below versions. For later versions, it is recommended to use the Multi-Catalog feature to access external data directories.
    ODBC External Table of Doris offers Doris a standard interface (ODBC) to access external tables through database, which saves the tedious work of data import. It enables Doris to access various databases and helps solve data analysis problems of external tables with Doris's own OLAP capabilities:
    Supports various data sources to access Doris.
    Supports joint query between Doris and tables in various data sources for more complex analysis operations.
    Uses insert into to write the query results of Doris to the external data source.
    This document mainly introduces the implementation principles and usage methods of this feature.

    Usage

    Creating ODBC external tables in Doris

    For specific table creation syntax, see: CREATE TABLE.

    Creating ODBC external tables without using resources.

    CREATE EXTERNAL TABLE `baseall_oracle` (
    `k1` decimal(9, 3) NOT NULL COMMENT "",
    `k2` char(10) NOT NULL COMMENT "",
    `k3` datetime NOT NULL COMMENT "",
    `k5` varchar(20) NOT NULL COMMENT "",
    `k6` double NOT NULL COMMENT ""
    ) ENGINE=ODBC
    COMMENT "ODBC"
    PROPERTIES (
    "host" = "192.168.0.1",
    "port" = "8086",
    "user" = "test",
    "password" = "test",
    "database" = "test",
    "table" = "baseall",
    "driver" = "Oracle 19 ODBC driver",
    "odbc_type" = "oracle"
    );

    Creating ODBC external tables through ODBC_Resources (recommended)

    CREATE EXTERNAL RESOURCE `oracle_odbc`
    PROPERTIES (
    "type" = "odbc_catalog",
    "host" = "192.168.0.1",
    "port" = "8086",
    "user" = "test",
    "password" = "test",
    "database" = "test",
    "odbc_type" = "oracle",
    "driver" = "Oracle 19 ODBC driver"
    );
    
    CREATE EXTERNAL TABLE `baseall_oracle` (
    `k1` decimal(9, 3) NOT NULL COMMENT "",
    `k2` char(10) NOT NULL COMMENT "",
    `k3` datetime NOT NULL COMMENT "",
    `k5` varchar(20) NOT NULL COMMENT "",
    `k6` double NOT NULL COMMENT ""
    ) ENGINE=ODBC
    COMMENT "ODBC"
    PROPERTIES (
    "odbc_catalog_resource" = "oracle_odbc",
    "database" = "test",
    "table" = "baseall"
    );
    Parameter description:
    Parameter
    Description
    hosts
    IP address of the external table database.
    driver
    Driver name of the ODBC external table. The name must be consistent with the Driver name in be/conf/odbcinst.ini.
    odbc_type
    Type of the external table database, currently supports Oracle, MySQL, and PostgreSQL.
    user
    Username of the external table database.
    password
    Cipher information for the application
    charset
    Character set used for the database connection.
    Note
    In addition to the above-mentioned parameters, PROPERTIES also supports the special parameters implemented by the ODBC Driver of each database, such as MySQL's sslverify, etc.

    Installation and Configuration of ODBC Driver

    Most mainstream databases provide ODBC access drivers. Users can install corresponding ODBC Driver lib libraries according to the officially recommended method of each database. After installation, search for the Driver Lib library path of the corresponding database, and modify the configuration of be/conf/odbcinst.ini:
    [MySQL Driver]
    Description = ODBC for MySQL
    Driver = /usr/lib64/libmyodbc8w.so
    FileUsage = 1
    The above configuration in [] corresponds to the Driver name. When establishing an external table, you must ensure that the Driver name of the external table is consistent with that in the configuration file.
    Driver= should be filled in according to the actual installation path of Driver on the BE node, which is essentially a path of dynamic library. Here, the pre-dependency of this dynamic library must be met.
    Note
    Remember, the same Driver must be installed on all the BE nodes, in the same installation path, and with the same be/conf/odbcinst.ini configuration.

    Query Usage

    After creating the ODBC external table in Doris, except that the ODBC external tabel cannot use the data model (rollup, pre-aggregation, materialized view, etc.) in Doris, it has no difference with the ordinary Doris tables.
    select * from oracle_table where k1 > 1000 and k3 ='term' or k4 like '%doris';

    Writing Data

    After creating the ODBC external table in Doris, you can directly write data through the insert into statement, and write the results of Doris query into the ODBC external table, or you can import data from one ODBC external table to another ODBC external table.
    insert into oracle_table values(1, "doris");
    insert into oracle_table select * from postgre_table;

    Transaction

    The data of Doris is written to the external table in batches If the import is interrupted in the middle, the previously written data may need to be rolled back. Therefore, the ODBC external table supports data write-in transaction, which is set by the session variable:enable_odbc_transcation.
    set enable_odbc_transcation = true;
    The transaction ensures the atomicity of data writing to the ODBC external table, but it will lower the performance of data writing to a certain extent. You can enable this feature as required.

    Database ODBC Version Correspondence

    CentOS Operating System

    The unixODBC version used is: 2.3.1, Doris 0.15, CentOS 7.9, all installed using yum.

    Mysql

    MySQL Version
    MySQL ODBC Version
    8.0.27
    8.0.27, 8.026
    5.7.36
    5.3.11, 5.3.13
    5.6.51
    5.3.11, 5.3.13
    5.5.62
    5.3.11, 5.3.13

    PostgreSQL

    PostgreSQL yum source rpm package address:
    https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
    This includes all versions of PostgreSQL from 9.x to 14.x, including the corresponding ODBC versions, which you can choose to install as required.
    PostgreSQL Version
    PostgreSQL ODBC Version
    12.9
    postgresql12-odbc-13.02.0000
    13.5
    postgresql13-odbc-13.02.0000
    14.1
    postgresql14-odbc-13.02.0000
    9.6.24
    postgresql96-odbc-13.02.0000
    10.6
    postgresql10-odbc-13.02.0000
    11.6
    postgresql11-odbc-13.02.0000

    Oracle

    Oracle Version
    Oracle ODBC Version
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
    oracle-instantclient19.13-odbc-19.13.0.0.0
    Oracle Database 12c Standard Edition Release 12.2.0.1.0 - 64bit Production
    oracle-instantclient19.13-odbc-19.13.0.0.0
    Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
    oracle-instantclient19.13-odbc-19.13.0.0.0
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    oracle-instantclient19.13-odbc-19.13.0.0.0
    Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
    oracle-instantclient19.13-odbc-19.13.0.0.0
    Oracle ODBC driver version download address:
    https://download.oracle.com/otn_software/linux/instantclient/1913000/oracle-instantclient19.13-sqlplus-19.13.0.0.0-2.x86_64.rpm
    https://download.oracle.com/otn_software/linux/instantclient/1913000/oracle-instantclient19.13-devel-19.13.0.0.0-2.x86_64.rpm
    https://download.oracle.com/otn_software/linux/instantclient/1913000/oracle-instantclient19.13-odbc-19.13.0.0.0-2.x86_64.rpm
    https://download.oracle.com/otn_software/linux/instantclient/1913000/oracle-instantclient19.13-basic-19.13.0.0.0-2.x86_64.rpm

    Ubuntu Operating System

    The unixODBC version used is: 2.3.4, Doris 0.15, Ubuntu 20.04.

    Mysql

    MySQL Version
    MySQL ODBC Version
    8.0.27
    8.0.11, 5.3.13
    So far, only this version has been tested, and other versions will be supplemented after testing.

    PostgreSQL

    PostgreSQL Version
    PostgreSQL ODBC Version
    12.9
    psqlodbc-12.02.0000
    For other versions, you only need to download the ODBC driver version that matches the large version of the database. In this part, we will continue to supplement the test results of other versions in the Ubuntu system.

    Oracle

    The same as the Oracle database and ODBC correspondence in the Centos operating system, use the following methos to install rpm packages in ubuntu. In order to install rpm packages in ubuntu, we also need to install an alien, which is a tool that can convert rpm packages to deb installation packages.
    sudo apt-get install alien
    Then execute the installation of the above four packages:
    sudo alien -i oracle-instantclient19.13-basic-19.13.0.0.0-2.x86_64.rpm
    sudo alien -i oracle-instantclient19.13-devel-19.13.0.0.0-2.x86_64.rpm
    sudo alien -i oracle-instantclient19.13-odbc-19.13.0.0.0-2.x86_64.rpm
    sudo alien -i oracle-instantclient19.13-sqlplus-19.13.0.0.0-2.x86_64.rpm

    Type Matching

    There are different data types among various databases.The matching of the data types of each database and the data types in Doris is listed as follows.

    MySQL

    MySQL
    Doris
    Replacement Scheme
    BOOLEAN
    BOOLEAN
    -
    CHAR
    CHAR
    Only UTF8 encoding is currently supported.
    VARCHAR
    VARCHAR
    Only UTF8 encoding is currently supported.
    DATE
    DATE
    -
    FLOAT
    FLOAT
    -
    TINYINT
    TINYINT
    -
    SMALLINT
    SMALLINT
    -
    INT
    INT
    -
    BIGINT
    BIGINT
    -
    DOUBLE
    DOUBLE
    -
    DATETIME
    DATETIME
    -
    DECIMAL
    DECIMAL
    -

    PostgreSQL

    PostgreSQL
    Doris
    Replacement Scheme
    BOOLEAN
    BOOLEAN
    -
    CHAR
    CHAR
    Only UTF8 encoding is currently supported.
    VARCHAR
    VARCHAR
    Only UTF8 encoding is currently supported.
    DATE
    DATE
    -
    REAL
    FLOAT
    -
    SMALLINT
    SMALLINT
    -
    INT
    INT
    -
    BIGINT
    BIGINT
    -
    DOUBLE
    DOUBLE
    -
    TIMESTAMP
    DATETIME
    -
    DECIMAL
    DECIMAL
    -

    Oracle

    Oracle
    Doris
    Replacement Scheme
    Not supported
    BOOLEAN
    Oracle can use number(1) in replace of boolean.
    CHAR
    CHAR
    -
    VARCHAR
    VARCHAR
    -
    DATE
    DATE
    -
    FLOAT
    FLOAT
    -
    No
    TINYINT
    Oracle can be replaced by NUMBER
    SMALLINT
    SMALLINT
    -
    INT
    INT
    -
    No
    BIGINT
    Oracle can be replaced by NUMBER
    No
    DOUBLE
    Oracle can be replaced by NUMBER
    DATETIME
    DATETIME
    -
    NUMBER
    DECIMAL
    -

    SQLServer

    SQLServer
    Doris
    Replacement Scheme
    BOOLEAN
    BOOLEAN
    -
    CHAR
    CHAR
    Only UTF8 encoding is currently supported.
    VARCHAR
    VARCHAR
    Only UTF8 encoding is currently supported.
    DATE
    DATE
    -
    REAL
    FLOAT
    -
    TINYINT
    TINYINT
    -
    SMALLINT
    SMALLINT
    -
    INT
    INT
    -
    BIGINT
    BIGINT
    -
    FLOAT
    DOUBLE
    -
    DATETIME/DATETIME2
    DATETIME
    -
    DECIMAL/NUMERIC
    DECIMAL
    -

    Best Practice

    Suitable for synchronization of small amounts of data. For example, to synchronize a Mysql table with a million pieces of data to Doris, you can use the ODBC to map the data and use insert into to synchronize the data to Doris. If you want to synchronize a large volume of data, you can use insert into in batches for synchronization (not recommended).

    FAQs

    1. Relation to the original MySQL external table. After accessing the ODBC external table, the original way of accessing the MySQL external table will be gradually phased out. If you have not used the MySQL external table before, it is recommended to directly use the ODBC MySQL external table for newly accessed MySQL tables.
    2. Whether more databases can be supported besides MySQL, Oracle, PostgreSQL, and SQLServer? Currently, Doris has only adapted to MySQL, Oracle, PostgreSQL, and SQLServer. The adaptation to other databases is being planned. In principle, any database that supports ODBC access can be accessed through an ODBC external table. If you want to access other external tables, you are welcome to modify the code and contribute to Doris.
    3. When is it appropriate to access through external tables? Usually, when the data amount of external table is smal and less than 1M, it can be accessed through an external table. Since the external table cannot leverage Doris' capabilities in the storage engine part and will bring additional network cost, it is recommended to decide whether to access through an external table or import the data into Doris based on the actual access latency requirements for query.
    4. How can I solve the problem of garbled text when accessing through Oracle? Add the following parameter in the BE startup script: export NLS_LANG=AMERICAN_AMERICA.AL32UTF8, and restart all BEs.
    5. ANSI Driver or Unicode Driver? The current ODBC supports both ANSI and Unicode Drivers, but Doris only supports the Unicode Driver. Forcing the use of the ANSI Driver may cause errors in query results.
    6. Error driver connect Err: 01000 [unixODBC][Driver Manager]Can't open lib 'Xxx' : file not found (0). The corresponding data Driver is not properly installed on each BE, or the incorrect path is configured in be/conf/odbcinst.ini, or the Driver name is different from be/conf/odbcinst.ini when the table is created.
    7. Error Fail to convert odbc value 'PALO ' TO INT on column:'A'? The column A type conversion of the ODBC external table fails, which indicates the data type of the actual column of the external table differs from that of the mapped column of ODBC. The mapping of the column must be modified.
    8. Why program crashes when the old MySQL table and ODBC external table Drivers are used at the same time? This is a compatibility issue between the My SQL Database Driver and the existing dependency MySQL external table during Doris deployment. We recommend that you solve the problem as follows:
    Method 1: Replace the old MySQL external table with an ODBC external table, recompile the BE, and disable the WITH_MYSQL.
    Method 2: Do not use the latest 8.X MySQL ODBC Driver, but the 5.X MySQL ODBC Driver instead.
    9. Does OBBC external table support filter condition pushdown? The current ODBC external table supports filter conditions pushdown. At present, the MySQL external table can support all condition pushdowns. The functions of other databases different from Doris may cause the pushdown query to fail. Besides MySQL external table, currently other databases do not support function call condition pushdown. You can use the explain query statement to confirm whether Doris push down the required filtering condition.
    10. Why does this error driver connect Err: xxx occur? This is usually because of a failure to connect to the database. Err section indicates the connection errors of different databases. It is usually because error occurs in the configuration. You can check whether an incorrect IP address, port, or password have been configured.
    11. Why garbled characters appear when reading and writing emoji expressions in the mysql external table? When establishing an odbc external table connection, Doris adopts the utf8 encoding by default. However, because the default utf8 encoding in mysql is utf8mb3, it cannot represent the emoji expression that requires 4-byte encoding. Here you must configure charset=utf8mb4 to create a mysql external table, so as to read and write emoji expressions normally.
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support