tencent cloud

Feedback

Synchronizing Data with JDBC

Last updated: 2024-06-27 11:03:50
    You can import data using the INSERT statement via the JDBC protocol. The usage of the INSERT statement is similar to that in MySQL and other databases. The INSERT statement supports the following two syntaxes:
    * INSERT INTO table SELECT ...
    * INSERT INTO table VALUES(...)
    Here we only introduce the second method. For a detailed description of the INSERT command, please see the INSERT command document.

    Single Write

    Single write means that the user directly executes an INSERT command. Example as follows:
    INSERT INTO example_tbl (col1, col2, col3) VALUES (1000, "test", 3.25);
    For Doris, an INSERT command is a complete import transaction. Therefore, whether it's importing one piece of data or multiple pieces of data, we do not recommend using this method to import data in a production environment. Frequent INSERT operations will generate a large number of small files in the storage layer, severely affecting system performance.
    This method is only used for simple offline testing or low-frequency, small-scale operations. Or you can use the following method for batch insertion:
    INSERT INTO example_tbl VALUES
    (1000, "baidu1", 3.25)
    (2000, "baidu2", 4.25)
    (3000, "baidu3", 5.25);
    We recommend that the number of inserts in each batch be as large as possible, such as thousands or even 10,000 at a time. Or you can use PreparedStatement via the method below to perform batch insertion.

    JDBC Example

    Here we provide a simple example of JDBC batch INSERT code:
    package demo.doris;
    
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    
    public class DorisJDBCDemo {
    
    private static final String JDBC_DRIVER = "com.mysql.jdbc.Driver";
    private static final String DB_URL_PATTERN = "jdbc:mysql://%s:%d/%s?rewriteBatchedStatements=true";
    private static final String HOST = "127.0.0.1"; // Leader Node host
    private static final int PORT = 9030; // query_port of Leader Node
    private static final String DB = "demo";
    private static final String TBL = "test_1";
    private static final String USER = "admin";
    private static final String PASSWD = "my_pass";
    
    private static final int INSERT_BATCH_SIZE = 10000;
    
    public static void main(String[] args) {
    insert();
    }
    
    private static void insert() {
    // make sure there are no semicolons (";") at the end
    String query = "insert into " + TBL + " values(?, ?)";
    // Setting a Label to achieve idempotence.
    // String query = "insert into " + TBL + " WITH LABEL my_label values(?, ?)";
    
    Connection conn = null;
    PreparedStatement stmt = null;
    String dbUrl = String.format(DB_URL_PATTERN, HOST, PORT, DB);
    try {
    Class.forName(JDBC_DRIVER);
    conn = DriverManager.getConnection(dbUrl, USER, PASSWD);
    stmt = conn.prepareStatement(query);
    
    for (int i =0; i < INSERT_BATCH_SIZE; i++) {
    stmt.setInt(1, i);
    stmt.setInt(2, i * 100);
    stmt.addBatch();
    }
    
    int[] res = stmt.executeBatch();
    System.out.println(res);
    } catch (Exception e) {
    e.printStackTrace();
    } finally {
    try {
    if (stmt != null) {
    stmt.close();
    }
    } catch (SQLException se2) {
    se2.printStackTrace();
    }
    try {
    if (conn != null) conn.close();
    } catch (SQLException se) {
    se.printStackTrace();
    }
    }
    }
    }
    Please note the following points:
    1. You need to add rewriteBatchedStatements=true parameter to the JDBC connection string and use the PreparedStatement method. Currently, Doris does not support server-side PrepareStatemnt, so the JDBC Driver will batch prepare on the client side. rewriteBatchedStatements=true ensures that the Driver performs batch processing and eventually sends the following INSERT statement to Doris:
    INSERT INTO example_tbl VALUES
    (1000, "tencent1", 3.25)
    (2000, "tencent2", 4.25)
    (3000, "tencent3", 5.25);
    2. Batch Size Since it's done in batches on the client side, if a batch is too large, it will use more memory resources on the client side, so be sure to pay attention.
    Note
    Doris will support server-side PrepareStatemnt in the future, stay tuned.
    3. Import Atomicity Just like other import methods, the INSERT operation itself also supports atomicity. Each INSERT operation is an import transaction, ensuring all data within an INSERT are written atomically. As mentioned earlier, we recommend importing data in "batches" when using INSERT, rather than single insertions. At the same time, we can set a Label for each INSERT operation. Through the label mechanisim , the operation's Idempotence and atomicity can be guaranteed, and the data will not be lost or heavy in the end. For specific usage of Labels in INSERT, please see the INSERT document.
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support