Importing Data into Doris from Logstash

Last updated: 2024-06-27 11:05:22
    To import data from Logstash into Doris, you need the Doris output plugin for Logstash. The plugin communicates with the Doris FE over its HTTP interface and imports data through Doris Stream Load.
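
To make the HTTP interaction concrete, here is a minimal Python sketch of what a Stream Load request looks like. It is not the plugin's code. It assumes the standard Stream Load endpoint `/api/{db}/{table}/_stream_load` with HTTP basic authentication, and the function name `build_stream_load_request` and the label value are hypothetical. The sketch only builds the URL and headers and does not send anything.

```python
import base64


def build_stream_load_request(fe_host, db, table, user, password, label,
                              column_separator=","):
    """Return (url, headers) for a Doris Stream Load HTTP PUT request.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    url = f"{fe_host}/api/{db}/{table}/_stream_load"
    # Stream Load authenticates with HTTP basic auth.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",              # expected by Stream Load
        "label": label,                        # import identifier
        "column_separator": column_separator,  # import option sent as a header
    }
    return url, headers


url, headers = build_stream_load_request(
    "http://fehost:8030", "db_name", "table_name",
    "user_name", "password", "example_label",
)
print(url)
```

The data to import would be sent as the body of a PUT request to this URL. Import options such as `column_separator` travel as HTTP headers.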

    Installation and Compilation

    1. Download Source Code

    The plugin source code is part of the Doris source tree, so download the Doris source code first.

    2. Compile

    Run the following command in the extension/logstash/ directory of the Doris source code:
    gem build logstash-output-doris.gemspec
    This produces the file logstash-output-doris-{version}.gem in the same directory.

    3. Plugin Installation

    Copy logstash-output-doris-{version}.gem to the Logstash installation directory, then run the following command to install the logstash-output-doris plugin:
    ./bin/logstash-plugin install logstash-output-doris-{version}.gem

    Configuration

    Sample code

    Create a new configuration file named logstash-doris.conf in the config directory with the following content:
    output {
      doris {
        http_hosts => [ "http://fehost:8030" ]
        user => "user_name"
        password => "password"
        db => "db_name"
        table => "table_name"
        label_prefix => "label_prefix"
        column_separator => ","
      }
    }

    Configuration Instructions

    Connection-related configuration:

    Configuration | Description
    ------------- | -----------
    http_hosts    | FE HTTP addresses. For example: ["http://fe1:8030", "http://fe2:8030"]
    user          | Username. This user needs import permission on the target Doris database and table.
    password      | Password
    db            | Database name
    table         | Table name
    label_prefix  | Import label prefix. The final label is {label_prefix}_{db}_{table}_{time_stamp}
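
As an illustration of the label pattern above, the following Python sketch assembles a label from its parts. The timestamp format is an assumption made for this example, since the document does not specify it, and `make_label` is a hypothetical helper rather than part of the plugin.

```python
from datetime import datetime


def make_label(label_prefix, db, table, now=None):
    """Build a label of the form {label_prefix}_{db}_{table}_{time_stamp}."""
    now = now or datetime.now()
    # Assumed timestamp format; the plugin's actual format may differ.
    time_stamp = now.strftime("%Y%m%d_%H%M%S")
    return f"{label_prefix}_{db}_{table}_{time_stamp}"


print(make_label("doris", "logstash_output_test", "output"))
```

Because the label identifies an import, a distinct label_prefix per pipeline helps keep labels from different sources apart.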
    Import-related configuration (see the Stream Load Manual for details):

    Configuration    | Description
    ---------------- | -----------
    column_separator | Column delimiter; default is \t.
    columns          | Mapping between the columns in the imported data and the columns in the table.
    where            | Filter condition for the import task.
    max_filter_ratio | Maximum proportion of rows the import task may filter out; default is 0 (zero tolerance).
    partition        | Partitions of the table to import into.
    timeout          | Timeout; default is 600s.
    strict_mode      | Strict mode; default is false.
    timezone         | Time zone used for this import; default is UTC+8 (East Eight zone).
    exec_mem_limit   | Import memory limit in bytes; default is 2GB.
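
To illustrate what max_filter_ratio means in practice, here is a small Python sketch. It assumes the ratio is computed as filtered rows divided by total rows, which is how the Stream Load manual describes it. The function name is hypothetical, and the real check happens inside Doris, not in the plugin.

```python
def load_would_be_rejected(filtered_rows, total_rows, max_filter_ratio=0.0):
    """Return True if the share of filtered rows exceeds the allowed ratio.

    Assumption: ratio = filtered_rows / total_rows.
    """
    if total_rows == 0:
        return False  # nothing to load, nothing to reject
    return filtered_rows / total_rows > max_filter_ratio


# With the default of 0, a single bad row fails the whole import.
print(load_would_be_rejected(1, 100))
# Allowing up to 5% filtered rows lets the same import through.
print(load_would_be_rejected(1, 100, max_filter_ratio=0.05))
```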
    Other configurations:

    Configuration     | Description
    ----------------- | -----------
    save_on_failure   | Whether to save the data locally if the import fails; default is true
    save_dir          | Local save directory; default is /tmp
    automatic_retries | Maximum number of retries on failure; default is 3
    batch_size        | Maximum number of events processed in each batch; default is 100,000
    idle_flush_time   | Maximum interval between flushes; default is 20 (seconds)
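
The interplay of batch_size and idle_flush_time can be sketched as a buffer that flushes when either limit is reached. This is a simplified Python illustration of the idea, not the plugin's implementation. The class `EventBuffer`, its `tick` method, and the injectable `clock` are all hypothetical names introduced for this example.

```python
import time


class EventBuffer:
    """Collect events and flush on size or on elapsed idle time."""

    def __init__(self, flush, batch_size=100_000, idle_flush_time=20,
                 clock=time.monotonic):
        self.flush = flush                      # callback receiving a list of events
        self.batch_size = batch_size            # flush once this many events are buffered
        self.idle_flush_time = idle_flush_time  # seconds between forced flushes
        self.clock = clock
        self.events = []
        self.last_flush = clock()

    def add(self, event):
        self.events.append(event)
        if len(self.events) >= self.batch_size:
            self._flush()

    def tick(self):
        """Call periodically; flushes pending events once the interval has elapsed."""
        if self.events and self.clock() - self.last_flush >= self.idle_flush_time:
            self._flush()

    def _flush(self):
        batch, self.events = self.events, []
        self.last_flush = self.clock()
        self.flush(batch)


sent = []
buf = EventBuffer(sent.append, batch_size=3, idle_flush_time=20)
for line in ["a", "b", "c", "d"]:
    buf.add(line)
print(sent)  # the first three events went out as one batch; "d" is still buffered
```

A larger batch_size means fewer, bigger imports, while idle_flush_time bounds how long a partially filled batch waits before it is sent.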

    Startup

    Run the following command to start Logstash with the Doris output plugin:
    {logstash-home}/bin/logstash -f {logstash-home}/config/logstash-doris.conf --config.reload.automatic

    Complete Usage Example

    1. Compile doris-output-plugin

    1. Download the Ruby source package from the official Ruby website. The version used here is 2.7.1.
    2. Compile and install Ruby, then configure its environment variables.
    3. Go to the extension/logstash/ directory of the Doris source code and run:
    gem build logstash-output-doris.gemspec
    This produces the file logstash-output-doris-0.1.0.gem, which completes the compilation.

    2. Install and configure filebeat

    Note: Filebeat is used as the input source here.
    1. Download the Filebeat tar package from the ES official website and extract it.
    2. Enter the filebeat directory and modify the configuration file filebeat.yml as follows:
    filebeat.inputs:
    - type: log
      paths:
        - /tmp/doris.data
    output.logstash:
      hosts: ["localhost:5044"]
    3. Start filebeat:
    ./filebeat -e -c filebeat.yml -d "publish"

    3. Install logstash and doris-output-plugin

    1. Download the Logstash tar package from the ES official website and extract it.
    2. Copy the logstash-output-doris-0.1.0.gem obtained in Step 1 to the logstash installation directory.
    3. Run the following command to install the plugin:
    ./bin/logstash-plugin install logstash-output-doris-0.1.0.gem
    4. Create a new configuration file in the config directory named logstash-doris.conf. The content is as follows:
    input {
      beats {
        port => "5044"
      }
    }

    output {
      doris {
        http_hosts => [ "http://127.0.0.1:8030" ]
        user => "doris"
        password => "doris"
        db => "logstash_output_test"
        table => "output"
        label_prefix => "doris"
        column_separator => ","
        columns => "a,b,c,d,e"
      }
    }
    Adjust these values for your environment according to the Configuration Instructions section.
    5. Start logstash:
    ./bin/logstash -f ./config/logstash-doris.conf --config.reload.automatic
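
To show how the column_separator and columns settings in this configuration relate to each line of input, here is a rough Python sketch. Doris does the actual parsing on the server side. The helper `map_line_to_columns` is hypothetical and only mimics the splitting and mapping, and it raises an error on a column-count mismatch purely to make the problem visible.

```python
def map_line_to_columns(line, columns="a,b,c,d,e", column_separator=","):
    """Split one input line and pair each value with its target column name."""
    names = columns.split(",")
    values = line.rstrip("\n").split(column_separator)
    if len(values) != len(names):
        # Illustrative behavior only: this sketch rejects mismatched rows.
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))


row = map_line_to_columns("1,2,3,4,5")
print(row)
```

Each line written to the file watched by Filebeat should therefore contain five comma-separated values, one for each of the columns a through e.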

    4. Test Features

    Append data to /tmp/doris.data:
    echo a,b,c,d,e >> /tmp/doris.data
    Watch the Logstash log. If the Status of the returned response is Success, the import succeeded, and you can query the imported data in the logstash_output_test.output table.
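
If you want to check the result programmatically, the response body can be parsed as JSON and its Status field inspected. The Python sketch below assumes the response is a JSON object. The sample body is a hypothetical, shortened example written for illustration and is not actual log output. The function name `import_succeeded` is also made up for this example.

```python
import json


def import_succeeded(response_body):
    """Return True if the Stream Load response reports Status == "Success"."""
    result = json.loads(response_body)
    return result.get("Status") == "Success"


# Hypothetical, shortened response body for illustration only.
sample = '{"Status": "Success", "Message": "OK"}'
print(import_succeeded(sample))
```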