Elasticsearch Service

Data Ingestion into ES
Last updated: 2024-12-02 10:02:26
ES allows access to your cluster through a private VIP within your VPC. You can write code to access your cluster through the Elasticsearch REST client and import your data into the cluster, or you can ingest your data through Elasticsearch's official components such as Logstash and Beats. This document takes Logstash and Beats as examples to describe how to connect data sources of different types to ES.

Preparations

You need to create a CVM instance or a Docker cluster in the same VPC as the ES cluster, because the ES cluster can only be accessed from within the VPC.
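Before configuring any component, you can verify from the CVM instance that the cluster is reachable. A minimal check, assuming the example private VIP 172.16.0.89 used later in this document (replace it with your own cluster's VIP, and add -u <user>:<password> if your cluster has authentication enabled):
curl http://172.16.0.89:9200/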

Using Logstash to Access ES Cluster

Accessing ES cluster from CVM

1. Install and deploy Logstash and Java 8.
wget https://artifacts.elastic.co/downloads/logstash/logstash-5.6.4.tar.gz
tar xvf logstash-5.6.4.tar.gz
yum install java-1.8.0-openjdk java-1.8.0-openjdk-devel -y
Note:
The Logstash version should be the same as the Elasticsearch version.
2. Customize the *.conf configuration file based on the data source type. For more information, please see Data Source Configuration File Description.
3. Run Logstash. You can validate the configuration first, as shown in the sketch after these steps.
nohup ./bin/logstash -f ~/*.conf >/dev/null 2>&1 &
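Before running Logstash in the background, you can check that the configuration file parses cleanly. A minimal sketch, assuming a configuration file named ~/nginx.conf (any of your *.conf files works the same way):
./bin/logstash -f ~/nginx.conf --config.test_and_exit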

Accessing ES cluster from Docker

Creating Docker cluster

1. Pull the official image of Logstash.
docker pull docker.elastic.co/logstash/logstash:5.6.9
2. Customize the *.conf configuration file based on the data source type and place it in the /usr/share/logstash/pipeline/ directory (this path can be customized).
3. Run Logstash.
docker run --rm -it -v ~/pipeline/:/usr/share/logstash/pipeline/ docker.elastic.co/logstash/logstash:5.6.9

Using TKE

Tencent Cloud Docker clusters run on CVM instances, so you need to create a CVM cluster in the TKE Console first.
1. Log in to the TKE Console and select Cluster > Create on the left sidebar to create a cluster.
2. Select Service on the left sidebar and click Create to create a service.
3. Select the official image of Logstash. In this example, the Logstash image provided by the TencentHub image registry is used. You can also build a Logstash image yourself.
4. Create a data volume to store the Logstash configuration file. In this example, a configuration file named logstash.conf is added to the /data/config directory on the CVM instance and mounted to the /data directory of the container, so that logstash.conf can be read when the container starts.
5. Configure the execution parameters (see the sketch after these steps).
6. Configure the service parameters and create the service as needed.
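With the data volume from step 4 mounted at /data, the execution parameters in step 5 only need to point Logstash at the mounted configuration file. A minimal sketch of the run command (the exact console field names may differ):
logstash -f /data/logstash.conf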



Configuration file description

File data sources

input {
  file {
    path => "/var/log/nginx/access.log" # File path
  }
}
filter {
}
output {
  elasticsearch {
    hosts => ["http://172.16.0.89:9200"] # Private VIP address and port of the ES cluster
    index => "nginx_access-%{+YYYY.MM.dd}" # Custom index name suffixed with the date. One index is generated per day
  }
}
For more information on connecting file data sources, please see File input plugin.
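Once Logstash is running, you can confirm that the daily indices are being created. A quick check, assuming the same example VIP as above:
curl "http://172.16.0.89:9200/_cat/indices/nginx_access-*?v"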

Kafka data sources

input {
  kafka {
    bootstrap_servers => "172.16.16.22:9092" # Kafka broker address. A comma-separated string, not an array
    client_id => "test"
    group_id => "test"
    auto_offset_reset => "latest" # Start consumption from the latest offset
    consumer_threads => 5
    decorate_events => true # Adds the current topic, offset, group, partition, and other information to the event
    topics => ["test1","test2"] # Array type. Multiple topics can be configured
    type => "test" # Data source identification field
  }
}

output {
  elasticsearch {
    hosts => ["http://172.16.0.89:9200"] # Private VIP address and port of the ES cluster
    index => "test_kafka"
  }
}
For more information on connecting Kafka data sources, please see Kafka input plugin.
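The type field set in the input above is commonly used to route events when one pipeline reads from several sources. A minimal sketch of conditional routing in the output section (the daily index suffix is an illustrative variant of the test_kafka index above):
output {
  if [type] == "test" {
    elasticsearch {
      hosts => ["http://172.16.0.89:9200"]
      index => "test_kafka-%{+YYYY.MM.dd}" # One index per day
    }
  }
}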

Database data sources connected with JDBC

input {
  jdbc {
    # MySQL database address
    jdbc_connection_string => "jdbc:mysql://172.16.32.14:3306/test"
    # Username and password
    jdbc_user => "root"
    jdbc_password => "Elastic123"
    # Driver JAR package. You need to download the JAR yourself when installing and deploying Logstash, as it is not provided by Logstash by default
    jdbc_driver_library => "/usr/local/services/logstash-5.6.4/lib/mysql-connector-java-5.1.40.jar"
    # Driver class name
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    # Path and name of the SQL file to be executed
    #statement_filepath => "test.sql"
    # SQL statement to be executed
    statement => "select * from test_es"
    # Monitoring interval in cron syntax. The fields (from left to right) are minute, hour, day of the month, month, and day of the week. "* * * * *" runs the statement once every minute
    schedule => "* * * * *"
    type => "jdbc"
  }
}

output {
  elasticsearch {
    hosts => ["http://172.16.0.30:9200"]
    index => "test_mysql"
    document_id => "%{id}"
  }
}
For more information on connecting JDBC data sources, please see JDBC input plugin.
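To avoid re-importing the entire table on every scheduled run, the JDBC input plugin can track a column value between runs and resume from it. A minimal sketch of the relevant options, assuming the test_es table has an auto-increment id column (combine these with the connection and driver settings above):
input {
  jdbc {
    # ... connection and driver settings as above ...
    use_column_value => true
    tracking_column => "id" # Column whose last value is persisted between runs
    statement => "select * from test_es where id > :sql_last_value"
    schedule => "* * * * *"
  }
}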

Using Beats to Access ES Cluster

Beats contains a variety of single-purpose collectors. These collectors are relatively lightweight and can be deployed and run on servers to collect data such as logs and monitoring information. Beats consumes fewer system resources than Logstash does. Beats includes Filebeat for collecting file-type data, Metricbeat for collecting monitoring metric data, Packetbeat for collecting network packet data, and more. You can also develop your own Beats components based on the official libbeat library as needed.

Accessing ES cluster from CVM

1. Install and deploy Filebeat.
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.6.4-linux-x86_64.tar.gz
tar xvf filebeat-5.6.4-linux-x86_64.tar.gz
2. Configure filebeat.yml. For the file's contents, see Configuration file description below.
3. Run Filebeat. You can validate the configuration first, as shown in the sketch after these steps.
nohup ./filebeat >/dev/null 2>&1 &
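Filebeat can validate its configuration before starting. A quick check (in the 5.x releases the flag is -configtest; later versions replaced it with the test config subcommand, and -e prints logs to stderr so errors are visible):
./filebeat -configtest -e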

Accessing ES cluster from Docker

Creating Docker cluster

1. Pull the official image of Filebeat.
docker pull docker.elastic.co/beats/filebeat:5.6.9
2. Customize the filebeat.yml configuration file and mount it to /usr/share/filebeat/filebeat.yml in the container (this path can be customized).
3. Run Filebeat.
docker run --rm -it -v ~/filebeat.yml:/usr/share/filebeat/filebeat.yml docker.elastic.co/beats/filebeat:5.6.9

Using TKE

The deployment method of Filebeat through TKE is similar to that of Logstash, and you can use the Filebeat image provided by Tencent Cloud.

Configuration file description

Configure the filebeat.yml file as follows:
# Input source configuration
filebeat.prospectors:
- input_type: log
  paths:
    - /usr/local/services/testlogs/*.log

# Output to ES
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["172.16.0.39:9200"]
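If the cluster has user authentication enabled, or you want an index name other than the default filebeat-%{+yyyy.MM.dd}, the same output section accepts a few more options. A hedged sketch with placeholder values:
output.elasticsearch:
  hosts: ["172.16.0.39:9200"]
  # Optional: credentials, if authentication is enabled on the cluster
  username: "elastic"
  password: "your_password"
  # Optional: custom index name (the default is filebeat-%{+yyyy.MM.dd})
  index: "testlogs-%{+yyyy.MM.dd}"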