| Technical Component | Version |
| --- | --- |
| Nginx | 1.22 |
| CLS | - |
| Java | OpenJDK version "1.8.0_232" |
| Scala | 2.11.12 |
| Flink SQL | Flink 1.14.5 |
| MySQL | 5.7 |
Connect to MySQL, then create the `flink_nginx` database and the `mysql_dest` table:

```bash
mysql -h 172.16.1.1 -uroot
```

```sql
create database if not exists flink_nginx;

use flink_nginx;

create table if not exists mysql_dest(
  ts timestamp,
  pv bigint,
  uv bigint
);
```
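Optionally, you can confirm that the objects were created before moving on. A quick check in the same MySQL session:

```sql
USE flink_nginx;
SHOW TABLES;          -- should list mysql_dest
DESCRIBE mysql_dest;  -- should show the ts, pv, and uv columns
```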
```bash
# Decompress the Flink binary package
tar -xf flink-1.14.5-bin-scala_2.11.tgz
cd flink-1.14.5

# Download Kafka dependencies
wget https://repo1.maven.org/maven2/org/apache/flink/flink-connector-kafka_2.11/1.14.5/flink-connector-kafka_2.11-1.14.5.jar
mv flink-connector-kafka_2.11-1.14.5.jar lib
wget https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/2.4.1/kafka-clients-2.4.1.jar
mv kafka-clients-2.4.1.jar lib

# Download MySQL dependencies
wget https://repo1.maven.org/maven2/org/apache/flink/flink-connector-jdbc_2.11/1.14.5/flink-connector-jdbc_2.11-1.14.5.jar
mv flink-connector-jdbc_2.11-1.14.5.jar lib
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.11/mysql-connector-java-8.0.11.jar
mv mysql-connector-java-8.0.11.jar lib
wget https://repo1.maven.org/maven2/org/apache/flink/flink-table-common/1.14.5/flink-table-common-1.14.5.jar
mv flink-table-common-1.14.5.jar lib

# Start Flink and open the SQL client
bin/start-cluster.sh
bin/sql-client.sh
```
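Before running the SQL job, it can help to verify that the dependency JARs landed in `lib/` and that the cluster came up. A minimal sanity check, assuming Flink's default REST port of 8081:

```bash
# Confirm the connector JARs are in place
ls lib | grep -E 'kafka|jdbc|mysql|table-common'

# Confirm the Flink cluster is running (the REST API/web UI listens on 8081 by default)
curl -s http://localhost:8081/overview
```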
```sql
-- Create a data source table to consume Kafka data
CREATE TABLE `nginx_source`
(
  `remote_user` STRING,          -- Field in the log: client name
  `time_local` STRING,           -- Field in the log: local time of the server
  `body_bytes_sent` BIGINT,      -- Field in the log: number of bytes sent to the client
  `http_x_forwarded_for` STRING, -- Field in the log: actual client IP when there is a proxy server on the frontend
  `remote_addr` STRING,          -- Field in the log: client IP
  `protocol` STRING,             -- Field in the log: protocol type
  `status` INT,                  -- Field in the log: HTTP request status code
  `url` STRING,                  -- Field in the log: URL
  `http_referer` STRING,         -- Field in the log: referer URL
  `http_user_agent` STRING,      -- Field in the log: client browser information
  `method` STRING,               -- Field in the log: HTTP request method
  `partition_id` BIGINT METADATA FROM 'partition' VIRTUAL,  -- Kafka partition
  `ts` AS PROCTIME()
) WITH (
  'connector' = 'kafka',
  'topic' = 'YourTopic',  -- Topic name provided in the CLS console for consumption over Kafka, such as `out-633a268c-XXXX-4a4c-XXXX-7a9a1a7baXXXX`
  'properties.bootstrap.servers' = 'kafkaconsumer-ap-guangzhou.cls.tencentcs.com:9096',  -- Service address provided in the CLS console for consumption over Kafka. The public network consumer address in the Guangzhou region is used as an example; enter your actual address.
  'properties.group.id' = 'kafka_flink',  -- Kafka consumer group name
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json',
  'json.fail-on-missing-field' = 'false',
  'json.ignore-parse-errors' = 'true',
  -- The username is the logset ID of the log topic, such as `ca5cXXXX-dd2e-4ac0-af12-92d4b677d2c6`, and the password is the string `secretid#secretkey`, such as `AKIDWrwkHYYHjvqhz1mHVS8YhXXXX#XXXXuXtymIXT0Lac`. Note that `#` is required. We recommend that you use a sub-account key and follow the principle of least privilege when authorizing the sub-account, that is, grant the minimum `action` and `resource` permissions in its access policy.
  'properties.sasl.jaas.config' = 'org.apache.kafka.common.security.plain.PlainLoginModule required username="your username" password="your password";',
  'properties.security.protocol' = 'SASL_PLAINTEXT',
  'properties.sasl.mechanism' = 'PLAIN'
);

-- Create the target table that writes to MySQL
CREATE TABLE `mysql_dest`
(
  `ts` TIMESTAMP,
  `pv` BIGINT,
  `uv` BIGINT
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://11.150.2.1:3306/flink_nginx?serverTimezone=Asia/Shanghai',  -- Note the time zone setting here
  'username' = 'username',  -- MySQL account
  'password' = 'password',  -- MySQL password
  'table-name' = 'mysql_dest'  -- MySQL table name
);

-- Query the Kafka data source table and write the computed result to the MySQL target table
INSERT INTO mysql_dest (ts, uv, pv)
SELECT TUMBLE_START(ts, INTERVAL '1' MINUTE) AS start_ts,
       COUNT(DISTINCT remote_addr) AS uv,
       COUNT(*) AS pv
FROM nginx_source
GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE);
```
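Once the `INSERT` job is running, the per-minute PV/UV aggregates should start appearing in MySQL after a window or two has closed. A quick way to confirm, from the MySQL session used in the preparation step:

```sql
-- Inspect the most recent windows written by the Flink job
SELECT * FROM flink_nginx.mysql_dest ORDER BY ts DESC LIMIT 10;
```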