Serverless Cloud Function (SCF) now supports Apache Kafka as the event trigger source, enabling batch consumption and processing of Kafka messages.
Apache Kafka Overview
Apache Kafka is an open-source event streaming platform designed for workloads such as data pipelines and stream processing. SCF enables integration of business functions with self-built Apache Kafka clusters. Supported clusters include cross-region CKafka clusters, Kafka clusters hosted on other cloud providers, and Kafka-compatible services such as Confluent Cloud and Azure Event Hubs.
SCF supports event sources based on the Kafka protocol framework, enabling batch consumption. Batch processing behavior can be controlled using parameters such as the maximum number of messages per batch, maximum waiting time, and retry attempts.
Features of Self-Built Apache Kafka Triggers
Self-built Apache Kafka triggers have the following features:
Pull model: The SCF backend module acts as the consumer, connecting to the Kafka instance to consume messages. Once the backend module retrieves messages, it encapsulates them into an event structure and calls the specified function, passing the message data to the function.
Synchronous call: Self-built Apache Kafka triggers use the synchronous call type to call functions. For more information on call types, see Call Types.
Note:
For execution errors, including user code and runtime environment errors, self-built Apache Kafka triggers will retry based on the configured retry attempts, with a default of 10,000 retries.
For system errors, self-built Apache Kafka triggers use an exponential backoff strategy to continuously retry until the operation succeeds.
Attributes of Self-Built Apache Kafka Triggers
Trigger name: It should contain 2 to 60 characters, consisting of a-z, A-Z, 0-9, -, and _. It should start with a letter and end with a letter or number. Multiple triggers with the same name are not allowed for one function.
Bootstrap Servers: It configures the connection addresses for the self-built Apache Kafka instances to be consumed. Multiple bootstrap servers are supported, in the format of either IP+port or Domain Name+port.
Topic: Enter the topic of the existing Apache Kafka instance.
Consumer Group: Select the consumer group of the existing Apache Kafka instance. If the specified consumer group does not exist, one will be automatically created. It is recommended to use a dedicated consumer group, separate from existing businesses, to avoid interfering with ongoing message consumption.
Security protocol: The security protocol used by the Apache Kafka instance. Currently supported protocols include PLAINTEXT, SASL_SSL, and SASL_PLAINTEXT.
Identity verification mechanism: The authentication mechanism used by the Apache Kafka instance. Currently supported options include None, PLAIN, SCRAM-SHA-256, and SCRAM-SHA-512. If your instance does not require authentication, select None.
Username and Password: If an authentication mechanism is selected, you should provide the username and password authorized to access the instance.
Maximum messages: The maximum number of messages to be pulled and delivered to SCF in a single batch, with a current maximum configuration of 10,000. Due to factors such as message size and write speed, the actual number of messages delivered during each trigger may not always reach the maximum value, but will vary between 1 and the specified maximum batch size.
Consumption start point: The starting point for message consumption by the trigger. Currently, it supports consuming messages from the latest position.
Retry attempts: The maximum number of retries when the function encounters execution errors (including user code errors and runtime errors).
Max waiting time: The longest time a single trigger waits while collecting a batch. For example, if you configure the maximum batch size as 1,000 messages and the maximum waiting time as 60 seconds, and 1,000 messages are collected within 10 seconds, the function is triggered immediately without waiting for the full 60 seconds. If only 50 messages have been collected after 60 seconds, the function is still triggered with those 50 messages.
Note:
Currently, for existing self-built Apache Kafka triggers, only the following three configuration items can be edited: Maximum messages, Retry attempts, and Max waiting time.
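For reference, the attributes above can be summarized as in the following sketch, written as a Python dict. The key names are purely illustrative assumptions and do not correspond to the actual console fields or API parameters.

```python
# Purely illustrative summary of the trigger attributes described above.
# Key names are hypothetical and do NOT match the real console/API schema.
trigger_config = {
    "trigger_name": "kafka-trigger-demo",                # 2-60 chars: a-z, A-Z, 0-9, -, _
    "bootstrap_servers": "10.0.0.1:9092,10.0.0.2:9092",  # IP+port or Domain Name+port
    "topic": "my-topic",
    "consumer_group": "scf-dedicated-group",             # use a dedicated consumer group
    "security_protocol": "SASL_PLAINTEXT",               # PLAINTEXT / SASL_SSL / SASL_PLAINTEXT
    "auth_mechanism": "PLAIN",                            # None / PLAIN / SCRAM-SHA-256 / SCRAM-SHA-512
    "username": "kafka-user",
    "password": "********",
    "max_messages": 1000,                                 # up to 10,000 per batch
    "start_point": "latest",                              # only the latest position is supported
    "retry_attempts": 10000,
    "max_wait_seconds": 60,
}
```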
Self-Built Apache Kafka Consumption and Message Delivery
Since self-built Apache Kafka does not push messages, consumers must pull messages for consumption. Therefore, after the self-built Apache Kafka trigger is configured, the SCF backend activates the self-built Apache Kafka consumer module to act as the consumer. It also creates an independent consumer group within the self-built Apache Kafka instance for message consumption.
After the SCF backend consumer module consumes the messages, it will combine information such as the Timeout, Accumulated message size, and Maximum messages to form an event structure and initiate a function call (synchronous call). The related limitations are as follows:
Timeout: The SCF backend consumer module currently waits at most 60 seconds per batch, so that consumption is not delayed indefinitely. For example, if few messages are written to the topic and the consumer module has not accumulated enough messages to reach the maximum batch size within 60 seconds, it still initiates the function call with whatever has been collected.
Event size limit for synchronous calls: 6 MB. For details, see Quota Limits. If the messages in the topic are large, for example, if a single message already reaches 6 MB, then due to the 6 MB limit for synchronous calls, the event structure passed to the function will contain only one message instead of the maximum number of messages configured by the user.
Maximum batch size: This attribute is the same as the one configured on the self-built Apache Kafka trigger and is set by the user. The current maximum supported configuration is 10,000.
The SCF backend consumer module will loop through this process and ensure the order of message consumption. This means that the next batch of messages will not be consumed until the previous batch has been fully processed (synchronous call).
Note:
During this process, the number of messages in each batch may vary, meaning the number of messages in each event structure will be between 1 and the configured maximum batch size. If the configured maximum batch size is set too high, it is possible that the number of messages in the event structure will never reach the maximum batch size.
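To make the collect-then-deliver loop described above concrete, the following is a simplified sketch written with the kafka-python library. It is an illustration only, not SCF's actual backend implementation; the topic, broker address, consumer group name, and the invoke_function() placeholder are all assumptions.

```python
# Simplified sketch of a pull-model consumer loop: accumulate messages until
# the maximum batch size or the maximum waiting time is reached, then deliver
# the whole batch in one synchronous call. Illustration only (kafka-python).
import time
from kafka import KafkaConsumer

MAX_BATCH_SIZE = 1000     # "Maximum messages" trigger attribute (up to 10,000)
MAX_WAIT_SECONDS = 60     # "Max waiting time" trigger attribute

def invoke_function(batch):
    # Hypothetical placeholder for the synchronous function invocation;
    # the next batch is not consumed until this call returns.
    print(f"delivering {len(batch)} message(s)")

consumer = KafkaConsumer(
    "my-topic",                             # configured Topic
    bootstrap_servers="10.0.0.1:9092",      # configured Bootstrap Servers
    group_id="scf-dedicated-group",         # configured Consumer Group
    auto_offset_reset="latest",             # consume from the latest position
    enable_auto_commit=False,
)

while True:
    batch, deadline = [], time.time() + MAX_WAIT_SECONDS
    while len(batch) < MAX_BATCH_SIZE and time.time() < deadline:
        polled = consumer.poll(timeout_ms=1000,
                               max_records=MAX_BATCH_SIZE - len(batch))
        for records in polled.values():
            batch.extend(records)
        # The real module also stops early once the accumulated size
        # approaches the 6 MB synchronous-call limit (omitted here).
    if batch:
        invoke_function(batch)   # synchronous call; preserves batch order
        consumer.commit()        # commit offsets only after delivery succeeds
```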
After receiving the event content in SCF, you can choose to process the messages in a loop to ensure that each message is handled. You should not assume that the number of messages passed in each event is constant.
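As a minimal sketch of such a loop, the handler below iterates over however many messages arrive in the event. The field names used here (Records, msgBody) are assumptions modeled on a typical Kafka trigger event; inspect the event your trigger actually delivers and adjust the keys accordingly.

```python
# Minimal handler sketch: process every message in the incoming batch rather
# than assuming a fixed batch size. Field names are illustrative assumptions.
def main_handler(event, context):
    records = event.get("Records", [])
    for record in records:
        body = record.get("msgBody", "")   # assumed field carrying the payload
        process(body)
    return "processed %d message(s)" % len(records)

def process(message):
    # Replace with your business logic, e.g. parse the message and store it.
    print(message)
```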
SCF will use the standard Kafka protocol to retrieve the number of partitions for the specified topic. The backend consumer module will automatically create the same number of consumers. If the partition count cannot be obtained, 20 consumers will be created by default.
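If you want to verify the partition count yourself, an equivalent lookup can be made over the standard Kafka protocol, for example with kafka-python (the broker address and topic name below are placeholders):

```python
# Illustrative check of a topic's partition count using kafka-python.
from kafka import KafkaConsumer

consumer = KafkaConsumer(bootstrap_servers="10.0.0.1:9092")
partitions = consumer.partitions_for_topic("my-topic")
print(len(partitions) if partitions else "topic not found")
consumer.close()
```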
FAQs
How to Handle a Large Accumulation of Messages in a Self-Built Apache Kafka Instance?
After you configure the self-built Apache Kafka trigger, the SCF backend will activate the consumer module as the consumer, creating an independent consumer group in the self-built Apache Kafka for message consumption. The number of consumer modules will equal the number of partitions in the topic. If there is a large accumulation of messages, the consumption capacity needs to be increased. The following methods can be used to enhance consumption capacity:
Optimize the function's execution time. The shorter the execution time, the higher the consumption capacity. If the execution time increases (for example, if the function needs to write to a database and the database response slows down), the consumption speed will decrease.
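As one hedged example of shortening execution time, the sketch below writes the entire batch to storage in a single operation instead of issuing one write per message. bulk_insert() is a hypothetical placeholder for your database client's batch API, and the event field names are assumptions (see the handler sketch above).

```python
# Sketch: reduce per-invocation execution time by batching storage writes.
# bulk_insert() is a hypothetical placeholder; field names are assumptions.
def main_handler(event, context):
    rows = [r.get("msgBody", "") for r in event.get("Records", [])]
    bulk_insert(rows)   # one round trip instead of len(rows) round trips
    return "ok"

def bulk_insert(rows):
    # Placeholder: e.g. executemany() with a SQL client or a bulk-write API.
    pass
```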