Concurrent High-Performance Architecture
Last updated: 2024-12-02 16:35:34
Concurrency is the number of requests that a function can process at the same time. As long as the other services in your business can sustain the load, you can raise function concurrency from a handful to tens of thousands through simple configuration.

Use Cases

High QPS and short execution duration

A function can be used for simple data or file processing; for example, it can be triggered by COS to report information or process files. In such scenarios, the execution duration of a single request is short.

Computation-intensive long execution

A function can be used for audio/video transcoding, data processing, and AI-based inference. Because of operations such as model loading, function initialization and execution, as well as Java runtime environment initialization, take more time.

Async message processing

A function can be used for async message processing in diverse scenarios, such as Tencent Cloud's proprietary WYSIWYG recording and TDMQ function triggers. It connects the data on both ends of a message queue and helps implement async event decoupling and peak shifting in a serverless system.

Strengths

By using reserved quota and provisioned concurrency together, you can flexibly allocate resources among multiple functions and warm up functions as needed.

Shared quota

If nothing is configured, all functions share the account quota by default. If a function generates a surge of business invocations, it can make full use of the unused quota to ensure that the surge will not cause overrun errors.

Guaranteed concurrency

If the business handled by a specific function is sensitive or critical and you need to maximize its request success rate, you can use the reserved quota feature. Reserved quota gives the function an exclusive quota, which guarantees its concurrency and avoids overruns caused by other functions preempting concurrency.
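
The split between shared and reserved quota can be pictured with a small model. This is only an illustrative sketch: the function names and the example quota values are hypothetical, and it assumes that a reserved quota is both a guarantee and an upper bound for that function, while every function without a reservation competes for what is left of the account quota.

```python
def shared_pool(account_quota: int, reserved: dict[str, int]) -> int:
    """Concurrency left for unreserved functions after reservations are carved out."""
    total_reserved = sum(reserved.values())
    if total_reserved > account_quota:
        raise ValueError("reserved quota exceeds the account quota")
    return account_quota - total_reserved


def max_concurrency(function: str, account_quota: int, reserved: dict[str, int]) -> int:
    """Upper bound on one function's concurrency in this simplified model."""
    if function in reserved:
        # Assumption: the exclusive quota is guaranteed, but it is also a cap.
        return reserved[function]
    # Unreserved functions share (and compete for) the remaining pool.
    return shared_pool(account_quota, reserved)


if __name__ == "__main__":
    reservations = {"payment": 300}  # hypothetical function name and value
    print(max_concurrency("payment", 1000, reservations))
    print(max_concurrency("thumbnail", 1000, reservations))
```

In this model, reserving quota for a critical function protects it from a surge in another function, at the cost of shrinking the pool that the remaining functions can burst into.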

Provisioned concurrency

If a function is sensitive to cold start, the code initialization process takes a long time, or many libraries need to be loaded, then you can set the provisioned concurrency for a specific function version to start function instances in advance and ensure smooth execution.

How Concurrency Expansion Works

For more information on concurrent instance reuse and repossession and concurrency expansion, please see Concurrency Overview.

Samples

For example, the concurrency quota of an account in the Guangzhou region is 1,000 concurrent instances by default for a 128 MB function, and if many requests arrive, 500 concurrent instances can be started from 0 in the first minute. If there are still other requests to be processed, 500 more concurrent instances can be started to reach 1,000 instances in total in the second minute.
The following figure simulates the specific concurrent processing scenario of a function during business traffic peaks. As business requests constantly increase and there are no concurrent instances available to process new requests, the function will start new concurrent instances. When the expansion speed limit of elastic concurrency is reached, function expansion will gradually slow down, and new requests will be restricted and retried. Then, the function will continue expansion and eventually reach the account-level concurrency limit in the current region. Finally, after the business needs are satisfied, the number of requests will gradually decrease, and the unused concurrent instances of the function will gradually stop.
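
The ramp-up arithmetic in the example above (500 new instances per minute, capped at a 1,000-instance quota) can be written as a short calculation. This is a minimal sketch that uses only the numbers from the example. The function names are illustrative, and the model assumes a constant per-minute expansion rate, which simplifies the real scaling behavior.

```python
import math


def instances_available(minute: int, quota: int = 1000, per_minute: int = 500) -> int:
    """Instances that can have been started by the end of the given minute."""
    if minute < 0:
        raise ValueError("minute must be non-negative")
    # Expansion adds per_minute instances each minute until the quota is hit.
    return min(quota, per_minute * minute)


def minutes_to_reach(target: int, per_minute: int = 500) -> int:
    """Whole minutes needed to start `target` instances from zero."""
    return math.ceil(target / per_minute)


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, instances_available(m))  # 500, then 1000, then still 1000
    print(minutes_to_reach(1000))         # 2
```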


Provisioned concurrency can start concurrent instances in advance according to the configuration. SCF will not repossess these instances; instead, it will ensure as much as possible that a sufficient number of concurrent instances are available to process requests. You can use this feature to set the quota of provisioned concurrent instances for a specified function version, so as to prepare computing resources in advance and expedite cold start and initialization of the runtime environment and business code. The following figure simulates the actual provisioned concurrency conditions of a function when handling business traffic peaks.



Use Case-Based Stress Tests

Use case 1. High QPS and short execution duration

In this scenario, the QPS is high, the execution duration of a single request is short, and the business experiences a concurrency peak in one or two seconds after cold start. Next, you can carry out tests and observe whether gradually switching traffic or configuring provisioned concurrency can ease the cold start concurrency peak.

Business conditions

Business Information | Metric
Function initialization duration | The function doesn't require initialization
Business execution duration | 5 ms
QPS | Around 100,000 (peak)

Stress test task

We plan three stress test tasks: complete cold start, gradual traffic switch, and provisioned concurrency configuration. Each task must begin from a cold start, with no hot instances present and no interference from other functions.

Stress test goal

The business with a high QPS experiences a concurrency peak in one or two seconds after cold start, and gradually switching traffic or configuring provisioned concurrency can ease the cold start concurrency peak.

Stress test configuration

1. Function configuration
   a. Memory: 128 MB, with async execution, status tracking, and log delivery disabled
   b. Concurrency quota: 4,000 * 128 MB
   c. Duration: 5 ms
   d. Burst: 2,000
2. Testing tool
   Directly call the `RegionInvoke` API with the `go-wrk` tool.


Conclusion

In summary, in scenarios with high QPS and short execution duration, gradually switching traffic can ease the cold start concurrency peak, and configuring provisioned concurrency can address the problem of lengthy initialization (including cold function start).
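
One way to picture a gradual traffic switch is as a stepwise ramp toward the peak QPS rather than a single jump. The sketch below is hypothetical: the linear shape, the number of steps, and the function name are assumptions chosen for illustration, not the procedure used in this stress test.

```python
def ramp_schedule(peak_qps: int, steps: int) -> list[int]:
    """Target QPS for each step of a linear ramp that ends at peak_qps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # Integer arithmetic keeps every target exact; the last step equals peak_qps.
    return [peak_qps * i // steps for i in range(1, steps + 1)]


if __name__ == "__main__":
    # Ramp toward the 100,000 QPS peak of this use case in five equal steps.
    print(ramp_schedule(100_000, 5))
```

Each step gives the platform time to start instances before the next increase, so fewer requests arrive while no warm instance is available.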

Use case 2. Computation-intensive long execution

Business conditions

Business Information | Metric
Function initialization duration | 10 s
Business execution duration | 2 min
QPS | About 20

Stress test goal

The average QPS of the lengthy computing tasks is not high, but due to the lengthy computation process, a high number of instances are running, leading to a high function concurrency. This stress test scenario is designed to test the task scheduling and processing speeds of the function when processing a high number of lengthy execution tasks.
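
The link between a modest QPS and a high concurrency can be estimated with Little's law: average concurrency is roughly the arrival rate multiplied by the execution duration. The sketch below applies that rule of thumb to the figures given for use cases 1 and 2. It is a steady-state estimate only, it ignores initialization time and ramp-up, and the function name is illustrative.

```python
def estimated_concurrency(qps: float, duration_ms: float) -> float:
    """Little's law: requests in flight = arrival rate x time each one takes."""
    return qps * duration_ms / 1000


if __name__ == "__main__":
    # Use case 1: about 100,000 QPS at 5 ms per request.
    print(estimated_concurrency(100_000, 5))
    # Use case 2: about 20 QPS at 2 minutes (120,000 ms) per request.
    print(estimated_concurrency(20, 120_000))
```

Under this estimate, the long-running workload needs several times more concurrent instances than the high-QPS workload, even though its request rate is thousands of times lower.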

Stress test configuration

1. Function configuration
   a. Memory: 128 MB, with async execution and status tracking enabled but log delivery disabled
   b. Concurrency quota: 2,000 * 128 MB
   c. Duration: 2 min
   d. Burst: 2,000
2. Testing tool
   Use the ab tool to simulate messages in COS and invoke the function through a COS trigger.


Stress test task

Each stress test task must begin from a cold start, with no hot instances present and no interference from other functions.

Result analysis

The case with 4,000 messages can better demonstrate the conclusion drawn in the case of 2,000 messages. Because the previous function is not released in the first two seconds, the number of cold starts is the same as the number of function requests, and subsequently, the increase in the number of cold starts and the increase in the number of function requests are generally consistent.

Conclusion

Business involving a lengthy initialization and execution duration can run stably, and the system can scale instantly based on the number of requests.

Use case 3. Async message processing

Business background

SCF functions are widely used for async message processing. Here, the execution duration of 100 ms is used as the average value.
Business Information | Metric
Function initialization duration | 0
Business execution duration | 100 ms

Stress test goal

How fast async messages are consumed depends on how fast they are produced. Suppose a large backlog of messages has accumulated. Observe how the function consumes the backlog, including the retries that occur during consumption because of concurrency overruns. You can flexibly adjust the reserved concurrency of the function to control its consumption speed, and therefore the stress on the downstream backend.

Stress test configuration

1. Function configuration
   a. Memory: 128 MB, with async execution and status tracking enabled but log delivery disabled
   b. Concurrency quota: X * 128 MB
   c. Duration: 100 ms
   d. Burst: 2,000
2. Testing tool
   Use the ab tool to simulate messages in COS and invoke the function through a COS trigger.


Stress test task

Each stress test task needs to start from cold start with no hot instances present.

Result analysis

1. When the concurrency quota is increased from 1,000 to 2,000, it can be seen that the processing speed of the function, including function concurrency, increases significantly; therefore, when there are many async messages, increasing the concurrency quota can increase the message processing speed.
2. When the concurrency quota is increased from 2,000 to 4,000, the message processing speed and function concurrency increase further, but the overall processing time is below 2 minutes in both cases. This indicates that for 1,000,000 async messages, a quota of 4,000 does increase the message processing speed and function concurrency, but a quota of 2,000 basically meets the needs of this scenario.
3. The burst of 2,000 will be overrun when the concurrency quota is 4,000 and there are a high number of messages.
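
The effect of the quota on backlog processing time can be approximated with a back-of-the-envelope calculation. The sketch below is an idealized lower bound, not a reproduction of the test results: it assumes every instance is warm and busy for the whole run, and it ignores ramp-up, the burst limit, and retries. The function name is illustrative.

```python
def ideal_drain_time_s(messages: int, concurrency: int, duration_ms: float) -> float:
    """Seconds to process a backlog if all instances stay busy the whole time."""
    per_instance_rate = 1000 / duration_ms          # messages per second per instance
    total_rate = concurrency * per_instance_rate    # messages per second overall
    return messages / total_rate


if __name__ == "__main__":
    # 1,000,000 messages at 100 ms each, for the three quotas discussed above.
    for quota in (1000, 2000, 4000):
        print(quota, ideal_drain_time_s(1_000_000, quota, 100))
```

In this idealized model, doubling the quota halves the drain time. The measured times are longer because instances have to be started first and overrun requests are retried.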

Conclusion

In scenarios involving high numbers of async messages, increasing the function concurrency quota can significantly increase the message processing speed, which is in line with the expectations. You can flexibly control function concurrency so as to control the consumption speed of async messages.

Notes

Provisioned concurrency is available free of charge during the beta test. This feature is expected to be officially launched in November 2021. After the launch, a small fee will be charged while provisioned concurrent instances are idle, and fees will be charged based on the actual execution duration of requests while the instances process requests. For more information, please see Billing Overview.