The dynamic provisioned concurrency metric is an elastic policy for provisioned concurrency. SCF periodically collects the actual number of concurrent function executions and scales the provisioned concurrency based on three configured metrics: maximum concurrency, minimum concurrency, and target concurrency usage. This keeps the number of provisioned concurrent function instances close to actual resource usage, improves the utilization of provisioned concurrent instances, and reduces the fees incurred by idle resources. If a function actually requires more concurrent instances than the configured dynamic provisioned concurrency metric allows, auto scaling is performed as needed.
When the dynamic provisioned concurrency metric is configured, scaling will be performed according to the configured dynamic policy. If the metrics of minimum concurrency, maximum concurrency, and concurrency usage are set, the system will guarantee the minimum concurrency of provisioned resources, and the provisioned concurrency will be dynamically scaled between the minimum and maximum values.
Scaling policy
Concurrency expansion: When the actual number of business requests increases and crosses the expansion threshold, the system expands the provisioned concurrency until the maximum concurrency is reached. Requests beyond that are served by expanding concurrency as needed.
Concurrency expansion frequency: Concurrency expansion will be performed once every ten seconds, without a time window.
Concurrency reduction: The system will reduce the concurrency when the actual number of business requests drops and triggers the threshold for concurrency reduction until the minimum concurrency is reached.
Concurrency reduction frequency: A time window of ten minutes makes concurrency reduction relatively conservative. Reduction operations are not repeated within the window, which works like a cooldown period. If no reduction has been performed within the window, a reduction operation can be performed at the next ten-second check.
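The expansion and reduction frequencies above can be sketched as a small gate that runs on every ten-second tick. This is a minimal illustration, not SCF's implementation: `ScalingClock` and `decide` are hypothetical names, and the target value is assumed to be computed elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

EXPAND_INTERVAL_S = 10        # scaling is evaluated once every ten seconds
REDUCE_COOLDOWN_S = 10 * 60   # ten-minute window between reductions


@dataclass
class ScalingClock:
    """Remembers when the last reduction happened (hypothetical helper)."""
    last_reduction_at: Optional[float] = None

    def may_reduce(self, now: float) -> bool:
        # No earlier reduction means there is nothing to cool down from.
        if self.last_reduction_at is None:
            return True
        return now - self.last_reduction_at >= REDUCE_COOLDOWN_S

    def record_reduction(self, now: float) -> None:
        self.last_reduction_at = now


def decide(current: int, target: int, clock: ScalingClock, now: float) -> int:
    """Return the provisioned concurrency to apply at this tick."""
    if target > current:
        return target  # expansion has no time window
    if target < current and clock.may_reduce(now):
        clock.record_reduction(now)
        return target
    return current  # either on target or still inside the cooldown
```

With this gate, a reduction requested ten seconds after a previous one is held back, while an expansion requested at the same moment is applied immediately.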
Target provisioned concurrency value
The target provisioned concurrency value is jointly determined by the metrics of current concurrency and the target concurrency usage.
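The exact formula is not given here. As an illustration only, the sketch below assumes a common target-tracking rule: divide the current concurrency by the target concurrency usage and round up. Both the rule and the function name `target_provisioned` are assumptions, not documented SCF behavior.

```python
import math


def target_provisioned(current_concurrency: int, target_usage: float) -> int:
    """Estimate the provisioned concurrency needed to hit the target usage.

    Assumed rule (not confirmed by the documentation):
        target = ceil(current_concurrency / target_usage)
    """
    if current_concurrency < 0:
        raise ValueError("current_concurrency must be non-negative")
    if not 0 < target_usage < 1:
        # Zero is excluded to avoid dividing by zero.
        raise ValueError("target_usage must be between 0 and 1 (exclusive)")
    return math.ceil(current_concurrency / target_usage)
```

Under this assumption, 60 concurrent requests with a target usage of 0.5 give a target provisioned concurrency value of 120.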
Concurrency usage
The concurrency usage of a function refers to the ratio of the number of concurrent requests being responded to by the current function instances to the current total number of function instances. Its value range is [0,1).
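The ratio described above can be written out as a short function. This is a sketch with hypothetical parameter names; it returns 0.0 when there are no instances, which is a choice made here to avoid division by zero rather than something the documentation specifies.

```python
def concurrency_usage(busy_requests: int, total_instances: int) -> float:
    """Ratio of concurrent requests being served to total function instances."""
    if busy_requests < 0 or total_instances < 0:
        raise ValueError("counts must be non-negative")
    if total_instances == 0:
        return 0.0  # assumption: no instances means no usage
    return busy_requests / total_instances
```

For example, 3 concurrent requests served by 4 instances give a usage of 0.75.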
Minimum concurrency
The minimum concurrency refers to the minimum required number of provisioned concurrent instances of a function, i.e., the lower limit for concurrency reduction.
Maximum concurrency
The maximum concurrency refers to the maximum number of provisioned concurrent instances of a function, i.e., the upper limit for concurrency expansion.
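Taken together, the minimum and maximum act as bounds on whatever provisioned value the policy would otherwise choose. The clamp below is a minimal sketch of that idea; `clamp_provisioned` is a hypothetical name.

```python
def clamp_provisioned(desired: int, min_concurrency: int, max_concurrency: int) -> int:
    """Keep the provisioned concurrency within the configured bounds."""
    if min_concurrency > max_concurrency:
        raise ValueError("min_concurrency must not exceed max_concurrency")
    # Never go below the lower limit or above the upper limit.
    return max(min_concurrency, min(desired, max_concurrency))
```

With a minimum of 10 and a maximum of 50, a desired value of 80 is capped at 50. Demand above the cap is not provisioned and is handled by scaling as needed.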
When updating the dynamic provisioned concurrency metric, you can modify Provisioned concurrency type, Min concurrency, Max concurrency, and Concurrency usage metric.
Note: The provisioned concurrency type can be either basic provisioned concurrency or dynamic provisioned concurrency metric. After the provisioned concurrency type is updated, the previously set type becomes invalid.