The Serverless service provides elastic anti-jitter capabilities that help it adapt to different workloads and performance requirements. This document describes these capabilities for TDSQL-C for MySQL instances in a Serverless cluster.
Background
In the InnoDB storage engine of TDSQL-C for MySQL, the Buffer Pool is a critical memory area that caches data and indexes. A query first searches the Buffer Pool; if the data is cached there, InnoDB can return the result immediately, avoiding the overhead of disk I/O. Correctly sizing the Buffer Pool is therefore particularly important for database performance.

The Serverless architecture decouples computing resources from the purchased instance specification: the instance can be automatically started, stopped, and scaled based on the actual load of the user database, achieving extreme elasticity in computing resources. The computing resources are measured in CCUs (CPU + memory). CPU can be limited by technologies such as cgroups or Docker, while memory is allocated to the database process, with the majority used by the Buffer Pool to cache user data. Allocating and releasing Buffer Pool memory involves distributing and relocating user data and acquiring mutually exclusive global resources in the kernel, so correctly sizing the Buffer Pool enables the TDSQL-C for MySQL Serverless service to provide more stable elastic service capabilities.
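As a rough illustration, the benefit of a well-sized Buffer Pool shows up in the standard cache-hit counters (stock MySQL status variables, not anything TDSQL-C-specific):

```sql
-- A quick gauge of Buffer Pool effectiveness: logical reads vs. reads that
-- had to go to disk.
-- Hit ratio = 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```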
MySQL officially supports dynamic configuration of the Buffer Pool size by adjusting the parameter innodb_buffer_pool_size directly. The resize task completes in the background; if the parameter is changed again before the task completes, the new change is ignored. The scale-out logic is relatively simple, while the scale-in logic is more complex and is where bottlenecks tend to occur, such as I/O bottlenecks, free/LRU list mutex bottlenecks, and global lock bottlenecks. A series of kernel-level optimizations has therefore been made for the TDSQL-C for MySQL Serverless service, enabling the database to scale elastically with greater stability.
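In stock MySQL, a resize request and its background progress look like this (the size below is an example; the value is rounded to a multiple of innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances):

```sql
-- Request a new Buffer Pool size (here 4 GiB); the resize runs in the background.
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;

-- Poll the progress of the background resize task.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_resize_status';
```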
Bottleneck Analysis and Optimization Scheme
I/O Bottleneck
During testing with official MySQL 8.0, the kernel team found that the main scale-in bottleneck is flushing the LRU list: in most scenarios the first scan of the free list cannot satisfy the reclamation requirement, and the scan depth is derived from the number of blocks to be reclaimed, which can be quite large. buf_flush_do_batch requires the page cleaner and page persistence, during which the LRU mutex is frequently acquired and released, competing with user threads and producing glitches. Page persistence involves I/O operations, which is the main bottleneck. In the TDSQL-C for MySQL architecture this issue can be avoided entirely, because pages in the distributed storage are generated asynchronously by applying redo logs at the storage layer; the compute node does not need the page cleaner and can simply discard pages that are to be evicted. The product architecture is shown in the figure below.
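In stock MySQL, the LRU batch flushing described above can be observed through the InnoDB metrics counters; counter names vary slightly across versions, so treat this as a sketch:

```sql
-- Enable the LRU-related counters (many InnoDB metrics are off by default).
SET GLOBAL innodb_monitor_enable = 'buffer_LRU%';

-- Pages scanned, flushed, and evicted by LRU batches.
SELECT NAME, `COUNT`
FROM information_schema.INNODB_METRICS
WHERE NAME LIKE 'buffer_LRU_batch%';
```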
Free/lru List Mutex Bottleneck
In the main scale-in process, each loop traverses the free list and the LRU list while holding the corresponding mutex. During this time, user threads performing reads and writes may need to acquire the free/LRU list mutex and are blocked. All blocks in the Buffer Pool are maintained on these two linked lists, so the traversal has O(N) time complexity, where N is the number of blocks. N can be very large, and holding the mutex continuously for that long causes glitches for users.
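Contention on these mutexes can be seen in performance_schema's wait-event summaries when mutex instrumentation is enabled; exact instrument names differ across versions and builds, so the LIKE pattern below is an assumption:

```sql
-- Cumulative wait counts and wait time for InnoDB buffer-pool-related
-- synchronization objects (requires wait/synch instruments to be enabled).
SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/synch/%/innodb/%buf%'
ORDER BY SUM_TIMER_WAIT DESC;
```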
Optimization Scheme
The optimization is to traverse, by address, only the blocks in the chunks to be reclaimed. The number of blocks traversed then depends on the size of the scale-in rather than on the size of the entire Buffer Pool, and the locking granularity shrinks from the entire LRU list to a single block, reducing both the scope and the duration of lock holding.
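The chunk granularity exists in stock MySQL as well: the Buffer Pool is allocated in chunks, and a resize only adds or withdraws whole chunks. The relevant standard variables can be inspected as follows; the traversal-by-address logic itself is kernel-internal and not visible through SQL:

```sql
-- The pool is carved into chunks of innodb_buffer_pool_chunk_size (fixed at
-- startup); scaling in only needs to touch the chunks being withdrawn.
SELECT @@innodb_buffer_pool_chunk_size DIV (1024 * 1024) AS chunk_mb,
       @@innodb_buffer_pool_size       DIV (1024 * 1024) AS pool_mb,
       @@innodb_buffer_pool_instances  AS instances;
```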
Global Lock Bottleneck
Both scale-out and scale-in operations contain logic that requires acquiring the global lock of the Buffer Pool, during which the Buffer Pool is almost unavailable to users. If this step takes too long, users experience a temporary performance drop, known as a "glitch". Analysis identified three main time-consuming stages in this process (a monitoring sketch follows the list):
1. Reclaiming chunk memory and freeing block mutexes.
2. Allocating chunk memory and initializing blocks.
3. Resizing the hash tables (Resize Hash).
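During a resize, stock MySQL reports which stage it is in through a status variable, which is one way to see where time is being spent:

```sql
-- Poll while a resize is running; the status text walks through the stages
-- (withdrawing blocks, latching the whole buffer pool, resizing hash tables).
SELECT VARIABLE_VALUE
FROM performance_schema.global_status
WHERE VARIABLE_NAME = 'Innodb_buffer_pool_resize_status';
```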
Optimization Scheme
For the first two stages, the kernel team delays the release of chunks and pre-allocates chunks in advance, so that the main work is done outside the Buffer Pool mutex. The original complexity is O(N) with N equal to the number of blocks, because blocks must be initialized and their mutexes freed; after optimization, N in O(N) is the number of chunks, which keeps the cost controllable. The Resize Hash stage is essentially a rehash problem. The fundamental solution is algorithmic, such as a lock-free hash or consistent hashing, but the hash table is a fundamental structure in InnoDB, so changing it would carry high risk and a long development cycle. If the hash table is too large, space is wasted and many cells go unused; if it is too small, hash collisions multiply and performance suffers. The kernel team therefore trades space for time: the trigger frequency is made configurable, so that Resize Hash is triggered less often, or not at all, within a certain scaling range.
Optimization Effect
In the test, sysbench oltp_read_only was run with long_query_time = 0.1 and the number of slow queries was recorded. The before/after comparison is shown below.
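The slow-query count can be captured with standard MySQL variables; a sketch of the measurement setup (the sysbench run itself is external to this snippet):

```sql
-- Treat queries slower than 100 ms as slow. The global setting applies to
-- connections opened after the change.
SET GLOBAL long_query_time = 0.1;

-- Cumulative counter of slow queries; snapshot it before and after a resize.
-- It is incremented whether or not the slow query log is enabled.
SHOW GLOBAL STATUS LIKE 'Slow_queries';
```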
Use Instructions
For the bottlenecks and optimizations above, in addition to the optimizations applied directly by the kernel team, a parameter is provided for the Resize Hash stage that lets users choose whether the Hash Table is resized during scaling. Since resizing the Hash Table is the main cause of glitches in this stage, glitches can be prevented by not updating the Hash Table size. The parameter is described in detail below.
| Parameter | Modifiable | Restart Required | Default Value | Minimum Value | Maximum Value | Description |
| --- | --- | --- | --- | --- | --- | --- |
| innodb_ncdb_decrease_buffer_pool_hash_factor | Yes | No | 2 | 0 | 2 | Controls how often the Buffer Pool resize thread reduces the Hash Table size. A value of 2 is the disabled state: the Hash Table size is updated when the Buffer Pool shrinks by more than half. A value of 0 is the enabled state: the Hash Table size is not updated. Note: When this parameter is enabled, changing the upper limit of node computing power will restart the instance. |
Kernel Versions That Support This Parameter
TXSQL 5.7: kernel version 2.1.12 and later.
TXSQL 8.0: kernel version 3.1.14 and later.
Parameter Setting Method
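A minimal sketch, assuming the parameter is exposed as a normal system variable on supported TXSQL kernels; in practice it is typically changed through the console's parameter settings page rather than by a direct SET:

```sql
-- Check whether the kernel exposes the parameter and what it is set to.
SHOW VARIABLES LIKE 'innodb_ncdb_decrease_buffer_pool_hash_factor';

-- Hypothetical direct change: 0 enables the anti-glitch behavior (the Hash
-- Table size is never updated); 2 restores the default behavior.
SET GLOBAL innodb_ncdb_decrease_buffer_pool_hash_factor = 0;
```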