tencent cloud

Feedback

Column Compression

Last updated: 2024-11-06 18:55:36

    Overview

    Currently, there are row format compression and data page compression. However, when some large fields and many small fields in a table are processed with frequent read-write access to small fields and infrequent access to large fields, these two compression methods will cause unnecessary computational resource waste during read-write access.
    The column compression feature can compress infrequently accessed large fields without compressing frequently accessed small fields. This not only reduces the storage space of the entire row but also improves read-write access efficiency.
    For example, in an employee table: create table employee(id int, age int, gender boolean, other varchar(1000), primary key (id)), when small fields id, age, and gender are accessed frequently and the large field other is accessed infrequently, the column other can be created as a compressed column. Generally, only read-write operations on the column other will trigger compression and decompression of the column, and access to other columns will not trigger compression and decompression of the column. This further reduces the size of row data storage, enabling faster access to frequently accessed small fields and further reducing storage space for infrequently accessed large fields.

    Supported Versions

    Kernel version: TXSQL 8.0 3.1.15 and later.
    Note:
    The column compression feature is enabled by default in kernel version TXSQL 8.0 3.1.15 and later.
    Kernel version TXSQL 5.7 does not support the column compression feature.

    Applicable Scenario

    When a table contains some large fields and many small fields, and the read-write operations for small fields are frequent while the access to large fields is infrequent, large fields can be set as compressed columns.

    Use Instructions

    Supported Data Types

    1. BLOB (including TINYBLOB, MEDIUMBLOB, and LONGBLOB).
    2. TEXT (including TINYTEXT, MEDIUMTEXT, and LONGTEXT).
    3. VARCHAR.
    4. VARBINARY.
    5. JSON.
    The syntax is as follows:
    CREATE TABLE t1(
    id INT PRIMARY KEY,
    b BLOB COMPRESSED
    );
    Or
    CREATE TABLE t1(
    id INT PRIMARY KEY,
    b BLOB COLUMN_FORMAT COMPRESSED
    );
    Compression threshold
    The compression threshold is controlled by the parameter innodb_min_column_compress_length, with a default value of 256. If the original size of a column exceeds this parameter value, it will be compressed. Otherwise, only a compression header will be added without actually compressing the data.
    The supported DDL syntax is as follows:
    create table statements:
    DDL
    Inherit Compression Attribute
    CREATE TABLE t2 LIKE t1;
    Yes
    CREATE TABLE t2 SELECT * FROM t1;
    No
    CREATE TABLE t2(a BLOB) SELECT * FROM t1;
    No
    alter table statements:
    DDL
    Description
    ALTER TABLE t1 MODIFY COLUMN a BLOB;
    Converts a compressed column to a non-compressed column.
    ALTER TABLE t1 MODIFY COLUMN a BLOB COMPRESSED;
    Converts a non-compressed column to a compressed column.

    Parameter Description

    Parameter Name
    Dynamic
    Type
    Default Value
    Value Range
    Description
    innodb_zlib_column_compression_level
    Yes
    UINT
    6
    [0-9]
    ZLIB compression. 0 represents no compression, 1 represents the fastest compression, and 9 represents the highest compression level. The compression speed decreases from 1 to 9, but the compression ratio increases.
    innodb_zstd_column_compression_level
    Yes
    UINT
    3
    [1-22]
    ZSTD compression. 1 represents the fastest compression and 22 represents the highest compression level. The compression speed decreases from 1 to 22, but the compression ratio increases.
    innodb_min_column_compress_length
    Yes
    UINT
    256
    [1, UINT_MAX32]
    Controls the compression threshold, in bytes. If the original length of a column is greater than or equal to this parameter value, it will be compressed. Otherwise, only a compression header will be added without actually compressing the data.

    Multiple Algorithms

    Three compression algorithms: ZLIB, LZ4, and ZSTD are supported. The algorithm can also be omitted, in which case the default algorithm will be ZLIB.
    The syntax is as follows:
    CREATE TABLE t1(
    id INT PRIMARY KEY,
    b BLOB COMPRESSED ALGORITHM = [ZLIB|LZ4|ZSTD]
    );
    Or
    CREATE TABLE t1(
    id INT PRIMARY KEY,
    b BLOB COLUMN_FORMAT COMPRESSED ALGORITHM = [ZLIB|LZ4|ZSTD]
    );

    Compression Algorithms and Compression Levels

    1. ZLIB: Currently, multiple compression levels are available for ZLIB. The parameter is innodb_zlib_column_compression_level, with the value range from 0 to 9. 0 represents no compression, 1 represents the fastest compression, and 9 represents the highest compression level. The default value is 6.
    2. LZ4: Consistent with MySQL's page compression, multiple compression levels are not supported for LZ4. When using the LZ4 compression algorithm, you should note that the maximum original length of LZ4 compression is 231-1, while the maximum length of LONGBLOB is 232-1. When the original length of the compressed data is greater than or equal to 231, ZLIB will be used implicitly for compression.
    3. ZSTD: ZSTD or ZStandard has three types of compression methods. In this context, column compression supports non-streaming normal compression without a dictionary, and it provides multiple compression levels. The parameter is innodb_zstd_column_compression_level, with a value range from 1 to 22. 1 represents the fastest compression and 22 represents the highest compression level. The default value is 3.
    Compression attribute display
    The default compression algorithm is displayed here: ALGORITHM = ZLIB.
    CREATE TABLE t2 (a VARCHAR(100) COMPRESSED) ENGINE=InnoDB;
    
    SHOW CREATE TABLE t2;

    Notes

    In terms of logical export, the create table statement will still include compression keywords. Therefore, it is supported internally in TDSQL-C for MySQL. For other MySQL branches and official versions:
    For official versions earlier than 8.0.22, it can be imported directly.
    For official versions later than or equal to 8.0.22, the compression keywords need to be removed after logical export.
    When other cloud or user data is exported using Data Transfer Service (DTS), there may be compatibility issues during the binlog synchronization process. In this case, DDL statements with compression keywords can be skipped.
    In terms of physical backup, since the fields are already compressed within InnoDB during backup, the version using this backup must also support column compression.
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support