tencent cloud

Feedback

Terminal SDK Feature Introduction and Access

Last updated: 2024-10-15 12:06:40
    The terminal SDK is a suite of audio and video terminal product capabilities launched by Tencent Cloud. It encompasses three types of SDKs for video encoding, audio enhancement, and video enhancement. Tailored to meet diverse customer needs, it supports access from multiple terminals such as mobile, web, and PC.
    
    
    

    Terminal Video Codec SDK

    Tencent's Top Speed Codec (TSC) terminal video encoder is designed for scenarios requiring low computational power, low latency, and high-quality image on the terminal side. Compared with hardware encoding, its advantages include:
    1. Stable, reliable, and quick to start.
    2. At the same quality level, it saves bitrate, enhances transmission stability, reduces downlink distribution bandwidth, and saves on storage costs.
    3. At the same bitrate, it improves image quality and enhances user experience.
    4. A rich set of features meets diverse business needs, such as using Regions of Interest (ROI) encoding to improve the image quality in the face region and dynamically adjusting encoding configuration to adapt to network fluctuations.

    Terminal Audio SDK

    The client audio SDK provides audio encoding and enhancement capabilities. It achieves effects including adaptive noise suppression, acoustic echo cancellation, and automatic gain control, significantly improving audio quality by eliminating echo and noise.
    For details, see TSC Terminal Audio SDK.

    Terminal Enhancement SDK

    The client enhancement SDK, based on efficient image processing algorithms and AI model inference capabilities, achieves terminal video super-resolution, image quality enhancement, frame interpolation, and other features.

    TSC Terminal Video Codec SDK

    Product Overview

    Compared with Video on Demand (VOD) and Cloud Streaming Services (CSS) encoding, terminal-side encoding requires different solutions.
    Encoding Mode
    VOD
    CSS
    Terminal-side Codec
    Typical Business
    WeTV, video account, and other mainstream on-demand services
    Video account live streaming, Tencent sports live streaming, and other mainstream live streaming services
    VooV Meeting, WeChat video call, and 5G remote control services
    Latency Requirements
    Pursues an extreme compression rate, with no latency requirements.
    Pursues a high compression rate, allowing second-level latency.
    Pursues a high compression rate while requiring zero latency.
    Real-Time Requirements
    Pursues an extreme compression rate, with no real-time requirements.
    Allows multi-frame average real-time under multi-threading.
    Requires real-time encoding under single-threading.
    Network Condition Constraints
    Encoding process is unrelated to network status, with fixed encoding configuration.
    Encoding process is unrelated to network status, with fixed encoding configuration.
    Encoding process is strongly related to network status, requiring dynamic adjustment of encoding configuration based on network status.
    Scenario Characteristics
    1 -> N, no interaction
    1 -> N, no interaction
    N <-> N, strong interaction
    Solution
    Server-side encoding
    Server-side encoding
    Terminal-side encoding
    Tencent's Top Speed Codec (TSC) terminal video encoder is designed for scenarios requiring low computational power, low latency, and high-quality image on the terminal side. Compared with hardware encoding, its advantages include:
    1. Stable, reliable, and quick to start.
    2. At the same quality level, it saves bitrate, enhances transmission stability, reduces downlink distribution bandwidth, and saves on storage costs.
    3. At the same bitrate, it improves image quality and enhances user experience.
    4. A rich set of features meets diverse business needs, such as using Regions of Interest (ROI) encoding to improve the image quality in the face region and dynamically adjusting encoding configuration to adapt to network fluctuations.

    SDK Access Process

    
    
    
    1. Evaluation and Trial: Customers provide system platform and demand information, and apply for product trial.
    System platform: Android, iOS, Windows, macOS, etc.
    Use cases: live streaming, VOD
    Encoding specification: encoding format (H.264, H.265), resolution, frame rate, bitrate, latency requirements, etc.
    Optimization objectives: bitrate savings, image quality enhancement, CPU savings, and respective assessment metrics (PSNR, SSIM, VMAF, etc)
    2. Development and Integration: Integrate the beta version of the SDK into the app, for performance evaluation and custom optimization.
    Based on customer effect evaluation results and specific business scenario needs, provide in-depth optimization support.
    3. Launch and Release: Apply for a license, integrate the official version of the SDK with license authorization, and test and launch the app.
    If the license is about to expire or has expired, you can apply for a license renewal.

    SDK Integration

    The video codec SDK is implemented in C/C++/Assembly, providing a unified C interface for various system platforms.

    Android

    ● Provides ARMv7 and ARMv8 version dynamic libraries, and the application is integrated via NDK.
    ● Provides Java interface encapsulation. The interface is basically consistent with Android's hardware encoding MediaCodec, facilitating parallel replacement of MediaCodec.

    iOS

    Provides ARMv8 and x86_64 version XCFramework.

    macOS

    Provides ARMv8 and x86_64 version framework.

    Windows

    Provides x86 and x86_64 version dynamic libraries.

    Basic Video Encoding Process

    
    
    

    TSC Terminal Audio SDK

    Product Overview

    The client audio SDK provides audio encoding and enhancement capabilities, significantly improving audio quality by eliminating echo and noise.
    Details of features for each edition are as follows:
    Feature Point
    Standard Edition
    Professional Edition
    Premium Edition
    Acoustic Echo Cancellation
    Supported
    Supported
    Supported
    Automatic Gain Control
    Supported
    Supported
    Supported
    Adaptive Noise Suppression
    Supported
    Supported
    Supported
    Echo Cancellation Music Mode
    -
    Supported
    Supported
    Volume Equalization
    -
    Supported
    Supported
    AI Intelligent Noise Reduction
    -
    Supported
    Supported
    Audio Encoding
    -
    -
    Supported
    AI Codec
    -
    -
    Supported

    Real-Time Communication Audio 3A

    Audio 3A technology is a set of basic features in sound signal processing, commonly used in real-time communication systems such as video conferencing, calls, and live microphone connections, to ensure high-quality audio signal transmission, and provide better communication quality and audio listening experience. 3A stands for Adaptive Noise Suppression (ANS), Acoustic Echo Cancellation (AEC), and Automatic Gain Control (AGC).
    Real-time communication audio link
    Real-time communication audio link
    
    ANS
    The main feature of ANS is to eliminate the background noise components in the voice signal, reduce interference, and therefore improve speech intelligibility and perceptual quality. Based on the additive noise model assumption, the audio signal captured by the microphone can be considered as a superposition of the pure voice signal and noise interference. By tracking and estimating noise in non-voice segments of the audio, and then subtracting the noise component energy in the voice segments, a clearer voice signal can be obtained.
    AEC
    AEC mainly addresses the echo problem in audio communication. During a call, the sound played by the speaker is directly captured by the microphone or captured after reflection, causing the remote user to hear their own voice. This can seriously affect call quality. AEC technology can process the near-end signal based on the remote reference signal, effectively eliminating or reducing this echo phenomenon, thereby enhancing the call experience.
    AGC
    AGC is responsible for adjusting the volume during the transmission of audio signals. When the volume of the sound source is too low or too high, it can significantly affect the call experience. AGC can automatically detect the loudness of the audio stream and dynamically adjust the volume level to keep it within a comfortable range. AGC can alleviate the volume instability caused by factors such as differences in recording device collection, speaker volume, and distance.

    Use Cases

    The SDK can be applied in the preprocessing of audio encoding in uplink push and the post-processing of audio decoding in downlink pull, to enhance sound quality. Currently, it supports Android, iOS, Windows, and macOS clients.
    
    
    
    
    Online teaching scenario: Eliminating noise and echo enhances the clarity of sound during the teaching process.
    In-game voice scenario: Equalizing loud and soft voices improves player listening experience and game experience.
    Live streaming scenario: Anchor voice noise reduction and voice gain control improve the overall live streaming quality in voice chat, song rooms, and similar scenarios.

    SDK API Calling Process

    
    
    
    

    TSC Terminal Enhancement SDK

    Product Overview

    The client enhancement SDK, based on efficient image processing algorithms and AI model inference capabilities, achieves terminal video super-resolution, image quality enhancement, frame interpolation, and other features.
    Details of features for each edition are as follows:
    Feature Point
    Standard Edition
    Professional Edition
    Premium Edition
    Standard super-resolution
    Supported
    Supported
    Supported
    Standard super-resolution+Enhancement parameters
    (Contrast/Color/Brightness)
    Supported
    Supported
    Supported
    Professional super-resolution
    -
    Supported
    Supported
    AI image quality enhancement
    -
    Supported
    Supported
    AI frame interpolation enhancement
    -
    -
    Supported
    
    
    
    
    
    
    The advantage of the Standard Edition is the performance, with our algorithms achieving good super-resolution effects at minimal time and energy consumption. It is compatible with almost all mobile phones of different performances.
    Additionally, the Standard Edition offers image enhancement features, which can adjust image brightness, color saturation, and contrast.
    The advantage of the Professional Edition is the effect. Using AI model inference, it can regenerate missing texture details in the original image, achieving the best image enhancement and super-resolution effects. The Professional Edition requires computational power of the device and is recommended for use on mid to high-end mobile phones.

    Performance

    Standard super-resolution
    System
    Device Model
    Device Configuration
    Basic Super-Resolution Parameter
    CPU
    (%)
    Memory
    (MB)
    Frame Rate
    GPU
    (%)
    Power Consumption
    (mAh)
    Android
    HUAWEI Mate50 (2022)
    Chip: Snapdragon 8+ Gen1 CPU: 3.0 GHz GPU: Adreno 730 Battery: 4272.8 mAh
    720P - Off
    2.8
    48
    59.9
    5
    138.01
    720P x 1.5
    3
    64
    60.4
    10
    196.55
    576P x 1.25
    3
    60.1
    59.9
    7
    /
    4K x 1.25
    3
    163.2
    59.9
    46.4
    /
    Android
    Sony Xperia 5 II (2020)
    Chip: Snapdragon 865 CPU: 2.84 GHz GPU: Adreno 650 Battery: 3104 mAh
    720P - Off
    1
    135.9
    59.1
    4
    133.78
    720P x 1.5
    2
    146.8
    59.2
    10
    152.41
    576P x 1.25
    2
    139.2
    59.2
    6
    /
    4K x 1.25
    2
    311.2
    59.2
    46.7
    /
    Android
    Xiaomi 6 (2017)
    Chip: Snapdragon 835 CPU: 2.45 GHz GPU: Adreno 540
    720P x 1.5
    2.9
    119
    60
    18.9
    /
    Android
    Redmi Note 4 (2016)
    Chip: MediaTek MT6797 Helio X20 CPU: MT6797 2.0 GHz GPU: ARM Mali-T880
    720P x 1.5
    9.4
    137.9
    60.6
    74.5
    /
    Android
    Honor 8 Youth Edition (2016, budget phone)
    Chip: HiSilicon Kirin 655 CPU: HI6250 2.3 GHz GPU: ARM Mali-T830
    720P - Off
    2
    77
    58.8
    Not supported
    /
    720P x 1.5
    2
    83.4
    58.1
    Not supported
    /
    iOS
    iPhone 13 (2021)
    CPU: 3.23 GHz GPU: quad-core Battery: 3065.65 mAh
    720P - Off
    5.9
    54.4
    59.5
    15.9
    64.99
    720P x 1.5
    6
    63.8
    59.5
    24
    88.29
    576P x 1.25
    4.7
    57.3
    59.5
    18.9
    /
    4K x 1.25
    9.2
    162.2
    59.5
    60.6
    /
    iOS
    iPhone 6P (2014)
    CPU: Apple A9 GPU: PowerVR GT7600
    720P - Off
    13
    40.5
    59.5
    22.8
    /
    720P x 1.5
    18.8
    49.4
    59.6
    50.2
    /
    Professional super-resolution
    System
    Device Model
    Device Configuration
    Professional Super-Resolution Parameter
    CPU
    (%)
    Memory
    (MB)
    Frame Rate
    GPU
    (%)
    Power Consumption
    (mAh)
    Android
    HUAWEI Mate50 (2022)
    Chip: Snapdragon 8+ Gen1 CPU: 3.0 GHz GPU: Adreno 730 Battery: 4272.8 mAh
    720P - Off
    3
    66
    60
    3
    138.01
    720P x 1.5
    13
    123
    48
    10
    342.9
    576P x 1.25
    13
    105
    60
    7
    333.13
    540P x 2
    13
    105
    60
    11
    322.73
    Android
    Sony Xperia 5 II (2020)
    Chip: Snapdragon 865 CPU: 2.84 GHz GPU: Adreno 650 Battery: 3104 mAh
    720P - Off
    1
    142
    59.1
    3
    133.78
    720P x 1.5
    13
    196
    39
    8
    294.06
    576P x 1.25
    13
    148
    58
    8
    /
    540P x 2
    13
    159
    40
    7
    /
    iOS
    iPhone 13 (2021)
    CPU: 3.23 GHz GPU: quad-core Battery: 3065.65 mAh
    720P - Off
    6
    73
    60
    14
    64.99
    720P x 1.5
    15
    94
    40
    14
    /
    576P x 1.25
    10
    84
    60
    16
    /
    540P x 2
    9
    76
    60
    21
    /
    AI image quality enhancement
    System
    Device Model
    Device Configuration
    Professional Enhancement Resolution
    CPU
    (%)
    Memory
    (MB)
    Frame Rate
    GPU
    (%)
    Android
    HUAWEI Mate50 (2022)
    Chip: Snapdragon 8+ Gen1 CPU: 3.0 GHz GPU: Adreno 730 Battery: 4272.8 mAh
    720P
    13
    140
    55
    7
    576P
    13
    126
    74
    5
    540P
    13
    130
    78
    7
    Android
    Sony Xperia 5 II (2020)
    Chip: Snapdragon 865 CPU: 2.84 GHz GPU: Adreno 650 Battery: 3104 mAh
    720P
    13
    184
    41
    5
    576P
    13
    174
    59
    5
    540P
    13
    142
    43
    4
    iOS
    iPhone 13 (2021)
    CPU: 3.23 GHz GPU: quad-core Battery: 3065.65 mAh
    720P
    17
    91
    40
    11
    576P
    12
    70
    60
    11
    540P
    9
    68
    60
    11

    Use Cases

    1. Enhance terminal players to improve video playback quality and smoothness.
    
    
    
    2. Save costs by reducing the resolution and bitrate of video distribution, and then minimize experience loss through terminal playback enhancement.
    
    
    
    For example, in cloud gaming scenarios, the capability of real-time video super-resolution on the terminal can reduce the computational power of cloud rendering and encoding, save transmission bandwidth, and save costs. In the following example, a game scene transmitted from the cloud at 720P (5.6Mbps) is up-scaled to 1080P in real-time on the terminal. The viewing effect is close to a scene transmitted directly at 1080P (8.2Mbps) from the cloud, saving 30% of bandwidth.

    SDK Integration

    Compatibility

    Android platform: Applicable to Android 5.0 and later (API 21, OpenGL ES 3.1).
    iOS platform: Applicable to iPhone 5s and later versions of devices, with the minimum system version being iOS 12.

    Package Size

    Standard Edition: Android AAR is approximately 0.3 MB (arm64-v8a), and iOS Framework is approximately 0.4 MB.
    Professional Edition: Android AAR is approximately 2.1 MB (Single arm64-v8a architecture), and iOS Framework is approximately 1.9 MB.

    Integration Guide

    Please refer to the Android and iOS integration guides.
    
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support