Introduction

Last updated: 2024-12-11 15:15:39

Product Introduction

ASR (automatic speech recognition) converts speech to text for developers. It offers three services to meet different needs: real-time speech recognition, single-sentence recognition, and recording recognition. Its strengths include high recognition accuracy, easy integration, and stable performance.

Product Features

** Real-Time Speech Recognition **
Recognize real-time audio streams, converting speech to text as it is spoken. This service can be applied to real-time audio streaming scenarios such as voice input and telecommunication chatbots.

** Single-Sentence Recognition **
Recognize audio with a duration of no more than 60 seconds quickly and accurately. This service can be applied to scenarios such as voice message transcription.

** Recording Recognition **
Recognize non-real-time recordings with a long duration. This service can be applied to scenarios such as subtitle generation and recording transcription.
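
The choice among the three services above can be sketched as a small decision helper. This is an illustrative sketch only: `Service`, `choose_service`, and `SINGLE_SENTENCE_MAX_SECONDS` are hypothetical names invented for this example and are not part of any ASR SDK or API. The only value taken from the descriptions above is the 60-second limit for single-sentence recognition.

```python
from enum import Enum

# Limit stated in the feature description for single-sentence recognition.
SINGLE_SENTENCE_MAX_SECONDS = 60


class Service(Enum):
    """The three services described above (hypothetical enum, not an SDK type)."""
    REAL_TIME = "real-time speech recognition"
    SINGLE_SENTENCE = "single-sentence recognition"
    RECORDING = "recording recognition"


def choose_service(is_live_stream: bool, duration_seconds: float = 0.0) -> Service:
    """Pick a service from the feature descriptions (hypothetical helper)."""
    # Live audio streams always go to real-time recognition.
    if is_live_stream:
        return Service.REAL_TIME
    # Recorded audio of at most 60 seconds fits single-sentence recognition.
    if duration_seconds <= SINGLE_SENTENCE_MAX_SECONDS:
        return Service.SINGLE_SENTENCE
    # Longer, non-real-time recordings go to recording recognition.
    return Service.RECORDING


if __name__ == "__main__":
    print(choose_service(is_live_stream=True).value)
    print(choose_service(is_live_stream=False, duration_seconds=12).value)
    print(choose_service(is_live_stream=False, duration_seconds=1800).value)
```

Under these assumptions, a 12-second voice message maps to single-sentence recognition, while a 30-minute recording maps to recording recognition.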

Product Strengths

** Accumulation of Massive Data **
Drawing on Tencent's social platforms, ASR has accumulated hundreds of thousands of hours of labeled audio. It also has rich and diverse corpora, laying a data foundation for high recognition accuracy.

** Industry-Leading Algorithms **
ASR adopts multiple sequence neural network structures (LSTM, attention models, and deep CNNs), multitask learning, and the teacher-student (T-S) model. It delivers industry-leading recognition accuracy in both general and vertical fields.

** Support for Various Devices **
ASR provides RESTful APIs and SDKs to support various devices and terminals, including smart hardware, mobile applications, websites, desktop clients, and IoT devices.

** Support for Various Languages **
ASR supports recognizing audio in Mandarin, English, Cantonese, and Korean. It will support more languages and dialects in the future.

** High Recognition Accuracy in Noisy Environments **
ASR's engines are robust and accurate. They can recognize audio captured in noisy environments without a separate noise-reduction step.

** Verification Through Extensive Internal and External Business Use **
ASR has been fully verified through its use in many of Tencent's internal businesses, such as WeChat, Tencent Video, and Honor of Kings. In addition, it has been implemented in the business scenarios of many external customers in the Internet, finance, education, and other industries, serving billions of users every day with stable performance.

Use Cases

** Voice Input **
ASR can recognize voice input content in real time to save input time and improve the input experience for users.

** Voice Message Transcription **
ASR can use the single-sentence recognition service to convert users' voice messages into text, which improves users' reading efficiency.

** Subtitle Generation **
ASR can use the recording recognition service to convert audio content in live streams and recorded videos into text, making it easier to generate subtitles.

** Meeting Minutes **
ASR can use the real-time speech recognition service to convert audio content in meetings, court trials, interviews, and other scenarios into text. This helps reduce the cost of manual note-taking and improve efficiency.

** Telephone Call Quality Inspection **
ASR can use the real-time speech recognition or recording recognition service to convert telephone call content into text, so that all call content can be inspected and inspection becomes more efficient.