Speech-to-Text

Last updated: 2024-11-19 16:57:13

    Use Cases

    Tencent Real-Time Communication (TRTC) supports the speech-to-text feature, which converts the audio streams of specified users or all users in a room into corresponding Chinese text for effects such as real-time captions.

    Prerequisites

    Log in to the TRTC console, activate the TRTC service, and create an RTC-Engine application.
    Go to the purchase page to buy an RTC-Engine package of any version to unlock the speech-to-text feature.
    Note:
    The speech-to-text feature incurs fees based on usage. See Fee Details for more information.

    Feature Overview

    After a task is initiated, TRTC AI Service sends an Automatic Speech Recognition (ASR) bot into the TRTC room. The bot pulls the streams of specified users or all users, performs speech-to-text recognition, and relays the recognition results to the client and server in real time.
    

    Integration Guide

    Step 1: Receiving Speech-to-Text Results

    Method 1: Receiving Text Messages via Client SDK

    Use the custom message receiving feature of the TRTC SDK to listen for callbacks on the client and receive speech-to-text results in real time.
    The client callback looks as follows, using the Web SDK as an example:
    trtc.on(TRTC.EVENT.CUSTOM_MESSAGE, event => {
      // Receive custom messages.
      // event.userId: The userId of the ASR robot.
      // event.cmdId: The message ID, which is fixed at 1 for transcriptions and captions.
      // event.seq: The sequence number of a message.
      // event.data: ArrayBuffer type. For content of transcriptions or captions, see the explanation of the data field below.
      const data = new TextDecoder().decode(event.data)
      // Explanation of the data field is as follows.
      console.log(`received custom msg from ${event.userId}, message: ${ data }`)
    })
    Data field explanation

    Real-Time Captions

    | Field Name         | Type    | Meaning |
    |--------------------|---------|---------|
    | type               | Integer | 10000: the message type delivered for real-time captions and for complete sentences. |
    | sender             | String  | The speaker's userId. |
    | receiver           | Array   | List of recipient userIds. The message is actually broadcast to the whole room. |
    | payload.text       | String  | Recognized text, Unicode encoded. |
    | payload.start_time | String  | Message start time, measured from the start of the task. |
    | payload.end_time   | String  | Message end time, measured from the start of the task. |
    | payload.end        | Boolean | If true, this message contains a complete sentence. |
    {
      "type": 10000,
      "sender": "user_a",
      "payload": {
        "text": "",
        "start_time": "00:00:02",
        "end_time": "00:00:05",
        "end": true
      }
    }
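
The field table and JSON example above can be turned into a small decoding step on the client. The following is a minimal sketch, not part of the TRTC SDK: `parseCaptionMessage` and `CAPTION_MESSAGE_TYPE` are hypothetical names, and the sketch assumes that `event.data` always carries UTF-8 encoded JSON shaped like the example.

```javascript
// Sketch: decode and validate one speech-to-text custom message.
// parseCaptionMessage is a hypothetical helper, not a TRTC SDK function.
const CAPTION_MESSAGE_TYPE = 10000;

function parseCaptionMessage(data) {
  // Assumption: data (event.data) holds UTF-8 encoded JSON.
  const raw = new TextDecoder().decode(data);
  let message;
  try {
    message = JSON.parse(raw);
  } catch (err) {
    return null; // Not JSON, so not a caption message.
  }
  const isCaption =
    message !== null &&
    typeof message === 'object' &&
    message.type === CAPTION_MESSAGE_TYPE &&
    message.payload !== null &&
    typeof message.payload === 'object';
  if (!isCaption) {
    return null;
  }
  const { text, start_time, end_time, end } = message.payload;
  return {
    sender: message.sender,
    text: typeof text === 'string' ? text : '',
    startTime: start_time,
    endTime: end_time,
    isFinal: end === true, // true when the sentence is complete
  };
}
```

Inside the `CUSTOM_MESSAGE` handler, this could be called as `parseCaptionMessage(event.data)`. Returning `null` for anything that is not a type 10000 message lets the handler ignore unrelated custom messages.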
    Note:
    Callback example explanation:
    Transcription: Each sentence is pushed once, after it is complete.
    "How's the weather today?"
    Captions: A sentence is pushed in segments while it is being recognized. Each segment contains the text of the previous ones, so captions can update in real time.
    "Today"
    "Today's weather"
    "How's the weather today?"
    Sequence: Caption message > Caption message > ... > Caption message (end = true)
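
The caption sequence described in the note above suggests a simple way to render captions: replace the live line on every segment, and commit it when `end` is true. The sketch below illustrates that idea under assumptions. `createCaptionTracker` is a hypothetical helper, and it expects already-parsed messages shaped like the JSON example in this section.

```javascript
// Sketch: keep one live caption line per speaker.
// createCaptionTracker is a hypothetical helper, not a TRTC SDK function.
function createCaptionTracker() {
  const live = new Map(); // sender -> latest partial text
  const finished = [];    // completed sentences in arrival order

  return {
    handle(message) {
      const { sender, payload } = message;
      if (payload.end) {
        // Final segment: store the sentence and clear the live line.
        finished.push({ sender, text: payload.text });
        live.delete(sender);
      } else {
        // Each segment already contains the previous ones,
        // so replace the live text instead of appending to it.
        live.set(sender, payload.text);
      }
    },
    liveCaption(sender) {
      return live.has(sender) ? live.get(sender) : '';
    },
    transcript() {
      return finished.slice();
    },
  };
}

// Usage with the three segments from the note.
const tracker = createCaptionTracker();
tracker.handle({ type: 10000, sender: 'user_a', payload: { text: 'Today', end: false } });
tracker.handle({ type: 10000, sender: 'user_a', payload: { text: "Today's weather", end: false } });
tracker.handle({ type: 10000, sender: 'user_a', payload: { text: "How's the weather today?", end: true } });
```

Keying the live text by `sender` keeps captions from different speakers separate when the task transcribes all users in a room.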

    Method 2: Receiving via Server-side Callbacks

    The speech-to-text service also provides server-side event callbacks, so that your backend can receive conversation messages in real time. See Detailed Callback Events.

    Step 2: Initiating a Speech-to-Text Task

    TRTC provides the following Tencent Cloud APIs for initiating and managing speech-to-text tasks:
    Start a speech-to-text task: StartAITranscription
    Query a speech-to-text task: DescribeAITranscription
    Stop a speech-to-text task: StopAITranscription
    Note:
    The speech-to-text feature has a concurrency limit of 100 tasks per SDKAppId. Submit a ticket if you need to increase this limit.
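
Because tasks count against the concurrency limit, it can help to make sure that every started task is also stopped. The following is a rough sketch of that lifecycle, not an official client. The `api` object is a hypothetical wrapper whose three async methods are assumed to call the Tencent Cloud APIs of the same name, and the `TaskId` field in the responses and requests is an assumption to check against the API reference.

```javascript
// Sketch: start a task, query it once, and always stop it.
// `api` is a hypothetical wrapper around StartAITranscription,
// DescribeAITranscription, and StopAITranscription.
// The TaskId field name is an assumption.
async function withTranscriptionTask(api, startParams, work) {
  const { TaskId } = await api.startAITranscription(startParams);
  try {
    const status = await api.describeAITranscription({ TaskId });
    return await work(TaskId, status);
  } finally {
    // Stopping in `finally` frees the concurrency slot even if `work` throws.
    await api.stopAITranscription({ TaskId });
  }
}
```

Passing the API wrapper in as a parameter keeps the sketch independent of any particular SDK or signing method, and makes it easy to exercise with a stub.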