Subtitle Generation and Translation

Last updated: 2024-11-29 14:12:27

    Scenario 1: Offline File Processing

    1. Zero-Code Automatic Generation

    1. Log in to the Media Processing Service (MPS) console and click Create Task > Create VOD Processing Task.
    
    
    
    1.1 Specify an input file.
    You can choose a video file from a Tencent Cloud Object Storage (COS) bucket or provide a video download URL. The current subtitle generation and translation feature does not support using AWS S3 as an input file source.
    1.2 Process the input file.
    Select Create Orchestration. Automatic Speech Recognition (ASR) and speech translation are enabled by inserting an intelligent identification node into the orchestration. Click the node's configuration icon to choose a system preset template that fits your business scenario, or create a custom template.
    
    
    
    The system preset templates and their capabilities are shown in the table below:

    Template ID    Template Capability
    10101          Identifies Chinese voice in the source video and generates a Chinese subtitle file (VTT format).
    10102          Identifies English voice in the source video and generates an English subtitle file (VTT format).
    10103          Identifies Chinese voice in the source video, translates it into English, and generates a Chinese-English bilingual subtitle file.
    10104          Identifies English voice in the source video, translates it into Chinese, and generates an English-Chinese bilingual subtitle file.
    10105          Identifies Japanese voice in the source video and generates a Japanese subtitle file.
    10106          Identifies Korean voice in the source video and generates a Korean subtitle file.
    Note:
    If other parameter configurations are needed, you can create a custom intelligent identification template or submit a ticket for backend configuration.
    1.3 Specify an output path.
    Select a save path for the output file from COS.
    1.4 Initiate a task.
    Click Create to initiate a task.
    2. After the task is completed, the automatically generated VTT subtitle file can be found in Orchestration > COS Bucket > Output Bucket.
    
    
    
    Sample Chinese subtitles:
    
    
    
    Sample Chinese-English subtitles:
    
    
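    The generated VTT file pairs each timed cue with one or two lines of text. As a purely illustrative sketch of what a bilingual file from template 10103 might look like (timestamps and text below are made up, not actual MPS output), the structure can be assembled like this:

```python
# Illustrative only: build a tiny Chinese-English bilingual WebVTT document
# resembling the output of a bilingual subtitle template. All cue times and
# text here are invented for demonstration.

def make_vtt(cues):
    """cues: list of (start, end, [text lines]) tuples -> WebVTT string."""
    parts = ["WEBVTT", ""]
    for start, end, lines in cues:
        parts.append(f"{start} --> {end}")   # cue timing line
        parts.extend(lines)                   # one line per language
        parts.append("")                      # blank line ends the cue
    return "\n".join(parts)

sample = make_vtt([
    ("00:00:01.000", "00:00:03.500", ["大家好", "Hello everyone"]),
    ("00:00:03.500", "00:00:06.000", ["欢迎观看本视频", "Welcome to this video"]),
])
print(sample)
```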
    

    2. API Calling

    1. Enter the orchestration ID in ScheduleId to initiate a task. For details, see API information.
    Example:
    {
        "InputInfo": {
            "Type": "COS",
            "CosInputInfo": {
                "Bucket": "facedetectioncos-125*****11",
                "Region": "ap-guangzhou",
                "Object": "/video/123.mp4"
            }
        },
        "ScheduleId": 20073,
        "Action": "ProcessMedia",
        "Version": "2019-06-12"
    }
    2. If a callback address is set, refer to ParseNotification for the response packet.
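    The request body above can also be assembled programmatically before being sent through an SDK or signed HTTP call. A minimal sketch (the bucket, region, object, and ScheduleId values are the placeholders from the example, not real resources):

```python
import json

def build_process_media_request(bucket, region, obj, schedule_id):
    """Assemble a ProcessMedia request body that runs a saved orchestration."""
    return {
        "InputInfo": {
            "Type": "COS",
            "CosInputInfo": {"Bucket": bucket, "Region": region, "Object": obj},
        },
        "ScheduleId": schedule_id,
        "Action": "ProcessMedia",
        "Version": "2019-06-12",
    }

payload = build_process_media_request(
    "facedetectioncos-125*****11", "ap-guangzhou", "/video/123.mp4", 20073)
print(json.dumps(payload, indent=2))
```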

    3. Embedding into Video (Optional)

    You need to initiate a transcoding task and specify the subtitle VTT file generated in either Step 1 or Step 2 through the SubtitleTemplate field. For details, see Data Structure.
    Example:
    {
        "MediaProcessTask": {
            "TranscodeTaskSet": [
                {
                    "Definition": 206390,
                    "OverrideParameter": {
                        "Container": "mp4",
                        "RemoveVideo": 0,
                        "RemoveAudio": 0,
                        "VideoTemplate": {
                            "Codec": "libx264",
                            "Fps": 30,
                            "Bitrate": 2346,
                            "ResolutionAdaptive": "close",
                            "Width": 1920,
                            "Height": 0,
                            "Gop": 0,
                            "FillType": "black"
                        },
                        "AudioTemplate": {
                            "Codec": "libmp3lame",
                            "Bitrate": 0,
                            "SampleRate": 32000,
                            "AudioChannel": 2
                        },
                        "SubtitleTemplate": {
                            "Path": "https://lily-125*****27.cos.ap-nanjing.myqcloud.com/mps_autotest/subtitle/1.vtt",
                            "StreamIndex": 2,
                            "FontType": "simkai.ttf",
                            "FontSize": "10px",
                            "FontColor": "0xFFFFFF",
                            "FontAlpha": 0.9
                        }
                    }
                }
            ]
        },
        "InputInfo": {
            "Type": "URL",
            "UrlInputInfo": {
                "Url": "https://lily-125*****27.cos.ap-nanjing.myqcloud.com/mps_autotest/subtitle/123.mkv"
            }
        },
        "OutputStorage": {
            "Type": "COS",
            "CosOutputStorage": {
                "Bucket": "lily-125*****27",
                "Region": "ap-nanjing"
            }
        },
        "OutputDir": "/mps_autotest/output2/",
        "Action": "ProcessMedia",
        "Version": "2019-06-12"
    }
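    The SubtitleTemplate block is small enough to build with a helper. A minimal sketch using the field names from the example above (the range checks are this sketch's own sanity checks, not documented API limits):

```python
def build_subtitle_template(path, stream_index=2, font_type="simkai.ttf",
                            font_size="10px", font_color="0xFFFFFF",
                            font_alpha=0.9):
    """Build the SubtitleTemplate block used inside OverrideParameter.

    Field names mirror the transcoding example; the validation below is an
    illustrative sanity check, not an official constraint.
    """
    if not 0.0 <= font_alpha <= 1.0:
        raise ValueError("FontAlpha should be between 0 and 1")
    if not font_color.startswith("0x"):
        raise ValueError("FontColor is expected as a 0xRRGGBB string")
    return {
        "Path": path,
        "StreamIndex": stream_index,
        "FontType": font_type,
        "FontSize": font_size,
        "FontColor": font_color,
        "FontAlpha": font_alpha,
    }

tpl = build_subtitle_template(
    "https://example-bucket.cos.ap-nanjing.myqcloud.com/subtitle/1.vtt")
```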

    Scenario 2: Live Streams

    There are currently two solutions for using subtitles and translations in live streams: enable the subtitle feature through the Cloud Streaming Services (CSS) console, or use MPS to call back recognized text and embed it into live streams. Enabling the subtitle feature through the CSS console is recommended. The two solutions are introduced as follows:

    Solution 1: Enabling the Subtitle Feature in the CSS Console

    1. Configure the live subtitling feature.
    1.1 Enable CSS and MPS.
    1.2 Log in to the CSS console, create a subtitle template, and bind the transcoding template.
    2. Obtain subtitle streams.
    Subtitles are displayed when the transcoded stream is played. To generate the transcoded stream address, append an underscore and the name of the transcoding template bound to the subtitle template to the corresponding live stream's StreamName (that is, StreamName_templateName). For the detailed rules for splicing playback addresses, see Splicing Playback URLs.
    Note:
    Currently, subtitles can be displayed in two forms: real-time dynamic subtitles and delayed steady-state subtitles. With real-time dynamic subtitles, the displayed text is corrected word by word as the speech is recognized, so the output changes in real time. With delayed steady-state subtitles, the broadcast is delayed by a configured amount of time so that complete sentences can be displayed, which gives a better viewing experience.
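    The splicing rule above can be sketched in a few lines. The playback domain, app name, and .flv container here are placeholders; only the StreamName_templateName convention comes from the rule described above:

```python
# Sketch of the stream-address splicing rule: append an underscore and the
# name of the transcoding template bound to the subtitle template to the
# live stream's StreamName. Domain, app name, and extension are placeholders.

def transcoded_play_url(domain, app_name, stream_name, template_name, ext="flv"):
    return f"http://{domain}/{app_name}/{stream_name}_{template_name}.{ext}"

url = transcoded_play_url("play.example.com", "live", "teststream", "subtitle720p")
print(url)  # http://play.example.com/live/teststream_subtitle720p.flv
```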

    Solution 2: Calling Back Text through MPS

    1. Initiate a task via API. Use a preset subtitle template to initiate a recognition task. For details, see ProcessLiveStream.
    Example:
    {
        "Url": "http://5000-wenzhen.liveplay.myqcloud.com/live/123.flv",
        "AiRecognitionTask": {
            "Definition": 10101
        },
        "OutputStorage": {
            "CosOutputStorage": {
                "Bucket": "6c0f30dfvodgzp*****0800-10****53",
                "Region": "ap-guangzhou-2"
            },
            "Type": "COS"
        },
        "OutputDir": "/6c0f30dfvodgzp*****0800/0d1409d3456551**********652/",
        "TaskNotifyConfig": {
            "NotifyType": "URL",
            "NotifyUrl": "http://****.qq.com/callback/qtatest/?token=*****"
        },
        "Action": "ProcessLiveStream",
        "Version": "2019-06-12"
    }
    2. For the real-time callback packet, refer to ParseLiveStreamProcessNotification.
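    The NotifyUrl in the example embeds a token query parameter, which the callback receiver can check before trusting the packet. A minimal sketch of that check (the token scheme is this example's own convention carried over from the NotifyUrl above, not an MPS requirement):

```python
from urllib.parse import urlparse, parse_qs

EXPECTED_TOKEN = "my-secret"  # placeholder; matches the token set in NotifyUrl

def callback_token_ok(request_url):
    """Return True if the incoming callback request carries the expected token."""
    query = parse_qs(urlparse(request_url).query)
    return query.get("token", [None])[0] == EXPECTED_TOKEN

ok = callback_token_ok("http://example.qq.com/callback/qtatest/?token=my-secret")
print(ok)  # True
```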