Overview
The Smart Captions and Subtitles feature provides real-time speech recognition for video files and live streams, converting speech into subtitles in multiple languages. It is well suited to live broadcasts and international video transcription, and supports customizable hotwords and glossary libraries for improved accuracy.
Key features
Comprehensive Platform Support: Processes video-on-demand files, live streams, and RTC streams. Real-time captioning for live broadcasts supports both steady and gradient display modes, with a low integration barrier and no modifications required on the playback end.
High Accuracy: Utilizes large-scale models and supports hotword and glossary libraries, achieving industry-leading recognition accuracy.
Rich Language Variety: Supports hundreds of languages, including various dialects. Capable of recognizing mixed-language speech, such as combinations of Chinese and English.
Customizable Styles: Supports burning open (hardcoded) subtitles into videos, with customizable subtitle styles (font, size, color, background, position, etc.).
Scenario 1: Offline File Processing
1. Zero-Code Automatic Generation
1.1 Specify an input file.
You can choose a video file from a Tencent Cloud Object Storage (COS) bucket or provide a video download URL. The current subtitle generation and translation feature does not support using AWS S3 as an input file source.
1.2 Process the input file.
Select Create Orchestration. Add a smart recognition node to the orchestration to enable Automatic Speech Recognition (ASR) and speech translation. Choose a system preset template that matches your business scenario, or create a custom template. The system preset templates and their capabilities are shown in the table below:
| Template ID | Template Capability |
| --- | --- |
| 10101 | Recognizes Chinese speech in the source video and generates a Chinese subtitle file (VTT format). |
| 10102 | Recognizes English speech in the source video and generates an English subtitle file (VTT format). |
| 10103 | Recognizes Chinese speech in the source video, translates it into English, and generates a Chinese-English bilingual subtitle file. |
| 10104 | Recognizes English speech in the source video, translates it into Chinese, and generates an English-Chinese bilingual subtitle file. |
| 10105 | Recognizes Japanese speech in the source video and generates a Japanese subtitle file. |
| 10106 | Recognizes Korean speech in the source video and generates a Korean subtitle file. |
1.3 Specify an output path.
Select a save path for the output file from COS.
1.4 Initiate a task.
Click Create to initiate a task.
1.5 View the output.
After the task is completed, the automatically generated VTT subtitle file can be found in Orchestration > COS Bucket > Output Bucket.
Sample Chinese subtitles:
Sample Chinese-English subtitles:
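For reference, a bilingual subtitle file generated by template 10103 looks roughly like the following WebVTT snippet (the timestamps, text, and cue layout here are illustrative only; the actual output may differ):

WEBVTT

00:00:01.000 --> 00:00:03.500
大家好，欢迎收看本期节目。
Hello everyone, and welcome to this episode.

00:00:03.500 --> 00:00:06.200
今天我们介绍智能字幕功能。
Today we will introduce the smart subtitles feature.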
2. API Calling
1. Initiate a task by setting ScheduleId to the orchestration ID. For details, see the ProcessMedia API documentation. Example:
{
    "InputInfo": {
        "Type": "COS",
        "CosInputInfo": {
            "Bucket": "facedetectioncos-125*****11",
            "Region": "ap-guangzhou",
            "Object": "/video/123.mp4"
        }
    },
    "ScheduleId": 20073,
    "Action": "ProcessMedia",
    "Version": "2019-06-12"
}
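For programmatic access, the same request can be sent with one of the Tencent Cloud SDKs. Below is a minimal sketch using the Python SDK (tencentcloud-sdk-python); the credentials are placeholders, the bucket, object, and ScheduleId values are taken from the example above, and error handling is kept to a minimum:

import json

from tencentcloud.common import credential
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.mps.v20190612 import mps_client, models

# Replace with your own API key pair; avoid hardcoding keys in production.
cred = credential.Credential("YOUR_SECRET_ID", "YOUR_SECRET_KEY")
client = mps_client.MpsClient(cred, "ap-guangzhou")

# Build the ProcessMedia request from the same JSON body shown above.
# Action and Version are filled in automatically by the SDK.
req = models.ProcessMediaRequest()
req.from_json_string(json.dumps({
    "InputInfo": {
        "Type": "COS",
        "CosInputInfo": {
            "Bucket": "facedetectioncos-125*****11",
            "Region": "ap-guangzhou",
            "Object": "/video/123.mp4"
        }
    },
    "ScheduleId": 20073
}))

try:
    resp = client.ProcessMedia(req)
    # The returned TaskId identifies the task for later status queries.
    print(resp.TaskId)
except TencentCloudSDKException as err:
    print(err)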
2. Embedding Subtitles into the Video (Optional)
To burn the generated subtitle file into the video, initiate a transcoding task whose SubtitleTemplate points to the VTT file. Example:
{
    "MediaProcessTask": {
        "TranscodeTaskSet": [
            {
                "Definition": 206390,
                "OverrideParameter": {
                    "Container": "mp4",
                    "RemoveVideo": 0,
                    "RemoveAudio": 0,
                    "VideoTemplate": {
                        "Codec": "libx264",
                        "Fps": 30,
                        "Bitrate": 2346,
                        "ResolutionAdaptive": "close",
                        "Width": 1920,
                        "Height": 0,
                        "Gop": 0,
                        "FillType": "black"
                    },
                    "AudioTemplate": {
                        "Codec": "libmp3lame",
                        "Bitrate": 0,
                        "SampleRate": 32000,
                        "AudioChannel": 2
                    },
                    "SubtitleTemplate": {
                        "Path": "https://lily-125*****27.cos.ap-nanjing.myqcloud.com/mps_autotest/subtitle/1.vtt",
                        "StreamIndex": 2,
                        "FontType": "simkai.ttf",
                        "FontSize": "10px",
                        "FontColor": "0xFFFFFF",
                        "FontAlpha": 0.9
                    }
                }
            }
        ]
    },
    "InputInfo": {
        "Type": "URL",
        "UrlInputInfo": {
            "Url": "https://lily-125*****27.cos.ap-nanjing.myqcloud.com/mps_autotest/subtitle/123.mkv"
        }
    },
    "OutputStorage": {
        "Type": "COS",
        "CosOutputStorage": {
            "Bucket": "lily-125*****27",
            "Region": "ap-nanjing"
        }
    },
    "OutputDir": "/mps_autotest/output2/",
    "Action": "ProcessMedia",
    "Version": "2019-06-12"
}
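Both ProcessMedia requests above return a TaskId, which can be polled with the DescribeTaskDetail API until processing finishes. A minimal sketch, assuming the Python SDK client from the earlier example:

import time

from tencentcloud.mps.v20190612 import models

def wait_for_task(client, task_id, interval=10):
    # Poll the task status until it leaves the WAITING/PROCESSING states.
    # "client" is the MpsClient created earlier; "task_id" is the TaskId
    # returned by ProcessMedia.
    while True:
        req = models.DescribeTaskDetailRequest()
        req.TaskId = task_id
        resp = client.DescribeTaskDetail(req)
        if resp.Status == "FINISH":
            return resp
        time.sleep(interval)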
Scenario 2: Live Streams
There are currently two solutions for using subtitles and translations in live streams: enable the subtitle feature through the Cloud Streaming Services (CSS) console, or use MPS to call back the recognized text and render it into the live stream yourself. Enabling the subtitle feature through the CSS console is recommended. The two solutions are described below:
Solution 1: Enabling the Subtitle Feature in the CSS Console
1. Configure the live subtitling feature.
2. Obtain subtitle streams.
Subtitles are displayed when the playback client pulls the transcoding stream. To generate the transcoding stream address, append an underscore and the name of the transcoding template bound with the subtitle template (_<transcoding template name>) to the corresponding live stream's StreamName. For detailed rules on splicing playback addresses, see Splicing Playback URLs.
Note:
Currently, subtitles can be displayed in two modes: real-time dynamic subtitles and delayed steady-state subtitles. With real-time dynamic subtitles, the displayed text is corrected word by word as the speech is recognized, so the output changes in real time. With delayed steady-state subtitles, playback is delayed by the configured amount of time, but complete sentences are displayed at once, which provides a better viewing experience.
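For example, assuming a playback domain of example.myqcloud.com, the default AppName of live, a StreamName of teststream, and a transcoding template named subtitle bound with the subtitle template (all names here are hypothetical), the addresses would be spliced as follows:

Original stream: http://example.myqcloud.com/live/teststream.flv
Subtitled stream: http://example.myqcloud.com/live/teststream_subtitle.flv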
Solution 2: Calling Back Text through MPS
1. Initiate a task via API. Use a preset subtitle template to start a live stream recognition task. For details, see ProcessLiveStream. Example:
{
    "Url": "http://5000-wenzhen.liveplay.myqcloud.com/live/123.flv",
    "AiRecognitionTask": {
        "Definition": 10101
    },
    "OutputStorage": {
        "CosOutputStorage": {
            "Bucket": "6c0f30dfvodgzp*****0800-10****53",
            "Region": "ap-guangzhou-2"
        },
        "Type": "COS"
    },
    "OutputDir": "/6c0f30dfvodgzp*****0800/0d1409d3456551**********652/",
    "TaskNotifyConfig": {
        "NotifyType": "URL",
        "NotifyUrl": "http://****.qq.com/callback/qtatest/?token=*****"
    },
    "Action": "ProcessLiveStream",
    "Version": "2019-06-12"
}
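2. Receive the recognized text. MPS posts the recognition results as JSON to the NotifyUrl configured above; your service can then render the text into the live stream. Below is a minimal sketch of a callback receiver using only the Python standard library (the port is arbitrary, and the payload is simply logged here; see the ProcessLiveStream callback documentation for the exact field names):

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON body that MPS posts to NotifyUrl.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Log the raw callback; the recognized subtitle text is carried in
        # the live stream AI recognition result fields of the payload.
        print(json.dumps(payload, ensure_ascii=False, indent=2))
        # Return HTTP 200 to acknowledge receipt of the notification.
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 5000), CallbackHandler).serve_forever()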