Feature | Description |
Increased compatibility | A source video can be transcoded to formats (such as MP4) that are compatible with more types of devices for smooth playback. |
Increased bandwidth adaptability | A source video can be transcoded for output in multiple definitions such as LD, SD, HD, and UHD. End users can select the most appropriate bitrate depending on their network conditions. |
Improved playback efficiency | The moov atom can be moved from the end of an MP4 file to the beginning of the file, allowing the video to be played before it is entirely downloaded. |
Reduced bandwidth consumption | With a more advanced codec (such as H.265), the bitrate of a video can be substantially reduced while retaining the original quality, which helps reduce the bandwidth consumption. |
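As a rough illustration of the moov atom placement described under improved playback efficiency, the sketch below checks whether an MP4 file already has its `moov` box ahead of its `mdat` box. It is not part of this service's API: the function names are hypothetical, and it relies only on the standard MP4 box layout (a 32-bit big-endian size followed by a four-character type, where a size of 1 means a 64-bit size follows and a size of 0 means the box runs to the end of the file).

```python
import io
import struct
from typing import BinaryIO, List


def top_level_boxes(stream: BinaryIO) -> List[str]:
    """Return the types of the top-level MP4 boxes (atoms) in file order."""
    types: List[str] = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file (or a truncated header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 signals that a 64-bit "largesize" field follows.
            ext = stream.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        types.append(box_type.decode("latin-1"))
        if size == 0 or size < header_len:
            break  # box extends to end of file, or the size is malformed
        # Skip the payload and move on to the next box header.
        stream.seek(size - header_len, io.SEEK_CUR)
    return types


def moov_before_mdat(stream: BinaryIO) -> bool:
    """True if the file has both boxes and moov comes before mdat."""
    boxes = top_level_boxes(stream)
    if "moov" not in boxes or "mdat" not in boxes:
        return False
    return boxes.index("moov") < boxes.index("mdat")
```

A file for which `moov_before_mdat` returns `False` is a candidate for the moov relocation described above, since a player would otherwise need to reach the end of the file before it can start playback.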
Category | Parameter | Description |
Input | Container format | 3GP, AVI, FLV, MP4, M3U8, MPG, ASF, WMV, MKV, MOV, TS, WebM, MXF |
| Video codec | AV1, AVS2, H.264/AVC, H.263, H.263+, H.265, MPEG-1, MPEG-2, MPEG-4, MJPEG, VP8, VP9, RealVideo, Windows Media Video, QuickTime |
| Audio codec | AAC, ADPCM, AMR, DSD, MP1, MP2, MP3, PCM, RealAudio, Windows Media Audio, Vorbis, AC-3 |
Output | Container format | Video: FLV, MP4, HLS (M3U8 + TS), MXF |
| | Audio: MP3, MP4, Ogg, FLAC, M4A |
| | Image: GIF, WebP |
| Video codec | H.264/AVC, H.265/HEVC, AV1 |
| Audio codec | MP3, AAC, FLAC, MP2, Vorbis |
Packaging | Delete video streams | If this is enabled, the transcoding result will contain only audio streams. |
| Delete audio streams | If this is enabled, the transcoding result will contain only video streams. |
Capability | Description |
Image noise removal | Removes the random noise introduced from the camera and the environment during video recording while maintaining details of the video image. |
Artifact (glitch) removal | Effectively repairs distortions caused by repeated compressions of videos during transcoding that compromise the visual quality, such as blocking artifacts, ringing artifacts, color contamination, and mosquito noise. |
Banding removal | Repairs banding and snow caused by various factors that affect the film during video recording, storage, or transfer. |
Detail enhancement | Makes the video image clearer by enhancing details which may have been compromised by the camera quality or during video saving or transcoding. |
Overall enhancement | Uses AI-based analysis to improve the overall image quality in videos by balancing image textures, removing compression artifacts, and enhancing key details. |
Super resolution | Enhances and restores details in low-resolution videos that fall short of today's high-definition requirements. It uses an AI model to output high-resolution videos with clearer details. |
Face enhancement | Uses face detection to enhance the detail and quality of faces in the video. |
Color enhancement | Restores video color that may have been distorted due to camera problems or video storage and enhances the color to make it more pleasing to viewers. |
Low-light enhancement | Due to the environmental conditions and the hardware limitations of the camera, the video image of certain scenes may lack brightness and contrast, leading to loss of details in dark areas. This feature automatically recognizes scenes and adaptively enhances the video image to increase details and contrast in dark image areas and improve the image quality, especially in low-light scenes. |
HDR | Converts general SDR videos to HDR videos. It can increase the color depth to 10 bits to get a wider gamut and display more color details, providing higher-quality video content. |
Frame interpolation | Adds additional video frames between the original video frames to offer a smoother visual effect, improving image quality in older videos shot at a low frame rate and reducing lag and jitter. |
Parameter | Description |
Type | The watermark type. Watermarks can be static or animated. |
Position | The relative position of a watermark in the video. |
ImageSize | The size of the watermark in the video. |
ImageContent | Binary data of a watermark. |
Parameter | Description |
Format | The screenshot format (only JPG is supported currently) |
Width | Screenshot width (px). Value range: 128-4096 |
Height | Screenshot height (px). Value range: 128-4096 |
FillType | The fill mode. It specifies how the source video image is processed when its aspect ratio does not match the aspect ratio specified for the screenshot. The following fill modes are supported. Scale to fill: The source video image is stretched to match the aspect ratio of the screenshot, which may distort the image. Black bars: The aspect ratio of the source video image is retained, and the empty space is painted black. White bars: The aspect ratio of the source video image is retained, and the empty space is painted white. Gaussian blur: The aspect ratio of the source video image is retained, and Gaussian blur is applied to the empty space. |
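The geometry behind the fill modes can be sketched as follows. This is an illustration only, not the service's implementation: `place_frame` and `Placement` are hypothetical names, and the sketch assumes that the bar and blur modes center the scaled image in the screenshot, which this page does not state.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    width: int   # width of the scaled source image inside the screenshot
    height: int  # height of the scaled source image inside the screenshot
    pad_x: int   # empty columns on the left (assumed centered)
    pad_y: int   # empty rows on the top (assumed centered)


def place_frame(src_w: int, src_h: int, out_w: int, out_h: int,
                keep_aspect: bool) -> Placement:
    """Compute where the source image lands in an out_w x out_h screenshot.

    keep_aspect=False models "scale to fill" (stretch, no empty space).
    keep_aspect=True models the black bar, white bar, and Gaussian blur
    modes, which differ only in how the empty space is painted.
    """
    if not keep_aspect:
        return Placement(out_w, out_h, 0, 0)
    # Integer cross-multiplication avoids floating-point rounding issues.
    if src_w * out_h >= src_h * out_w:
        # Source is relatively wider: width is the limiting dimension.
        w = out_w
        h = max(1, out_w * src_h // src_w)
    else:
        # Source is relatively taller: height is the limiting dimension.
        h = out_h
        w = max(1, out_h * src_w // src_h)
    return Placement(w, h, (out_w - w) // 2, (out_h - h) // 2)
```

For example, a 1280x720 source placed into a 640x640 screenshot with the aspect ratio retained is scaled to 640x360, leaving 140 rows of empty space above and below it.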
Parameter | Description |
Format | The screenshot format (only JPG is supported currently) |
Width | Screenshot width (px). Value range: 128-4096 |
Height | Screenshot height (px). Value range: 128-4096 |
SampleType | How sampling intervals are measured. Sampling intervals can be measured in two ways: By percent: Intervals are measured by percent. For example, if Interval is set to 5 (%), 20 screenshots will be generated for a video. By time: Intervals are measured by time. For example, if Interval is set to 10 (sec), the number of screenshots generated will depend on the video length. |
Interval | The sampling interval. If the interval measurement ( SampleType ) is by percent, this parameter is a percent value. If interval measurement is by time, this parameter is a time value (sec). |
FillType | The fill mode. It specifies how the source video image is processed when its aspect ratio does not match the aspect ratio specified for the screenshot. The following fill modes are supported. Scale to fill: The source video image is stretched to match the aspect ratio of the screenshot, which may distort the image. Black bars: The aspect ratio of the source video image is retained, and the empty space is painted black. White bars: The aspect ratio of the source video image is retained, and the empty space is painted white. Gaussian blur: The aspect ratio of the source video image is retained, and Gaussian blur is applied to the empty space. |
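The relationship between SampleType and Interval can be sketched as below. The string values `"Percent"` and `"Time"` and the function name `sample_times` are assumptions made for illustration, as is the choice to take the first screenshot at the start of the video; the actual sampling points used by the service may differ.

```python
from typing import List


def sample_times(duration_sec: float, sample_type: str,
                 interval: float) -> List[float]:
    """Return the assumed screenshot timestamps (in seconds) for a video."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    times: List[float] = []
    if sample_type == "Percent":
        if interval > 100:
            raise ValueError("a percent interval cannot exceed 100")
        # Step through 0%, interval%, 2*interval%, ... below 100%.
        percent = 0
        while percent < 100:
            times.append(duration_sec * percent / 100)
            percent += interval
    elif sample_type == "Time":
        # Step through 0 s, interval s, 2*interval s, ... below the duration.
        t = 0
        while t < duration_sec:
            times.append(t)
            t += interval
    else:
        raise ValueError(f"unknown sample type: {sample_type!r}")
    return times
```

Under these assumptions, a 5% interval yields 20 screenshots regardless of video length, matching the example in the table, while a 10-second interval yields 7 screenshots for a 65-second video and more for longer videos.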
Parameter | Description |
Format | The format of the image sprite (only JPG is supported currently). |
Width | The width of the subimage in an image sprite. |
Height | The height of the subimage in an image sprite. |
Rows | The number of image rows in a sprite. |
Columns | The number of image columns in a sprite. |
SampleType | How sampling intervals are measured. Currently, only sampling by time is supported. |
Interval | The time interval for image sampling. |
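The sprite parameters interact: the overall sprite is Width x Columns pixels wide and Height x Rows pixels high, and this page gives a 128-4096 range for each of those products. The following sketch checks a parameter set against that range before a request is made. The function names are hypothetical and the check is a client-side convenience, not a description of how the service validates input.

```python
from typing import List, Tuple

SPRITE_MIN = 128   # lower bound for sprite width and height, per this page
SPRITE_MAX = 4096  # upper bound for sprite width and height, per this page


def sprite_size(width: int, height: int, rows: int,
                columns: int) -> Tuple[int, int]:
    """Return the overall (sprite_width, sprite_height) in pixels."""
    return width * columns, height * rows


def validate_sprite(width: int, height: int, rows: int,
                    columns: int) -> List[str]:
    """Return a list of problems; an empty list means the sizes are in range."""
    sprite_w, sprite_h = sprite_size(width, height, rows, columns)
    problems: List[str] = []
    if not SPRITE_MIN <= sprite_w <= SPRITE_MAX:
        problems.append(f"sprite width {sprite_w} is outside 128-4096")
    if not SPRITE_MIN <= sprite_h <= SPRITE_MAX:
        problems.append(f"sprite height {sprite_h} is outside 128-4096")
    return problems
```

For instance, 160x90 subimages in a 10x10 grid give a 1600x900 sprite, which is in range, whereas 640x360 subimages in the same grid give a 6400-pixel-wide sprite, which is not.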
Width x Columns (i.e., sprite width) should be within the range of 128-4096.
Height x Rows (i.e., sprite height) should be within the range of 128-4096.
Parameter | Description |
Format | The format of the animated image (only GIF and WebP are supported currently). |
Width | The animated image width. Value range: 128–4096 px. |
Height | The animated image height. Value range: 128–4096 px. |
FPS | The frame rate. Value range: 1–60 fps. |
Recognition Type | Description |
Face recognition | Quickly recognizes facial information in a video based on deep learning and locates the frames in which a person is present as well as the position of the person’s face. You can use custom person libraries or call the AI-enabled public person libraries to recognize faces. |
Speech recognition | Quickly recognizes the speech in a video and converts it to text based on deep learning. You can specify custom keywords and locate the time points in the video at which the keywords are spoken. |
Text recognition | Recognizes text in a video, including vertically oriented text, and automatically extracts keywords from the text. |
Frame tag recognition | Uses deep learning to automatically recognize tags in the video frames captured at the custom frame capturing interval, and locates the tags in the video. Frame tags are divided into nine categories, such as people, landscape, artificial object, building, plant, animal, and food, covering various aspects of daily life. You can use custom tags based on the tag system. It has transfer learning capabilities, so you can customize classifiers simply by providing the raw user data. In this way, it meets the requirements of different types of users and makes the tag system more flexible. |
Opening and ending credits recognition | Automatically recognizes and locates the time points of opening and ending credits of movies and TV series based on the video image characteristics, text, speech, and other information. |
Analysis Type | Description |
Category recognition | Recommends a category for the target video by analyzing the video content. Currently, it supports 19 categories, including food, travel, animation, and music. Custom categories are also supported as a paid feature. |
Video tag recognition | Intelligently recognizes the top five tags that best fit the video content based on Tencent's deep learning solution. It is suitable for video recommendation and search scenarios. You can customize the number of tags to be returned in the API. |
Intelligent thumbnail | Automatically generates a file thumbnail based on characteristic information such as video image texture and scene recognition. It allows you to output static thumbnails quickly, making it easier to create thumbnails for videos and improving video click rates. |
Moderation Type | Detection Type | Detection Item Description |
Security moderation | Video image moderation | Moderates the video image to detect erotic and non-compliant content. Erotic content detection: `porn` (pornographic content), `vulgar` (vulgar content), `intimacy` (content that displays intimacy), `sexy` (content that displays sexiness). Illegal and non-compliant content detection: `guns` (weapons and guns), `bloody` (bloodiness), `explosion` (explosions and fires), `violation_photo` (banned icons). |
| Audio moderation | Moderates the speech in the audio based on the following: Erotic content detection: Analyzes speech in the audio to detect keywords related to erotic content. Illegal and non-compliant content detection: Analyzes speech in the audio to detect keywords related to illegal and non-compliant content. |
| Text moderation | Moderates the text in video images, specifically including: Erotic content detection: Analyzes text in the video image to detect keywords related to erotic content. Illegal and non-compliant content detection: Analyzes text in the video image to detect keywords related to illegal and non-compliant content. |
Quality moderation | Image quality | Detects the following in the video image: JitterResults (jitter); BlurResults (blur); AbnormalLightingResults (low light or overexposure); CrashScreenResults (blurred screen); BlackWhiteEdgeResults (black bar, white bar, black screen, white screen, and solid color screen durations); NoiseResults (noise); MosaicResults (pixelization); QRCodeResults (QR code). |
| Audio quality | Detects the following in the speech in the video: VoiceResults: Audio exceptions, including no sound, low volume level, and cracking |
Capability | Description |
Smart splitting | Performs structured analysis on the video content and intelligently splits the video into segments based on scene, speech, and text information. Currently, it is supported for news and ads. |
Smart highlights generation | Based on video temporal/spatial characteristics matching, scene recognition, target detection, and other technologies, it automatically collects video highlights in various video scenes such as soccer, basketball, PlayerUnknown's Battlegrounds, and Honor of Kings. Custom video scenes are supported on a paid basis. |
Editing and production | Allows you to clip and splice videos, convert images into videos, add roll images and text to videos, implement picture-in-picture, and edit audio. |