Tencent Cloud AI Digital Human

Video Production API - Advanced Version

Last updated: 2024-07-18 18:21:08

API Description

Produces a video from SSML text and a digital human. The final video and subtitle file are returned through the Audio and Video Production Progress Query API. The advanced version builds on the original API by adding new resource parameters and expanding micro-editing capabilities. The supported features are listed in the following table:
| Feature | Description |
|---|---|
| Logo | Supports adjusting the logo position (customizable X and Y axes). Supports adjusting the logo size (scaling ratio). |
| Specifying a video background | In MP4 video files, a background image can be specified. |
| Anchors | Horizontal position adjustment. Anchor size adjustment. Anchor angle adjustment. Note: Only 3D avatars supported. |
| Intro/outro video | A remote address can be specified to add an intro and outro video. |
| Embedded subtitles | Turning embedded subtitles on or off. |

Calling Protocol

HTTPS + JSON
POST /v2/ivh/videomaker/broadcastservice/videomakeadvanced
Header Content-Type: application/json;charset=utf-8
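
As a rough illustration of the calling protocol above, the following Python sketch builds (but does not send) the POST request using only the standard library. The base URL is a hypothetical placeholder rather than the real service host, and any authentication or signing the service may require is not described in this section and is therefore omitted.

```python
import json
import urllib.request

# Hypothetical placeholder host (an assumption); replace with your service address.
BASE_URL = "https://ivh.example.invalid"
PATH = "/v2/ivh/videomaker/broadcastservice/videomakeadvanced"


def build_request(body: dict) -> urllib.request.Request:
    """Build a POST request carrying a UTF-8 encoded JSON body. Nothing is sent."""
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + PATH,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json;charset=utf-8"},
    )


req = build_request({"Header": {}, "Payload": {"VirtualmanKey": "123"}})
print(req.get_method(), req.full_url)
```

Sending the request would then be a matter of passing `req` to `urllib.request.urlopen` once the real host and any required credentials are in place.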

Request Parameters

| Parameter | Type | Mandatory | Description |
|---|---|---|---|
| VirtualmanKey | string | Yes | Defines the broadcasting role, clothing, pose, resolution, and so on. The value is an enumerated value. |
| InputSsml | string | No | The text to be broadcast; SSML tags are supported. Refer to the Digital Human SSML Markup Language Specification for the supported tag types and usage examples. The content must not include line breaks, and symbols must be escaped. The upper limit is 20,000 characters (counted as Unicode characters). This field is required when DriverType is empty or set to Text. |
| SpeechParam | object | Yes | Defines the detailed audio parameters. |
| SpeechParam.Speed | float | Yes | Speech rate, in the range [0.5, 1.5]. 1.0 is normal speed, 0.5 is the slowest, and 1.5 is the fastest. Speech rate control has no effect when DriverType is an audio-driven type. |
| SpeechParam.TimbreKey | string | No | Timbre key. The avatar's own timbre is used by default. |
| SpeechParam.Volume | int | No | Volume level, from 0 to 10. The default is 0, which represents normal volume. Higher values are louder. |
| SpeechParam.EmotionCategory | string | No | Controls the emotion of the synthesized audio. Only multi-emotion timbres support this parameter. See the Personal Asset Management API, Paginated Query Timbre List API, for the available values. |
| SpeechParam.EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, in the range [50, 200]. Takes effect only when EmotionCategory is not empty. |
| VideoParam | object | No | Defines the detailed video synthesis parameters. |
| VideoParam.Format | string | No | Video output format. Default: MP4. TransparentWebm: WebM video with a transparent background; supports some micro-editing capabilities (anchor parameters are supported). GreenScreenMP4: green-screen MP4 video; micro-editing is not supported. MP4: MP4 video; micro-editing is supported. |
| VideoParam.BackgroundFileUrl | string | No | Download path of the video background image or video. JPG, PNG, and MP4 formats are supported. The image or video resolution must match the video resolution. If not provided, a green-screen video is produced by default. The file size limit is 500 MB. |
| VideoParam.VideoHeadFileUrl | string | No | Intro video, in MP4 format. The resolution must match the video resolution. The file size limit is 500 MB. |
| VideoParam.VideoTailFileUrl | string | No | Outro video, in MP4 format. The resolution must match the video resolution. The file size limit is 500 MB. |
| VideoParam.ShowSubtitles | boolean | No | Whether to display subtitles in the video. Subtitles are not displayed by default. Enabling subtitles significantly increases the video production time. |
| VideoParam.SubtitlesParam | object | No | Defines how subtitles are displayed in the video. |
| VideoParam.SubtitlesParam.MaxWords | int | No | Maximum number of characters displayed per subtitle page, from 0 to 999. Default: 0. Default display rule: subtitles are shown within 80% of the video width; text that exceeds this width is paginated. |
| VideoParam.SubtitlesParam.DisplayPunctuation | string | No | Punctuation marks to display in the subtitles. The special value "0" displays no punctuation; "1" (the default) displays all punctuation. You can also list the specific punctuation marks to display. |
| VideoParam.SubtitlesParam.SplitPunctuation | string | No | Punctuation marks at which subtitles paginate. Default values: . ; ? ! ... !? |
| VideoParam.LogoParams | Array of [LogoParam] | No | Defines the logo parameters for the video. |
| VideoParam.SmartActionEnabled | bool | No | Whether to enable intelligent actions. Disabled by default. Takes effect only when DriverType=Text and InputSsml contains no action tags. |
| VideoParam.AnchorParam | object | No | Defines the anchor parameters for the video. |
| VideoParam.AnchorParam.HorizontalPosition | float | No | The anchor's horizontal position (0 is the center). The effect depends on the anchor's tag. Basic: left and right movement is supported. Standard: left and right movement is supported. Advanced: left and right movement is not supported; any value is treated as 0. Note: You can query the tag value through the Query Image Asset Information - Query Anchor API. |
| VideoParam.AnchorParam.VerticalPosition | float | No | The anchor's vertical position (0 is the center). The effect depends on the anchor's tag. Basic: upward and downward movement is supported. Standard: only downward movement (>=0) is supported; a value less than 0 is treated as 0. Advanced: only downward movement (>=0) is supported; a value less than 0 is treated as 0. Note: You can query the tag value through the Query Image Asset Information - Query Anchor API. |
| VideoParam.AnchorParam.Scale | float | No | Anchor size, in the range (0, 1]. 1 is the default size. Anchors with the Basic tag can have a size greater than 1. Avatars fall under Basic. |
| VideoParam.AnchorParam.Angle | int | No | Anchor angle in degrees, in the range [0, 360]. Default: 0. The effect depends on the anchor's tag. Basic: supported only for 3D anchors. Standard: not supported. Advanced: not supported. |
| VideoParam.AnchorParam.AnchorExtraParam | string | No | Additional configurable anchor parameters. Currently these only include the 3D clothing color change parameters. The configurable parameters vary by anchor; for details, see the SupportAnchorExtraParam parameter in the Query Image Asset Information - Query all images of the Anchor API. This parameter must be a JSON string; see the request sample for how to organize the JSON. |
| VideoParam.SmallSampleParam | object | No | Defines special parameters for avatars. Has no effect for non-avatars. |
| VideoParam.SmallSampleParam.MakeType | string | No | Avatar production type. Default: the default configuration; production starts from a random starting frame, and StartIdx and EndIdx have no effect. Custom: selects a video segment through StartIdx and EndIdx; by default, the generated video loops back and forth over the specified segment. Circle: the starting and ending frames align; StartIdx can specify the starting frame of the video (EndIdx has no effect). Note: Circle is only used for audio-driven videos. |
| VideoParam.SmallSampleParam.StartIdx | int | No | Starting frame number. Takes effect when MakeType is Custom or Circle. If specified, the generated video starts from this frame; otherwise it starts from the default frame, which is frame 0. |
| VideoParam.SmallSampleParam.EndIdx | int | No | Ending frame number. Takes effect when MakeType is Custom. If specified, the generated video ends at this frame; otherwise it ends at the default frame, which is the end of the selected video segment. |
| CallbackUrl | string | No | If a callback URL is provided, the video production result is sent to that URL in a fixed format as a POST request. For the format, see Appendix II: Callback Request Format. Note: 1. The CallbackUrl length must be less than 1000 characters. 2. Only one request is sent; if it fails for any reason, it is not resent. |
| DriverType | string | No | Driver type. Default: Text. 1. Text: text-driven; the InputSsml field is required. 2. OriginalVoice: driven by the original voice audio; the InputAudioUrl field is required. 3. ModulatedVoice: driven by modulated voice audio; the timbre can be specified with SpeechParam.TimbreKey, and if it is not specified, the anchor's default timbre is used. |
| InputAudioUrl | string | No | URL of the audio that drives the digital human. Required when DriverType is OriginalVoice or ModulatedVoice. Audio requirements: 1. For avatars, the duration must be between 0.5 seconds and 60 minutes; for non-avatars, between 0.5 seconds and 10 minutes. 2. Supported formats: WAV, MP3, WMA, M4A, and AAC. |
| VideoStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished video is uploaded to that URL. |
| SubtitleStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished subtitles are uploaded to that URL. |
| ConcurrencyType | string | No | Concurrency type used for the video production task. By default, dedicated concurrency is used first, followed by shared concurrency. 1. Exclusive: dedicated concurrency; if no dedicated concurrency is available, the task submission fails. 2. Shared: shared concurrency. |
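
Several of the constraints in the request parameter table can be checked on the client before a task is submitted. The sketch below is one possible pre-flight check, not part of the API. The function name `validate_payload` is hypothetical, it covers only the numeric ranges and the DriverType-dependent required fields stated above, and the server remains the authority on what is accepted.

```python
from typing import List


def validate_payload(payload: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    errors: List[str] = []
    # DriverType is Text when empty or absent.
    driver = payload.get("DriverType") or "Text"

    if not payload.get("VirtualmanKey"):
        errors.append("VirtualmanKey is mandatory")

    speech = payload.get("SpeechParam")
    if not isinstance(speech, dict) or "Speed" not in speech:
        errors.append("SpeechParam.Speed is mandatory")
    else:
        if not 0.5 <= speech["Speed"] <= 1.5:
            errors.append("SpeechParam.Speed must be in [0.5, 1.5]")
        if not 0 <= speech.get("Volume", 0) <= 10:
            errors.append("SpeechParam.Volume must be in [0, 10]")
        intensity = speech.get("EmotionIntensity")
        if intensity is not None and not 50 <= intensity <= 200:
            errors.append("SpeechParam.EmotionIntensity must be in [50, 200]")

    ssml = payload.get("InputSsml") or ""
    if driver == "Text" and not ssml:
        errors.append("InputSsml is required when DriverType is empty or Text")
    if len(ssml) > 20000:  # len() counts Unicode code points in Python 3
        errors.append("InputSsml exceeds 20,000 characters")
    if "\n" in ssml or "\r" in ssml:
        errors.append("InputSsml must not contain line breaks")

    if driver in ("OriginalVoice", "ModulatedVoice") and not payload.get("InputAudioUrl"):
        errors.append("InputAudioUrl is required for audio-driven types")

    if len(payload.get("CallbackUrl") or "") >= 1000:
        errors.append("CallbackUrl must be shorter than 1000 characters")
    return errors


ok = {"VirtualmanKey": "123", "InputSsml": "Hello", "SpeechParam": {"Speed": 1}}
bad = {"VirtualmanKey": "123", "DriverType": "OriginalVoice", "SpeechParam": {"Speed": 2.0}}
print(validate_payload(ok))
print(validate_payload(bad))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.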

LogoParam

| Parameter | Type | Mandatory | Description |
|---|---|---|---|
| LogoFileUrl | string | No | Download path of the logo image file. JPG and PNG formats are supported. |
| PositionX | int | No | X-coordinate of the logo image's top-left corner. The coordinate range depends on the video resolution. |
| PositionY | int | No | Y-coordinate of the logo image's top-left corner. The coordinate range depends on the video resolution. |
| Scale | float | No | Logo image scaling ratio. 1.0 represents the original image size. |
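
Because PositionX and PositionY give the logo's top-left corner, placing a logo against the right edge requires subtracting its scaled width from the video width. The sketch below shows that arithmetic under stated assumptions: coordinates are pixels measured from the top-left of the frame, and Scale applies uniformly to both logo dimensions. Neither assumption is confirmed by the table above, and the helper name, the margin, and the logo URL are hypothetical.

```python
def logo_top_right(logo_url: str, video_w: int, video_h: int,
                   logo_w: int, logo_h: int,
                   scale: float = 1.0, margin: int = 20) -> dict:
    """Build a LogoParam dict that puts the scaled logo in the top-right corner."""
    # Assumption: the rendered logo is the original size multiplied by Scale.
    scaled_w = round(logo_w * scale)
    scaled_h = round(logo_h * scale)
    if scaled_w + margin > video_w or scaled_h + margin > video_h:
        raise ValueError("scaled logo does not fit inside the video frame")
    return {
        "LogoFileUrl": logo_url,
        # Assumption: the origin is the top-left corner of the video frame.
        "PositionX": video_w - scaled_w - margin,
        "PositionY": margin,
        "Scale": scale,
    }


# A 300x200 logo at half size on a 1920x1080 video.
logo = logo_top_right("https://example.invalid/logo.png", 1920, 1080, 300, 200, scale=0.5)
print(logo["PositionX"], logo["PositionY"])
```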

VideoParam.AnchorParam.AnchorExtraParam example

Note:
ColorValue conversion rule: Convert the hexadecimal color value to decimal after removing the # symbol, e.g. #FAFAFA => 16448250.
{
  "shirtColor": {
    "colorValue": 10491928
  },
  "clotheColor": {
    "colorValue": 10491928
  },
  "shoeColor": {
    "colorValue": 10491928
  }
}
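
The ColorValue conversion rule in the note above can be sketched in a few lines of Python. The helper names are hypothetical; the part names (shirtColor and so on) are taken from the example, and the set a given anchor accepts may differ.

```python
import json


def hex_to_color_value(hex_color: str) -> int:
    """Drop the leading '#' and read the remaining six hex digits as an integer."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a color such as #FAFAFA")
    return int(digits, 16)


def build_anchor_extra_param(colors: dict) -> str:
    """Map part names to hex colors and serialize the result as a JSON string."""
    return json.dumps(
        {part: {"colorValue": hex_to_color_value(value)} for part, value in colors.items()}
    )


print(hex_to_color_value("#FAFAFA"))  # 16448250, the value given in the note
extra = build_anchor_extra_param({"shirtColor": "#A01818"})
print(extra)
```

The value 10491928 used in the example corresponds to #A01818 under this rule.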

Response Parameter

| Parameter | Type | Mandatory | Description |
|---|---|---|---|
| TaskId | string | Yes | The video production task ID. Pass the TaskId to the Audio and Video Production Progress Query API to obtain the production progress and results. |

Request Sample

{
  "Header": {},
  "Payload": {
    "VirtualmanKey": "123",
    "InputSsml": "Hello, I am the virtual <phoneme alphabet=\"py\" ph=\"fu4\">anchor</phoneme>",
    "SpeechParam": {
      "Speed": 1
    },
    "VideoParam": {
      "Format": "MP4",
      "BackgroundFileUrl": "url1",
      "VideoHeadFileUrl": "url2",
      "VideoTailFileUrl": "url3",
      "ShowSubtitles": true,
      "LogoParams": [
        {
          "LogoFileUrl": "http://virtualhuman-cos-test-1251316161.cos.ap-nanjing.myqcloud.com/virtualhuman-cos-test-1251316161/1554000251793182720",
          "PositionX": 1561,
          "PositionY": 751,
          "Scale": 1.0
        }
      ],
      "AnchorParam": {
        "HorizontalPosition": 0.5,
        "Scale": 1.0,
        "AnchorExtraParam": "{\"shirtColor\": {\"colorValue\": 10491928},\"clotheColor\": {\"colorValue\": 10491928},\"shoeColor\": {\"colorValue\": 10491928}}"
      }
    }
  }
}
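
Escaping the quotes in InputSsml and in the nested AnchorExtraParam string by hand is error-prone. As one possible approach, the sketch below assembles a trimmed version of the request sample above as a Python dict and lets `json.dumps` handle the escaping. The values are the sample's placeholder values, not real asset keys.

```python
import json

# AnchorExtraParam is a string that itself contains JSON, so it is serialized
# once on its own and then again as part of the request body.
anchor_extra = {
    "shirtColor": {"colorValue": 10491928},
    "clotheColor": {"colorValue": 10491928},
    "shoeColor": {"colorValue": 10491928},
}

body = {
    "Header": {},
    "Payload": {
        "VirtualmanKey": "123",
        "InputSsml": 'Hello, I am the virtual <phoneme alphabet="py" ph="fu4">anchor</phoneme>',
        "SpeechParam": {"Speed": 1},
        "VideoParam": {
            "Format": "MP4",
            "ShowSubtitles": True,
            "AnchorParam": {
                "HorizontalPosition": 0.5,
                "Scale": 1.0,
                "AnchorExtraParam": json.dumps(anchor_extra),
            },
        },
    },
}

request_text = json.dumps(body, ensure_ascii=False, indent=2)
print(request_text)
```

On the receiving side, AnchorExtraParam has to be decoded twice: once as part of the body and once more to recover the color settings.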

Response Sample

{
  "Header": {
    "Code": 0,
    "DialogID": "",
    "Message": "",
    "RequestID": "123"
  },
  "Payload": {
    "TaskID": "123"
  }
}
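
A caller typically needs only the task ID from this response so that it can query the production progress later. The sketch below extracts it, under the assumption that a Code of 0 in the Header means success (as in the sample) and that any other value is an error. The function name is hypothetical. Because the response parameter table spells the field TaskId while the sample uses TaskID, the sketch accepts either spelling.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Return the task ID from a response body, raising if the call failed."""
    body = json.loads(response_text)
    header = body.get("Header") or {}
    # Assumption: Code 0 means success; other values are treated as failures.
    if header.get("Code") != 0:
        raise RuntimeError(
            f"request failed: Code={header.get('Code')} Message={header.get('Message')!r}"
        )
    payload = body.get("Payload") or {}
    # Accept both spellings seen on this page.
    task_id = payload.get("TaskId") or payload.get("TaskID")
    if not task_id:
        raise KeyError("no task ID in response payload")
    return task_id


sample = (
    '{"Header": {"Code": 0, "DialogID": "", "Message": "", "RequestID": "123"},'
    ' "Payload": {"TaskID": "123"}}'
)
print(extract_task_id(sample))
```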
 