Tencent Cloud AI Digital Human

Video Production API - Basic Edition

Last updated: 2024-07-18 18:20:16

API Description

Produces a video from SSML text and a digital human. The finished video and subtitle file are returned through the Audio and Video Production Progress Query API.
Note:
Defining advanced parameters like anchor position is not supported. To use these features, switch to the Video Production API - Advanced Edition.

Calling Protocol

HTTPS + JSON
POST /v2/ivh/videomaker/broadcastservice/videomake
Header Content-Type: application/json;charset=utf-8
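
As a minimal sketch of the calling convention above, the following builds (but does not send) the POST request with Python's standard library. The host `example.invalid` is a placeholder rather than the real service domain, the helper name `build_request` is hypothetical, and any authentication or signing parameters the service may require are not described in this section and are omitted.

```python
import json
import urllib.request

# Path and Content-Type are taken from the protocol description above.
API_PATH = "/v2/ivh/videomaker/broadcastservice/videomake"
CONTENT_TYPE = "application/json;charset=utf-8"


def build_request(base_url: str, body: dict) -> urllib.request.Request:
    """Build the POST request object without sending it."""
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + API_PATH,
        data=data,
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )


# Placeholder host, for illustration only (assumption, not the real endpoint).
req = build_request("https://example.invalid", {"Header": {}, "Payload": {}})
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`, once the real host and credentials are known.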

Request Parameters

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| VirtualmanKey | string | Yes | Enumerated value that defines the broadcasting role, clothing, pose, and resolution. |
| InputSsml | string | Yes | The text to be broadcast; SSML tags are supported. See the Digital Human SSML Markup Language Specification for supported tag types and usage examples. The content must not include line breaks, and symbols must be escaped. The upper limit is 20,000 characters (counted as Unicode characters). Required if DriverType is empty or set to Text. |
| SpeechParam | object | Yes | Detailed audio parameters. |
| SpeechParam.Speed | float | Yes | Speech rate. Range: [0.5, 1.5], where 1.0 is normal speed, 0.5 is the slowest, and 1.5 is the fastest. Has no effect when DriverType is an audio-driven type. |
| SpeechParam.TimbreKey | string | No | Timbre key. Defaults to the avatar's own timbre. |
| SpeechParam.Volume | int | No | Volume level, from 0 to 10. The default is 0, which represents normal volume; higher values are louder. |
| SpeechParam.EmotionCategory | string | No | Emotion of the synthesized audio. Supported only for multi-emotion timbres. See Paginated Query Timbre List in the Personal Asset Management API for available values. |
| SpeechParam.EmotionIntensity | int | No | Intensity of the synthesized audio emotion. Range: [50, 200]. Effective only when EmotionCategory is not empty. |
| VideoParam | object | No | Detailed video synthesis parameters. |
| VideoParam.Format | string | No | Video output format. Default: TransparentWebm. TransparentWebm: WebM video with a transparent background. GreenScreenMp4: MP4 video with a green-screen background. |
| CallbackUrl | string | No | If provided, the video production result is sent to this URL as a POST request in a fixed format; see Appendix II: Callback Request Body Format. Note: (1) CallbackUrl must be shorter than 1000 characters. (2) The callback is sent only once and is not resent, whatever caused the request to fail. |
| DriverType | string | No | Driver type. Default: Text. (1) Text: text-driven; requires InputSsml. (2) OriginalVoice: driven by the original voice audio; requires InputAudioUrl. (3) ModulatedVoice: driven by modulated voice audio; the timbre can be specified with SpeechParam.TimbreKey, otherwise the anchor's default timbre is used. |
| InputAudioUrl | string | No | URL of the audio that drives the digital human. Required when DriverType is OriginalVoice or ModulatedVoice. Audio requirements: (1) Duration: 0.5 seconds to 60 minutes for small-sample avatars; 0.5 seconds to 10 minutes for other avatars. (2) Supported formats: WAV, MP3, WMA, M4A, and AAC. |
| VideoStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished video is uploaded to it. |
| SubtitleStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished subtitles are uploaded to it. |
| ConcurrencyType | string | No | Concurrency type for the video production task. By default, dedicated concurrency is used first, then shared concurrency. (1) Exclusive: dedicated concurrency; task submission fails if no dedicated concurrency is available. (2) Shared: shared concurrency. |
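
The constraints in the parameter list can be checked on the client before a task is submitted. The following is a minimal sketch of such a check; the function name `validate_payload` and the error messages are hypothetical, and the service remains the authority on what it accepts.

```python
def validate_payload(payload: dict) -> list:
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    driver = payload.get("DriverType") or "Text"  # Text is the documented default
    speech = payload.get("SpeechParam") or {}

    if not payload.get("VirtualmanKey"):
        errors.append("VirtualmanKey is required")

    if driver == "Text":
        ssml = payload.get("InputSsml") or ""
        if not ssml:
            errors.append("InputSsml is required when DriverType is Text")
        if len(ssml) > 20000:  # len() counts Unicode code points
            errors.append("InputSsml exceeds 20,000 characters")
        if "\n" in ssml or "\r" in ssml:
            errors.append("InputSsml must not contain line breaks")
    elif driver in ("OriginalVoice", "ModulatedVoice"):
        if not payload.get("InputAudioUrl"):
            errors.append("InputAudioUrl is required for audio-driven types")
    else:
        errors.append("unknown DriverType: " + str(driver))

    speed = speech.get("Speed")
    if speed is None:
        errors.append("SpeechParam.Speed is required")
    elif not 0.5 <= speed <= 1.5:
        errors.append("SpeechParam.Speed must be in [0.5, 1.5]")

    if not 0 <= speech.get("Volume", 0) <= 10:
        errors.append("SpeechParam.Volume must be in [0, 10]")

    intensity = speech.get("EmotionIntensity")
    if intensity is not None and not 50 <= intensity <= 200:
        errors.append("SpeechParam.EmotionIntensity must be in [50, 200]")

    callback = payload.get("CallbackUrl")
    if callback is not None and len(callback) >= 1000:
        errors.append("CallbackUrl must be shorter than 1000 characters")

    return errors


ok = {"VirtualmanKey": "123", "InputSsml": "Hello", "SpeechParam": {"Speed": 1.0}}
bad = {"VirtualmanKey": "123", "DriverType": "OriginalVoice", "SpeechParam": {"Speed": 2.0}}
print(validate_payload(ok))
print(validate_payload(bad))
```

The check covers only rules stated in the list; audio duration and format limits would need the audio file itself and are left out.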

Response Parameters

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| TaskId | string | Yes | The video production task ID. Pass the TaskId to the Audio and Video Production Progress Query API to obtain the production progress and results. |

Request Sample

{
    "Header": {},
    "Payload": {
        "VirtualmanKey": "123",
        "InputSsml": "Hello, I am the virtual <phoneme alphabet=\"py\" ph=\"fu4\">anchor</phoneme>",
        "SpeechParam": {
            "Speed": 1.0
        },
        "VideoParam": {
            "Format": "GreenScreenMp4"
        }
    }
}
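
Hand-escaping the quotes inside InputSsml is error-prone, so one option is to let a JSON serializer do it. The sketch below assembles the same request body in Python. The helper `to_ssml_text` is hypothetical; it rests on the assumption that the "symbols must be escaped" rule includes the XML special characters `&`, `<`, and `>` in plain text, which this page does not spell out.

```python
import json
from xml.sax.saxutils import escape


def to_ssml_text(text: str) -> str:
    """Collapse whitespace runs (including line breaks) into single spaces,
    then escape the XML special characters &, < and >."""
    return escape(" ".join(text.split()))


# Plain text goes through the helper; the SSML tag itself is appended as-is.
ssml = to_ssml_text("Hello,\nI am the virtual") + ' <phoneme alphabet="py" ph="fu4">anchor</phoneme>'

body = {
    "Header": {},
    "Payload": {
        "VirtualmanKey": "123",
        "InputSsml": ssml,
        "SpeechParam": {"Speed": 1.0},
        "VideoParam": {"Format": "GreenScreenMp4"},
    },
}

# json.dumps escapes the double quotes inside InputSsml as \" automatically.
print(json.dumps(body, ensure_ascii=False, indent=4))
```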

Response Sample

{
    "Header": {
        "Code": 0,
        "DialogID": "",
        "Message": "",
        "RequestID": "123"
    },
    "Payload": {
        "TaskId": "123"
    }
}
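
A response of this shape could be handled as in the sketch below, which extracts the TaskId for later use with the Audio and Video Production Progress Query API. The names `VideoMakeError` and `extract_task_id` are hypothetical, and treating `Code` 0 as success is an assumption based only on the sample above; this page does not list the error codes.

```python
import json


class VideoMakeError(Exception):
    """Raised when the response header carries a non-zero Code (hypothetical)."""


def extract_task_id(response_text: str) -> str:
    """Parse the response body and return Payload.TaskId."""
    body = json.loads(response_text)
    header = body.get("Header", {})
    # Assumption: Code 0 means success, as in the response sample.
    if header.get("Code") != 0:
        raise VideoMakeError(
            "code={} message={!r}".format(header.get("Code"), header.get("Message"))
        )
    return body["Payload"]["TaskId"]


sample = """
{
    "Header": {"Code": 0, "DialogID": "", "Message": "", "RequestID": "123"},
    "Payload": {"TaskId": "123"}
}
"""
print(extract_task_id(sample))
```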
 
 