Parameters | Type | Mandatory | Description |
VirtualmanKey | string | Yes | Specifies the broadcasting role, clothing, pose, and resolution. The value is an enumeration. Note: |
InputSsml | string | Yes | The text to broadcast; SSML tags are supported. For the supported tag types and usage examples, see the Digital Human SSML Markup Language Specification. The content must not include line breaks, and symbols must be escaped. The maximum length is 20,000 characters (counted as Unicode characters). This field is required if DriverType is empty or set to Text. |
SpeechParam | object | Yes | Defines the detailed parameters of the audio. |
SpeechParam.Speed | float | Yes | The speech rate. 1.0 is normal speed; the range is [0.5, 1.5], where 0.5 is the slowest and 1.5 is the fastest. Speech rate control has no effect when DriverType is an audio-driven type (OriginalVoice or ModulatedVoice). |
SpeechParam.TimbreKey | string | No | The timbre key. Defaults to the avatar's own timbre. |
SpeechParam.Volume | int | No | The volume level, from 0 to 10. The default is 0, which represents normal volume; higher values are louder. |
SpeechParam.EmotionCategory | string | No | Controls the emotion of the synthesized audio, supported only for multi-emotion timbres. See the Personal Asset Management API Paginated Query Timbre List for available values. |
SpeechParam.EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, with a range of [50,200]. This is only effective when EmotionCategory is not empty. |
VideoParam | object | No | Defines the detailed parameters for video synthesis. |
VideoParam.Format | string | No | The video output format. Defaults to TransparentWebm. 1. TransparentWebm: WebM video with a transparent background. 2. GreenScreenMp4: MP4 video with a green-screen background. |
CallbackUrl | string | No | If a callback URL is provided, the video production result is sent to that URL as a POST request in a fixed format. For the format, see Appendix II: Callback Request Body Format. Note: 1. The CallbackUrl must be shorter than 1000 characters. 2. The request is sent only once; if it fails for any reason, it is not resent. |
DriverType | string | No | The driver type. Defaults to Text. 1. Text: text-driven; the InputSsml field is required. 2. OriginalVoice: driven by the original voice audio; the InputAudioUrl field is required. 3. ModulatedVoice: driven by modulated voice audio; the timbre can be specified with SpeechParam.TimbreKey. If not specified, the anchor's default timbre is used. |
InputAudioUrl | string | No | The URL of the audio that drives the digital human. This field is required when DriverType is OriginalVoice or ModulatedVoice. Audio requirements: 1. Duration: for small sample avatars, between 0.5 seconds and 60 minutes; for non-avatars, between 0.5 seconds and 10 minutes. 2. Supported formats: WAV, MP3, WMA, M4A, and AAC. |
VideoStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished video is uploaded to that URL. |
SubtitleStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished subtitles are uploaded to that URL. |
ConcurrencyType | string | No | The concurrency type used for the video production task. By default, dedicated concurrency is used first, followed by shared concurrency. 1. Exclusive: dedicated concurrency. If no dedicated concurrency is available, the task submission fails. 2. Shared: shared concurrency. |
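The limits in the table above can be checked on the client before a task is submitted. The following is a minimal Python sketch, not part of the API: the helper name `validate_payload` and its error messages are hypothetical, and it covers only the constraints stated in the table. It treats InputSsml as required only when DriverType is empty or Text, following that field's description.

```python
from typing import Any

AUDIO_DRIVER_TYPES = {"OriginalVoice", "ModulatedVoice"}


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return the problems found in a request Payload (empty list if none).

    Hypothetical client-side helper: it checks only the limits stated in
    the parameter table and does not replace server-side validation.
    """
    errors: list[str] = []

    if not payload.get("VirtualmanKey"):
        errors.append("VirtualmanKey is required")

    driver_type = payload.get("DriverType") or "Text"  # Text is the default
    if driver_type == "Text":
        ssml = payload.get("InputSsml") or ""
        if not ssml:
            errors.append("InputSsml is required when DriverType is empty or Text")
        if len(ssml) > 20000:  # len() counts Unicode code points
            errors.append("InputSsml exceeds 20,000 characters")
        if "\n" in ssml or "\r" in ssml:
            errors.append("InputSsml must not contain line breaks")
    elif driver_type in AUDIO_DRIVER_TYPES:
        if not payload.get("InputAudioUrl"):
            errors.append(f"InputAudioUrl is required when DriverType is {driver_type}")
    else:
        errors.append(f"unknown DriverType: {driver_type}")

    speech = payload.get("SpeechParam")
    if not isinstance(speech, dict):
        errors.append("SpeechParam is required")
    else:
        speed = speech.get("Speed")
        if not isinstance(speed, (int, float)) or not 0.5 <= speed <= 1.5:
            errors.append("SpeechParam.Speed must be within [0.5, 1.5]")
        volume = speech.get("Volume", 0)
        if not isinstance(volume, int) or not 0 <= volume <= 10:
            errors.append("SpeechParam.Volume must be within [0, 10]")
        if "EmotionIntensity" in speech:
            intensity = speech["EmotionIntensity"]
            if not isinstance(intensity, int) or not 50 <= intensity <= 200:
                errors.append("SpeechParam.EmotionIntensity must be within [50, 200]")

    callback = payload.get("CallbackUrl")
    if callback is not None and len(callback) >= 1000:
        errors.append("CallbackUrl must be shorter than 1000 characters")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation at once, which is useful because a failed submission may still consume a request.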
Parameters | Type | Mandatory | Description |
TaskId | string | Yes | The video production task ID. Pass the TaskId to the Audio and Video Production Progress Query API to obtain the production progress and results. |
{"Header": {},"Payload": {"VirtualmanKey": "123","InputSsml": "Hello, I am the virtual <phoneme alphabet=\"py\" ph=\"fu4\">anchor</phoneme>","SpeechParam": {"Speed": 1.0},"VideoParam": {"Format": "GreenScreenMp4"}}}
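Because InputSsml travels inside a JSON string, the double quotes in SSML attributes have to be escaped. Letting a JSON library serialize the body avoids escaping by hand. The sketch below uses only the Python standard library; `build_request_body` is a hypothetical helper, and sending the request is left out because the endpoint and authentication are not described in this section.

```python
import json


def build_request_body(virtualman_key: str, ssml: str, speed: float = 1.0,
                       video_format: str = "GreenScreenMp4") -> str:
    """Serialize a request body; json.dumps escapes the quotes inside the SSML."""
    body = {
        "Header": {},
        "Payload": {
            "VirtualmanKey": virtualman_key,
            "InputSsml": ssml,
            "SpeechParam": {"Speed": speed},
            "VideoParam": {"Format": video_format},
        },
    }
    # ensure_ascii=False keeps non-ASCII text readable instead of \uXXXX escapes.
    return json.dumps(body, ensure_ascii=False)


# The SSML is written with plain quotes; escaping happens during serialization.
ssml = 'Hello, I am the virtual <phoneme alphabet="py" ph="fu4">anchor</phoneme>'
print(build_request_body("123", ssml))
```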
{"Header": {"Code": 0,"DialogID": "","Message": "","RequestID": "123"},"Payload": {"TaskId": "123"}}
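A caller typically needs only the TaskId from this response so that it can poll the progress query API. The following Python sketch shows one way to extract it; `extract_task_id` is a hypothetical helper, and treating `Header.Code == 0` as success is an assumption based on the example response rather than a documented rule.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Return Payload.TaskId, raising if the response does not look successful.

    Assumption: Header.Code == 0 means success, as in the example response.
    """
    response = json.loads(response_text)
    header = response.get("Header", {})
    if header.get("Code") != 0:
        raise RuntimeError(
            f"request failed: Code={header.get('Code')!r}, "
            f"Message={header.get('Message')!r}"
        )
    task_id = response.get("Payload", {}).get("TaskId")
    if not task_id:
        raise RuntimeError("response has no Payload.TaskId")
    return task_id


example = (
    '{"Header": {"Code": 0, "DialogID": "", "Message": "", "RequestID": "123"},'
    ' "Payload": {"TaskId": "123"}}'
)
print(extract_task_id(example))  # prints 123
```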