Logo | Adjusting the logo position (customizable X and Y coordinates) |
Logo | Adjusting the logo size (scaling ratio) |
Video background | A background image can be specified for MP4 videos. |
Anchors | Horizontal position adjustment |
Anchors | Anchor size adjustment |
Anchors | Anchor angle adjustment (Note: only 3D avatars are supported) |
Intro/outro video | A remote address can be specified for the intro and outro videos. |
Embedded subtitles | Turning embedded subtitles on or off |
Parameters | Type | Mandatory | Description |
VirtualmanKey | string | Yes | Define the broadcasting role, clothing, pose, resolution, etc. The parameter is an enumerated value. Note: |
InputSsml | string | No | The text to be broadcast. SSML tags are supported; for the supported tag types and usage examples, see the Digital Human SSML Markup Language Specification. The content must not include line breaks, and symbols must be escaped. The upper limit is 20,000 characters (counted as Unicode characters). This field is required when DriverType is empty or set to Text. |
SpeechParam | object | Yes | Define the detailed parameters of the audio. |
SpeechParam.Speed | float | Yes | Speech rate. 1.0 is normal speed; the range is [0.5, 1.5], where 0.5 is the slowest and 1.5 is the fastest. Speech rate control has no effect when DriverType is an audio-driven type. |
SpeechParam.TimbreKey | string | No | Timbre key. By default, the avatar's own timbre is used. |
SpeechParam.Volume | int | No | Volume level, ranging from 0 to 10. The default is 0, which represents normal volume. The higher the value, the louder the volume. |
SpeechParam.EmotionCategory | string | No | Controls the emotion of the synthesized audio. Only multi-emotion timbres support this parameter. For the available values, see the Paginated Query Timbre List API in the Personal Asset Management API. |
SpeechParam.EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, with a range of [50,200]. This is only effective when EmotionCategory is not empty. |
VideoParam | object | No | Define the detailed parameters for video synthesis. |
VideoParam.Format | string | No | Video output format. Default value: MP4. Valid values: TransparentWebm: WebM video with a transparent background, supporting some micro-editing capabilities (anchor parameters are supported); GreenScreenMP4: green-screen MP4 video, not supporting micro-editing capabilities; MP4: MP4 video, supporting micro-editing capabilities. |
VideoParam.BackgroundFileUrl | string | No | Video background image/video download path, supporting jpg, png, and MP4 formats. The image/video resolution must match the video resolution. If not provided, the default is a green-screen video. The file size limit is 500 MB. |
VideoParam.VideoHeadFileUrl | string | No | Intro video which supports MP4 format. The resolution must match the video resolution, with a file size limit of 500 MB. |
VideoParam.VideoTailFileUrl | string | No | Outro video which supports MP4 format. The resolution must match the video resolution, with a file size limit of 500 MB. |
VideoParam.ShowSubtitles | boolean | No | Whether to display subtitles in the video. By default, subtitles are not displayed. Enabling subtitles will significantly increase the video production time. |
VideoParam.SubtitlesParam | object | No | Define parameters for how subtitles are displayed in the video. |
VideoParam.SubtitlesParam.MaxWords | int | No | The upper limit of characters displayed per page of subtitles, with a range from 0 to 999. The default value is 0. Default display rule: subtitles are shown within 80% of the video width; if exceeded, the text is paginated. |
VideoParam.SubtitlesParam.DisplayPunctuation | string | No | Punctuation marks to be displayed in the subtitles. The special character "0" indicates no punctuation will be displayed, while "1" (the default value) indicates all punctuation will be displayed. You can also customize which punctuation marks to display by specifying them. |
VideoParam.SubtitlesParam.SplitPunctuation | string | No | Punctuation marks that require subtitles to paginate, with the default values being: . ; ? ! ... !? |
VideoParam.LogoParams | array | No | Define parameters related to the logo in the video. |
VideoParam.SmartActionEnabled | boolean | No | Whether to enable intelligent actions. Disabled by default. Takes effect only when DriverType is Text and InputSsml does not contain action tags. |
VideoParam.AnchorParam | object | No | Define parameters related to the anchor in the video. |
VideoParam.AnchorParam.HorizontalPosition | float | No | Define the anchor's horizontal position (0 is the center). The effect depends on the anchor's tag. Basic: left and right movement is supported. Standard: left and right movement is supported. Advanced: left and right movement is not supported; any value is treated as 0. Note: |
VideoParam.AnchorParam.VerticalPosition | float | No | Define the anchor's vertical position (0 is the center). The effect depends on the anchor's tag. Basic: upward and downward movement is supported. Standard: only downward movement (>=0) is supported; a value less than 0 is treated as 0. Advanced: only downward movement (>=0) is supported; a value less than 0 is treated as 0. Note: |
VideoParam.AnchorParam.Scale | float | No | Anchor size. 1 is the default size; the range is (0, 1]. Anchors with the Basic tag can have a size greater than 1. Avatars fall under the Basic tag. |
VideoParam.AnchorParam.Angle | int | No | Anchor angle. The default is 0 degrees; the range is [0, 360]. The effect depends on the anchor's tag. Basic: supported only for 3D anchors. Standard: not supported. Advanced: not supported. |
VideoParam.AnchorParam.AnchorExtraParam | string | No | Additional configurable parameters for the anchor. Currently, only the 3D clothing color change parameters are included. The configurable parameters vary by anchor; for details, see the SupportAnchorExtraParam parameter in the Query Image Asset Information - Query all images of the Anchor API. The value must be a JSON string; see the request sample for how to organize the JSON. |
VideoParam.SmallSampleParam | object | No | Define special parameters related to avatars. This parameter is not effective for non-avatars. |
VideoParam.SmallSampleParam.MakeType | string | No | Define the avatar production type. Default: the default configuration. Production starts from a random starting frame; StartIdx and EndIdx do not take effect. Custom: specifies a video segment through StartIdx and EndIdx. By default, the generated video loops back and forth over the specified segment. Circle: the starting and ending frames align. StartIdx can be set to specify the starting frame of the video (EndIdx does not take effect). Note: this type is only used for audio-driven videos. |
VideoParam.SmallSampleParam.StartIdx | int | No | Starting frame number, effective when MakeType is set to Custom or Circle. If it is filled in, the generated video will start from this frame; if it is not filled in, the video will start from the default frame number, which is frame 0. |
VideoParam.SmallSampleParam.EndIdx | int | No | Ending frame number, effective when MakeType is set to Custom. If it is specified, the generated video will end at this frame; if it is not specified, the video will end at the default frame number, which is the end of the selected video segment. |
CallbackUrl | string | No | If a callback URL is provided, the video production result is sent to that URL as a POST request in a fixed format. For the format, see Appendix II: Callback Request Format. Note: 1. The CallbackUrl length must be less than 1000 characters. 2. Only one request is sent. If the request fails for any reason, it is not resent. |
DriverType | string | No | Driver type. The default is Text. 1. Text: text-driven; the InputSsml field is required. 2. OriginalVoice: driven by the original voice audio; the InputAudioUrl field is required. 3. ModulatedVoice: driven by modulated voice audio; the InputAudioUrl field is required, and the timbre can be specified with SpeechParam.TimbreKey. If no timbre is specified, the anchor's default timbre is used. |
InputAudioUrl | string | No | The audio URL to drive the digital human. This field is required when DriverType is OriginalVoice or ModulatedVoice. Audio format requirements: 1. For avatars, the duration should not exceed 60 minutes and not be less than 0.5 seconds. For non-avatars, the duration should not exceed 10 minutes and not be less than 0.5 seconds. 2. Supported formats: WAV, MP3, WMA, M4A, and AAC. |
VideoStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished video is uploaded to that URL. |
SubtitleStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished subtitles are uploaded to that URL. |
ConcurrencyType | string | No | Concurrency type used for the video production task. By default, dedicated concurrency is used first, followed by shared concurrency. 1. Exclusive: dedicated concurrency; if no dedicated concurrency is available, the task submission fails. 2. Shared: shared concurrency. |
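Several of the constraints in the table above depend on one another (for example, which input field is required for each DriverType). The following Python sketch checks a request payload against only the rules stated in the table before it is submitted. The function name `validate_payload` is hypothetical and not part of the API, and the service may enforce further rules that this section does not list.

```python
from typing import Any, Dict, List

# Audio-driven driver types named in the DriverType row.
AUDIO_DRIVER_TYPES = {"OriginalVoice", "ModulatedVoice"}


def validate_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper: it only mirrors rules stated in the parameter table.
    """
    errors: List[str] = []

    if not payload.get("VirtualmanKey"):
        errors.append("VirtualmanKey is required")

    # DriverType defaults to Text when it is empty or missing.
    driver_type = payload.get("DriverType") or "Text"
    if driver_type == "Text":
        ssml = payload.get("InputSsml") or ""
        if not ssml:
            errors.append("InputSsml is required when DriverType is empty or Text")
        if len(ssml) > 20000:  # len() counts Unicode code points in Python 3
            errors.append("InputSsml exceeds 20,000 characters")
        if "\n" in ssml or "\r" in ssml:
            errors.append("InputSsml must not contain line breaks")
    elif driver_type in AUDIO_DRIVER_TYPES:
        if not payload.get("InputAudioUrl"):
            errors.append("InputAudioUrl is required for audio-driven types")
    else:
        errors.append(f"unknown DriverType: {driver_type}")

    speech = payload.get("SpeechParam")
    if not isinstance(speech, dict):
        errors.append("SpeechParam is required")
    else:
        speed = speech.get("Speed")
        if not isinstance(speed, (int, float)) or not 0.5 <= speed <= 1.5:
            errors.append("SpeechParam.Speed must be in [0.5, 1.5]")
        volume = speech.get("Volume")
        if volume is not None and not 0 <= volume <= 10:
            errors.append("SpeechParam.Volume must be in [0, 10]")
        intensity = speech.get("EmotionIntensity")
        if intensity is not None and not 50 <= intensity <= 200:
            errors.append("SpeechParam.EmotionIntensity must be in [50, 200]")

    if len(payload.get("CallbackUrl") or "") >= 1000:
        errors.append("CallbackUrl must be shorter than 1000 characters")

    return errors
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass; an empty list only means that none of the rules checked here was violated.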
Parameters | Type | Mandatory | Description |
LogoFileUrl | string | No | Logo image file download path, supporting jpg and png formats. |
PositionX | int | No | X-coordinate of the logo image's top-left corner (coordinate range depends on the video resolution) |
PositionY | int | No | Y-coordinate of the logo image's top-left corner (coordinate range depends on the video resolution) |
Scale | float | No | Logo image scaling ratio (1.0 represents the original image size.) |
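PositionX and PositionY give the logo's top-left corner, so placing a logo in a particular corner of the frame takes a little arithmetic. The Python sketch below illustrates it under two assumptions that this section does not confirm: the coordinate origin is the top-left of the video frame with Y increasing downward, and the displayed logo size equals the image's pixel size multiplied by Scale. The function `logo_top_left` and its `margin` and `corner` arguments are hypothetical.

```python
from typing import Tuple

CORNERS = {"top-left", "top-right", "bottom-left", "bottom-right"}


def logo_top_left(video_w: int, video_h: int, logo_w: int, logo_h: int,
                  scale: float = 1.0, margin: int = 0,
                  corner: str = "bottom-right") -> Tuple[int, int]:
    """Compute (PositionX, PositionY) for a logo placed in a frame corner.

    Assumes a top-left origin and that Scale multiplies the logo's pixel size.
    """
    if corner not in CORNERS:
        raise ValueError(f"corner must be one of {sorted(CORNERS)}")

    # Size of the logo as displayed, after scaling.
    scaled_w = round(logo_w * scale)
    scaled_h = round(logo_h * scale)
    if scaled_w + margin > video_w or scaled_h + margin > video_h:
        raise ValueError("scaled logo does not fit inside the frame")

    x = margin if "left" in corner else video_w - scaled_w - margin
    y = margin if "top" in corner else video_h - scaled_h - margin
    return x, y


# A 300x200 logo in the bottom-right corner of a 1920x1080 video, 20 px inset.
print(logo_top_left(1920, 1080, 300, 200, scale=1.0, margin=20))  # (1600, 860)
```

The resulting pair can be used as PositionX and PositionY, with the same `scale` passed as Scale.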
{"shirtColor": {"colorValue": 10491928},"clotheColor": {"colorValue": 10491928},"shoeColor": {"colorValue": 10491928}}
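AnchorExtraParam is typed as a string, so an object like the one above has to be serialized to a JSON string before it is placed in the request, and the quotes are then escaped a second time when the whole request is encoded. A minimal Python sketch of that double encoding follows; the function name `encode_anchor_extra_param` is hypothetical, and the key names and values are taken from the sample above.

```python
import json
from typing import Any, Dict


def encode_anchor_extra_param(extra: Dict[str, Any]) -> str:
    """Serialize the extra parameters to the JSON string the field expects."""
    # Compact separators keep the embedded string short; this is optional.
    return json.dumps(extra, separators=(",", ":"))


extra = {
    "shirtColor": {"colorValue": 10491928},
    "clotheColor": {"colorValue": 10491928},
    "shoeColor": {"colorValue": 10491928},
}

anchor_param = {
    "HorizontalPosition": 0.5,
    "Scale": 1.0,
    # A string, not a nested object.
    "AnchorExtraParam": encode_anchor_extra_param(extra),
}

# Encoding the enclosing object escapes the inner quotes as \" automatically.
fragment = json.dumps({"AnchorParam": anchor_param})
print(fragment)
```

Letting the JSON library perform both encoding steps avoids hand-written backslashes, which are easy to get wrong.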
Parameters | Type | Mandatory | Description |
TaskId | string | Yes | The video production task ID. Use this ID to call the Audio and Video Production Progress Query API to obtain the production progress and results. |
{"Header": {},"Payload":{"VirtualmanKey": "123","InputSsml": "Hello, I am the virtual <phoneme alphabet=\"py\" ph=\"fu4\">anchor</phoneme>","SpeechParam": {"Speed": 1},"VideoParam": {"Format": "Mp4","BackgroundFileUrl": "url1","VideoHeadFileUrl": "url2","VideoTailFileUrl": "url3","ShowSubtitles": true,"LogoParams": [{"LogoFileUrl": "http://virtualhuman-cos-test-1251316161.cos.ap-nanjing.myqcloud.com/virtualhuman-cos-test-1251316161/1554000251793182720","PositionX": 1561,"PositionY": 751,"Scale": 1.0}],"AnchorParam": {"HorizontalPosition": 0.5,"Scale": 1.0,"AnchorExtraParam": "{\"shirtColor\": {\"colorValue\": 10491928},\"clotheColor\": {\"colorValue\": 10491928},\"shoeColor\": {\"colorValue\": 10491928}}"}}}}
{"Header": {"Code":0,"DialogID":"","Message":"","RequestID":"123"},"Payload":{"TaskID":"123"}}
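A response shaped like the sample above can be reduced to the task ID needed for the progress query. The Python sketch below assumes that a Header.Code of 0 indicates success, which the sample suggests but this section does not state. It accepts both the TaskID spelling used in the sample and the TaskId spelling used in the parameter table. The names `extract_task_id` and `TaskSubmissionError` are hypothetical.

```python
import json


class TaskSubmissionError(Exception):
    """Raised when the response does not carry a usable task ID."""


def extract_task_id(response_text: str) -> str:
    """Return the task ID from a response body, or raise TaskSubmissionError."""
    data = json.loads(response_text)
    header = data.get("Header") or {}

    # Assumption: Code 0 means the task was accepted.
    code = header.get("Code")
    if code != 0:
        raise TaskSubmissionError(
            f"Code={code!r}, Message={header.get('Message')!r}, "
            f"RequestID={header.get('RequestID')!r}"
        )

    payload = data.get("Payload") or {}
    # The table says TaskId and the sample says TaskID, so try both.
    task_id = payload.get("TaskID") or payload.get("TaskId")
    if not task_id:
        raise TaskSubmissionError("response does not contain a task ID")
    return task_id


sample = ('{"Header": {"Code":0,"DialogID":"","Message":"","RequestID":"123"},'
          '"Payload":{"TaskID":"123"}}')
print(extract_task_id(sample))  # 123
```

Including the RequestID in the error message keeps the identifier at hand when a failed submission has to be reported.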