Tencent Cloud AI Digital Human

Audio Production API

Last updated: 2024-07-18 18:18:42

API Description

This API produces audio from the input text so that it can be previewed. The timbres available for preview can be queried through the Query Supported Timbres for VirtualmanKey API. Some avatars do not support changing the timbre.

Calling Protocol

HTTPS + JSON
POST /v2/ivh/videomaker/broadcastservice/tts
Header Content-Type: application/json;charset=utf-8
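
As a minimal sketch of the calling protocol above, the following Python builds the POST request without sending it. The `Header`/`Payload` envelope is taken from the Request Sample in this document; the function name `build_tts_request` and the host `https://example.invalid` are hypothetical placeholders, and any authentication or signing the service requires is not described here and is therefore omitted.

```python
import json
import urllib.request

TTS_PATH = "/v2/ivh/videomaker/broadcastservice/tts"


def build_tts_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the audio production API."""
    # Envelope shape follows the Request Sample: an empty Header plus the Payload.
    body = json.dumps({"Header": {}, "Payload": payload}, ensure_ascii=False)
    return urllib.request.Request(
        url=base_url.rstrip("/") + TTS_PATH,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json;charset=utf-8"},
        method="POST",
    )


# Hypothetical host: replace it with the real service endpoint.
request = build_tts_request(
    "https://example.invalid",
    {"VirtualmanKey": "123", "InputSsml": "Hello, virtual anchor", "Speed": 1.0},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out because it needs a real endpoint and credentials.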

Request Parameters

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| TimbreKey | string | No | Timbre key. When VirtualmanKey is empty, TimbreKey cannot be empty. |
| VirtualmanKey | string | No | Defines the role, clothing, pose, resolution, and similar attributes for the broadcast. The parameter is an enumerated value. When TimbreKey is empty, VirtualmanKey cannot be empty. By default, the first matching timbre for the avatar is selected to produce the audio. |
| InputSsml | string | Yes | Text content to be broadcast. SSML tags are supported. The upper limit is 20,000 characters (counted as Unicode characters). |
| Speed | float | Yes | Speech rate. 1.0 is normal speed. The range is [0.5, 1.5], where 0.5 is the slowest and 1.5 is the fastest. |
| AudioStorageS3Url | string | No | An authenticated S3-protocol storage URL. If provided, the finished audio is uploaded to that URL. |
| SampleRate | int | No | Sample rate. Supported values are 24000 (24k) and 16000 (16k). The default is 24000 (24k). |
| Codec | string | No | Audio format. Supported values are mp3 and wav. The default is mp3. |
| SentenceMaxWords | int | No | Upper limit on the number of characters per sentence, ranging from 0 to 999. If 0 or nothing is provided, the default value is 30. |
| SentenceDisplayPunctuation | string | No | Punctuation marks to be displayed within sentences. The special value "0" means that no punctuation marks are displayed, while "1" (the default) means that all punctuation marks are displayed. You can also customize which punctuation marks to display. |
| SentenceSplitPunctuation | string | No | Punctuation marks used for sentence segmentation. The default values are . ; ?! ... !? |
| Volume | int | No | Volume level, ranging from 0 to 10. The default is 0, which represents normal volume. The higher the value, the louder the volume. |
| EmotionCategory | string | No | Controls the emotion of the synthesized audio. Only multi-emotion timbres support this parameter. See the Personal Asset Management API 4.5 Timbre List API for optional values. |
| EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, with a range of [50, 200]. This is only effective when EmotionCategory is not empty. |
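
The constraints listed in the request parameters can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not part of the API: the function name `validate_tts_payload` and its error messages are hypothetical, and the server's own validation may differ. It assumes that Python's `len()` on a string (which counts Unicode code points) matches how the service counts the 20,000-character limit.

```python
def validate_tts_payload(payload: dict) -> list:
    """Return a list of problems found in the payload; an empty list means none."""
    errors = []

    # At least one of TimbreKey / VirtualmanKey must be non-empty.
    if not payload.get("TimbreKey") and not payload.get("VirtualmanKey"):
        errors.append("TimbreKey and VirtualmanKey cannot both be empty")

    # InputSsml is mandatory, with an upper limit of 20,000 characters.
    text = payload.get("InputSsml")
    if not isinstance(text, str) or not text:
        errors.append("InputSsml is mandatory")
    elif len(text) > 20000:
        errors.append("InputSsml exceeds 20,000 characters")

    # Speed is mandatory and must lie in [0.5, 1.5].
    speed = payload.get("Speed")
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        errors.append("Speed is mandatory and must be a number")
    elif not 0.5 <= speed <= 1.5:
        errors.append("Speed must be within [0.5, 1.5]")

    # Optional enumerated parameters.
    if "SampleRate" in payload and payload["SampleRate"] not in (16000, 24000):
        errors.append("SampleRate must be 16000 or 24000")
    if "Codec" in payload and payload["Codec"] not in ("mp3", "wav"):
        errors.append("Codec must be mp3 or wav")

    # Optional integer parameters with inclusive ranges.
    ranges = {
        "SentenceMaxWords": (0, 999),
        "Volume": (0, 10),
        "EmotionIntensity": (50, 200),
    }
    for name, (low, high) in ranges.items():
        if name in payload:
            value = payload[name]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                errors.append(f"{name} must be an integer within [{low}, {high}]")

    return errors


good = {"VirtualmanKey": "123", "InputSsml": "Hello, virtual anchor", "Speed": 1}
bad = {"InputSsml": "x", "Speed": 2.0, "Codec": "ogg", "Volume": 11}
print(validate_tts_payload(good))
print(len(validate_tts_payload(bad)))
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.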

Response Parameter

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| TaskId | string | Yes | The task ID for audio production. Pass this TaskId to the Audio and Video Production Progress Query API to obtain the production progress and the download link. |

Request Sample

{
  "Header": {},
  "Payload": {
    "VirtualmanKey": "123",
    "InputSsml": "Hello, virtual anchor",
    "Speed": 1
  }
}

Response Sample

{
  "Header": {
    "Code": 0,
    "DialogID": "",
    "Message": "",
    "RequestID": "123"
  },
  "Payload": {
    "TaskId": "123"
  }
}
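
A response of the shape shown in the sample can be unpacked as follows. This is a sketch under one stated assumption: `Code` 0 in the `Header` indicates success (as in the sample), and any other value is treated as a failure. The function name `extract_task_id` is hypothetical.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Parse the JSON response body and return Payload.TaskId."""
    data = json.loads(response_text)
    header = data.get("Header") or {}

    # Assumption: Code 0 means success; anything else is reported as an error.
    if header.get("Code") != 0:
        raise RuntimeError(
            f"request {header.get('RequestID')!r} failed: "
            f"code={header.get('Code')!r}, message={header.get('Message')!r}"
        )

    task_id = (data.get("Payload") or {}).get("TaskId")
    if not task_id:
        raise ValueError("response does not contain Payload.TaskId")
    return task_id


sample = (
    '{"Header": {"Code": 0, "DialogID": "", "Message": "", "RequestID": "123"},'
    ' "Payload": {"TaskId": "123"}}'
)
print(extract_task_id(sample))
```

The returned TaskId is the value to pass to the Audio and Video Production Progress Query API.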
 
 
 