Audio Production API

Last updated: 2024-07-18 18:18:42

    API Description

    Generates audio from the input text for preview. The timbres available for preview can be queried through the Query Supported Timbres for VirtualmanKey API. Some avatars do not support changing the timbre.

    Calling Protocol

    HTTPS + JSON
    POST     /v2/ivh/videomaker/broadcastservice/tts
    Header Content-Type: application/json;charset=utf-8

    Request Parameters

    | Parameter | Type | Mandatory | Description |
    | --- | --- | --- | --- |
    | TimbreKey | string | No | Timbre key. Cannot be empty when VirtualmanKey is empty. |
    | VirtualmanKey | string | No | Defines information such as the role, clothing, pose, and resolution for the broadcast. The parameter is an enumerated value. Cannot be empty when TimbreKey is empty. By default, the first matching timbre for the avatar is selected to produce the audio. |
    | InputSsml | string | Yes | Text content to be broadcast. Supports SSML tags. Upper limit: 20,000 characters (counted as Unicode characters). |
    | Speed | float | Yes | Speech rate, in the range [0.5, 1.5]. 1.0 is normal speed, 0.5 is the slowest, and 1.5 is the fastest. |
    | AudioStorageS3Url | string | No | URL of an authenticated S3-protocol storage location. If provided, the finished audio is uploaded to that URL. |
    | SampleRate | int | No | Sample rate. Supports 24000 (24k) and 16000 (16k). Default: 24000 (24k). |
    | Codec | string | No | Audio format. Supports mp3 and wav. Default: mp3. |
    | SentenceMaxWords | int | No | Upper limit on the number of characters per sentence, from 0 to 999. If 0 or nothing is provided, the default value of 30 is used. |
    | SentenceDisplayPunctuation | string | No | Punctuation marks to be displayed within sentences. The special value "0" displays no punctuation marks, and "1" (default) displays all punctuation marks. A custom set of punctuation marks to display can also be specified. |
    | SentenceSplitPunctuation | string | No | Punctuation marks used for sentence segmentation. Default: . ; ?! ... !? |
    | Volume | int | No | Volume level, from 0 to 10. The default is 0, which represents normal volume. Higher values are louder. |
    | EmotionCategory | string | No | Controls the emotion of the synthesized audio. Only multi-emotion timbres support this parameter. See the Personal Asset Management API, 4.5 Timbre List API, for the optional values. |
    | EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, in the range [50, 200]. Takes effect only when EmotionCategory is not empty. |
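
The constraints listed for the request parameters can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_payload` and the error messages are hypothetical, and it assumes that Python's `len()` (which counts Unicode code points) matches how the service counts characters.

```python
from typing import Any

ALLOWED_SAMPLE_RATES = {16000, 24000}
ALLOWED_CODECS = {"mp3", "wav"}
# Inclusive integer ranges taken from the parameter descriptions.
INT_RANGES = {
    "SentenceMaxWords": (0, 999),
    "Volume": (0, 10),
    "EmotionIntensity": (50, 200),
}


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return the constraint violations found; an empty list means none."""
    errors: list[str] = []

    # TimbreKey and VirtualmanKey are each optional, but not both empty.
    if not payload.get("TimbreKey") and not payload.get("VirtualmanKey"):
        errors.append("TimbreKey and VirtualmanKey cannot both be empty")

    ssml = payload.get("InputSsml")
    if not isinstance(ssml, str) or not ssml:
        errors.append("InputSsml is mandatory")
    elif len(ssml) > 20000:  # assumption: code points are what the service counts
        errors.append("InputSsml exceeds 20,000 characters")

    speed = payload.get("Speed")
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        errors.append("Speed is mandatory and must be a number")
    elif not 0.5 <= speed <= 1.5:
        errors.append("Speed must be within [0.5, 1.5]")

    if "SampleRate" in payload and payload["SampleRate"] not in ALLOWED_SAMPLE_RATES:
        errors.append("SampleRate must be 16000 or 24000")

    if "Codec" in payload and payload["Codec"] not in ALLOWED_CODECS:
        errors.append("Codec must be mp3 or wav")

    for name, (low, high) in INT_RANGES.items():
        if name in payload:
            value = payload[name]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                errors.append(f"{name} must be an integer within [{low}, {high}]")

    return errors
```

The sketch does not flag an EmotionIntensity sent without EmotionCategory, because the description says only that the value has no effect in that case.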

    Response Parameters

    | Parameter | Type | Mandatory | Description |
    | --- | --- | --- | --- |
    | TaskId | string | Yes | The task ID for audio production. Pass the TaskId to the Audio and Video Production Progress Query API to obtain the production progress and the download link. |

    Request Sample

    {
      "Header": {},
      "Payload": {
        "VirtualmanKey": "123",
        "InputSsml": "Hello, virtual anchor",
        "Speed": 1
      }
    }
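
As an illustration of the calling protocol, the request sample above could be assembled with Python's standard library as sketched below. The host in `BASE_URL` is a placeholder, since this page does not state the service host, and `build_tts_request` is a hypothetical helper name. Authentication is not described in this section and is not handled here. The sketch only builds the request object and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder host, not a real endpoint
TTS_PATH = "/v2/ivh/videomaker/broadcastservice/tts"


def build_tts_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Wrap the payload in the Header/Payload envelope and build a POST request."""
    envelope = {"Header": {}, "Payload": payload}
    # ensure_ascii=False keeps non-ASCII text readable; the body is sent as UTF-8.
    body = json.dumps(envelope, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=base_url + TTS_PATH,
        data=body,
        headers={"Content-Type": "application/json;charset=utf-8"},
        method="POST",
    )


request = build_tts_request(
    {"VirtualmanKey": "123", "InputSsml": "Hello, virtual anchor", "Speed": 1}
)
```

With a real host and whatever authentication the service requires, the request could then be sent with `urllib.request.urlopen(request)`.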

    Response Sample

    {
      "Header": {
        "Code": 0,
        "DialogID": "",
        "Message": "",
        "RequestID": "123"
      },
      "Payload": {
        "TaskId": "123"
      }
    }
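
A caller typically needs only the TaskId from this response. The sketch below shows one way to extract it. It assumes that a Header Code of 0 indicates success, as in the sample; this page does not list other codes. The name `extract_task_id` is hypothetical.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Parse a response body and return Payload.TaskId."""
    response = json.loads(response_text)
    header = response.get("Header", {})

    # Assumption: Code 0 means success, as in the response sample.
    if header.get("Code") != 0:
        raise RuntimeError(
            f"request failed: Code={header.get('Code')} Message={header.get('Message')!r}"
        )

    task_id = response.get("Payload", {}).get("TaskId")
    if not task_id:
        raise ValueError("response has no Payload.TaskId")
    return task_id


sample = (
    '{"Header": {"Code": 0, "DialogID": "", "Message": "", "RequestID": "123"},'
    ' "Payload": {"TaskId": "123"}}'
)
task_id = extract_task_id(sample)
```

The returned TaskId is the value to pass to the Audio and Video Production Progress Query API.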
     
     
     