tencent cloud

TextToVoice

Last updated: 2024-12-20 15:45:11

    1. API Description

    Domain name for API request: tts.intl.tencentcloudapi.com.

    This API is used to convert any text to speech, allowing your devices and applications to talk to users.
    Tencent Cloud Text To Speech (TTS) can synthesize speech from text in real time for many use cases, such as audiobook and news apps, voice reminders on smart devices, quick synthesis of a celebrity's voice based on existing programs or certain voice records available on the internet, and personalized vehicle navigation systems.
    It is free for use in beta.
    It supports SSML. For syntax details, see SSML.
    Default API request rate limit: 20 requests/sec.
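
The default limit above is 20 requests/sec. One way a client might stay under it is to space calls evenly. The sketch below is only an illustration: `MinIntervalLimiter` is a hypothetical helper, not part of any Tencent Cloud SDK, and even spacing is one of several possible throttling strategies.

```python
import time


class MinIntervalLimiter:
    """Spaces calls at least 1/rate seconds apart (simple client-side throttle)."""

    def __init__(self, rate_per_sec=20.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's start time.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `limiter.wait()` before each request keeps a single-threaded client at or below the configured rate. A multi-threaded client would also need a lock around `wait()`.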

    We recommend using API Explorer.
    API Explorer provides a range of capabilities, including online call, signature authentication, SDK code generation, and API quick search. It enables you to view the request, response, and auto-generated examples.

    2. Input Parameters

    The following request parameter list only provides API request parameters and some common parameters. For the complete common parameter list, see Common Request Parameters.

    Parameter Name Required Type Description
    Action Yes String Common Params. The value used for this API: TextToVoice.
    Version Yes String Common Params. The value used for this API: 2019-08-23.
    Region Yes String Common Params. For more information, please see the list of regions supported by the product.
    Text Yes String The source text for synthesizing speech, which is encoded in UTF-8.
    It can contain up to 150 Chinese characters (a full-width punctuation mark counts as one Chinese character) or 500 letters (a half-width punctuation mark counts as one letter).
    SessionId Yes String The SessionId of a request, which will be returned as-is. We recommend that you pass a unique value such as a UUID to avoid duplicates.
    Volume No Float Volume range: [0, 10], corresponding to 11 volume levels. 0 is the default value, indicating the normal volume. There is no mute option.
    Speed No Float Speed range: [-2, 6], corresponding to the following speed multipliers:
  • -2 for 0.6 times
  • -1 for 0.8 times
  • 0 for 1.0 times (default)
  • 1 for 1.2 times
  • 2 for 1.5 times
  • 6 for 2.5 times
  • To set finer-grained speed levels, keep one decimal place, such as 0.5, 1.1, and 1.8.
    ProjectId No Integer Project ID, which defaults to 0 and can be customized.
    ModelType No Integer Model type, with 1 for the default model.
    VoiceType No Integer Standard voices:
  • 10510000-zhixiaoyao (Chinese)
  • 1001-zhiyu (Chinese)
  • 1002-zhiling (Chinese)
  • 1003-zhimei (Chinese)
  • 1004-zhiyun (Chinese)
  • 1005-zhili (Chinese)
  • 1007-zhina (Chinese)
  • 1008-zhiqi (Chinese)
  • 1009-zhiyun (Chinese)
  • 1010-zhihua (Chinese)
  • 1017-zhirong (Chinese)
  • 1018-zhijing (Chinese)
  • 1050-WeJack (English)
  • 1051-WeRose (English)
    Premium voices:
    Premium voices have higher fidelity and sound more natural than standard voices. For pricing details, see Purchase Guide.
  • 100510000-zhixiaoyao (Chinese)
  • 101001-zhiyu (Chinese)
  • 101002-zhiling (Chinese)
  • 101003-zhimei (Chinese)
  • 101004-zhiyun (Chinese)
  • 101005-zhili (Chinese)
  • 101006-zhiyan (Chinese)
  • 101007-zhina (Chinese)
  • 101008-zhiqi (Chinese)
  • 101009-zhiyun (Chinese)
  • 101010-zhihua (Chinese)
  • 101011-zhiyan (Chinese)
  • 101012-zhidan (Chinese)
  • 101013-zhihui (Chinese)
  • 101014-zhining (Chinese)
  • 101015-zhimeng (Chinese)
  • 101016-zhitian (Chinese)
  • 101017-zhirong (Chinese)
  • 101018-zhijing (Chinese)
  • 101019-zhitong (Cantonese)
  • 101020-zhigang (Chinese)
  • 101021-zhirui (Chinese)
  • 101022-zhihong (Chinese)
  • 101023-zhixuan (Chinese)
  • 101024-zhihao (Chinese)
  • 101025-zhiwei (Chinese)
  • 101026-zhixi (Chinese)
  • 101027-zhimei (Chinese)
  • 101028-zhijie (Chinese)
  • 101029-zhikai (Chinese)
  • 101030-zhike (Chinese)
  • 101031-zhikui (Chinese)
  • 101032-zhifang (Chinese)
  • 101033-zhibei (Chinese)
  • 101034-zhilian (Chinese)
  • 101035-zhiyi (Chinese)
  • 101040-zhichuan (Sichuan dialect)
  • 101050-WeJack (English)
  • 101051-WeRose (English)
  • 101052-zhiwei (Chinese)
  • 101053-zhifang (Chinese)
  • 101054-zhiyou (Chinese)
  • 101055-zhiyou (Chinese)
  • 101056-zhilin (Northeastern Mandarin)
    PrimaryLanguage No Integer Primary language type:
  • 1 - Chinese (default)
  • 2 - English
    SampleRate No Integer Audio sample rate:
  • 16000: 16k (default)
  • 8000: 8k
    Codec No String Format of returned audio. Valid values: WAV (default), MP3, and PCM.
    EnableSubtitle No Boolean Whether to enable the timestamp feature. Default value: false.
    SegmentRate No Integer The sensitivity threshold for speech segmentation, which can be 0 (default), 1, or 2. A larger value produces fewer segments, and the model tends to segment sentences only at punctuation marks. We recommend that you leave this parameter unchanged to avoid degrading synthesis quality.
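
The ranges in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_payload` is a hypothetical helper, the default `voice_type` of 1001 is simply the value used in this document's example request rather than a documented default, and the text-length check is only a rough approximation because the table does not say how mixed Chinese and Latin text is counted.

```python
import uuid

# Ranges and allowed values copied from the parameter table above.
VOLUME_RANGE = (0, 10)
SPEED_RANGE = (-2, 6)
SAMPLE_RATES = {16000, 8000}
PRIMARY_LANGUAGES = {1, 2}


def build_payload(text, voice_type=1001, volume=0, speed=0,
                  sample_rate=16000, primary_language=1, session_id=None):
    """Return a TextToVoice request body as a dict, or raise ValueError."""
    if not text:
        raise ValueError("Text is required")
    # Rough pre-check (assumption): non-ASCII counts as Chinese/full-width,
    # ASCII counts as letters/half-width. The service has the final say.
    wide = sum(1 for ch in text if ord(ch) > 127)
    narrow = len(text) - wide
    if wide > 150 or narrow > 500:
        raise ValueError("Text is likely too long")
    if not VOLUME_RANGE[0] <= volume <= VOLUME_RANGE[1]:
        raise ValueError("Volume must be in [0, 10]")
    if not SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]:
        raise ValueError("Speed must be in [-2, 6]")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError("SampleRate must be 16000 or 8000")
    if primary_language not in PRIMARY_LANGUAGES:
        raise ValueError("PrimaryLanguage must be 1 or 2")
    return {
        "Text": text,
        # A UUID keeps SessionId values from repeating, as recommended above.
        "SessionId": session_id or str(uuid.uuid4()),
        "VoiceType": voice_type,
        "Volume": volume,
        "Speed": round(speed, 1),  # the table suggests one decimal place
        "SampleRate": sample_rate,
        "PrimaryLanguage": primary_language,
    }
```

Failing early on the client avoids spending a request on values the service would likely reject with one of the `InvalidParameterValue` error codes.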

    3. Output Parameters

    Parameter Name Type Description
    Audio String Base64-encoded WAV/MP3 audio data
    SessionId String The SessionId of a request
    Subtitles Array of Subtitle Timestamp information. If the timestamp feature is not enabled, an empty array will be returned.
    RequestId String The unique request ID, generated by the server and returned for every request. A request that fails to reach the server does not obtain a RequestId. Provide the RequestId when troubleshooting a problem.
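
Because `Audio` arrives as a Base64 string, a caller has to decode it before saving or playing it. The sketch below assumes the JSON body is wrapped in a top-level `Response` object, as in this document's output example. `extract_audio` is a hypothetical helper name.

```python
import base64
import json


def extract_audio(response_text):
    """Parse a TextToVoice JSON response and return (audio_bytes, session_id)."""
    body = json.loads(response_text)["Response"]
    audio = base64.b64decode(body["Audio"])  # raw bytes in the requested Codec
    return audio, body["SessionId"]
```

The returned bytes can then be written to disk in binary mode, for example `open("out.wav", "wb").write(audio)` when the request asked for WAV output.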

    4. Example

    Example 1: Basic speech synthesis call

    Input Example

    POST / HTTP/1.1
    Host: tts.intl.tencentcloudapi.com
    Content-Type: application/json
    X-TC-Action: TextToVoice
    <common request parameters>
    
    {
        "Text": "你好",
        "SessionId": "session-1234",
        "Volume": 1,
        "Speed": 1,
        "ProjectId": 0,
        "ModelType": 1,
        "VoiceType": 1001,
        "PrimaryLanguage": 1,
        "SampleRate": 16000,
        "Codec": "wav",
        "EnableSubtitle": true
    }
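
The request above could be assembled in Python roughly as follows. This is a sketch, not a complete client: the `https` scheme is an assumption, `prepare_request` is a hypothetical helper, and the signing headers that stand behind `<common request parameters>` are not covered in this document, so they are passed in by the caller. The request is built but not sent.

```python
import json
import urllib.request

ENDPOINT = "https://tts.intl.tencentcloudapi.com/"  # https scheme is an assumption


def prepare_request(payload, auth_headers):
    """Build (but do not send) the POST request for a TextToVoice payload."""
    # ensure_ascii=False keeps Chinese text readable; the body is sent as UTF-8.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-TC-Action": "TextToVoice",
    }
    headers.update(auth_headers)  # signing headers from the common parameters
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


request = prepare_request({"Text": "你好", "SessionId": "session-1234"}, auth_headers={})
print(request.get_method(), request.full_url)
```

Sending this request without valid authentication headers would be rejected, so a real call needs the signature described in Common Request Parameters or one of the SDKs.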
    

    Output Example

    {
        "Response": {
            "Audio": "UklGRqRwAABXQVZFZm10IBAAAAABAAEAgD4AAAB9AAACABAAZGF0YYBwAAAA......AAAAA=",
            "RequestId": "d91f1496-0514-4281-932e-15a022b67d16",
            "SessionId": "session-1234",
            "Subtitles": [
                {
                    "BeginIndex": 0,
                    "BeginTime": 250,
                    "EndIndex": 1,
                    "EndTime": 430,
                    "Phoneme": "ni2",
                    "Text": "你"
                },
                {
                    "BeginIndex": 1,
                    "BeginTime": 430,
                    "EndIndex": 2,
                    "EndTime": 670,
                    "Phoneme": "hao3",
                    "Text": "好"
                }
            ]
        }
    }
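
When `EnableSubtitle` is true, each `Subtitles` entry carries a begin and end time, so per-character durations follow by subtraction. The sketch below uses the values from the output example. The time unit is not stated in this document (the values look like milliseconds, but that is an assumption), and `subtitle_durations` is a hypothetical helper.

```python
def subtitle_durations(subtitles):
    """Return (text, phoneme, duration) tuples, one per subtitle entry."""
    return [
        (s["Text"], s["Phoneme"], s["EndTime"] - s["BeginTime"])
        for s in subtitles
    ]


# Entries copied from the output example above.
subtitles = [
    {"BeginIndex": 0, "BeginTime": 250, "EndIndex": 1, "EndTime": 430,
     "Phoneme": "ni2", "Text": "你"},
    {"BeginIndex": 1, "BeginTime": 430, "EndIndex": 2, "EndTime": 670,
     "Phoneme": "hao3", "Text": "好"},
]

for text, phoneme, duration in subtitle_durations(subtitles):
    print(text, phoneme, duration)
# prints:
# 你 ni2 180
# 好 hao3 240
```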
    

    5. Developer Resources

    SDK

    TencentCloud API 3.0 integrates SDKs that support various programming languages to make it easier for you to call APIs.

    Command Line Interface

    6. Error Code

    The following only lists the error codes related to the API business logic. For other error codes, see Common Error Codes.

    Error Code Description
    AuthFailure.InvalidAuthorization Invalid authorization.
    InternalError.ErrorGetRoute Invalid route.
    InternalError.ExceedMaxLimit Traffic is throttled due to high load.
    InternalError.InternalError Internal error.
    InternalError.NoResource
    InvalidParameter.InvalidText The request text contains invalid characters.
    InvalidParameterValue.AppId Invalid AppId. See the description of AppId.
    InvalidParameterValue.AppIdNotRegistered The APPID is not registered. Activate the service in the TTS console (https://console.tencentcloud.com/tts) first.
    InvalidParameterValue.Codec Invalid Codec. See the description of Codec.
    InvalidParameterValue.ErrorCardinalFormat The number part of the say-as tag of SSML is not a valid constant, which can only contain digits, ",", ".", and " " when the tag attribute is cardinal, currency, or address.
    InvalidParameterValue.InvalidText The request text contains invalid characters, or it contains no valid characters.
    InvalidParameterValue.MissParameters Parameter missing.
    InvalidParameterValue.ParticipleError Error in text segmentation.
    InvalidParameterValue.PrimaryLanguage Invalid PrimaryLanguage. See the description of PrimaryLanguage.
    InvalidParameterValue.SSMLInvalid Invalid SSML tag.
    InvalidParameterValue.SampleRate Invalid SampleRate. See the description of SampleRate.
    InvalidParameterValue.SessionId Invalid SessionId. See the description of SessionId.
    InvalidParameterValue.Speed Invalid Speed. See the description of Speed.
    InvalidParameterValue.Text Text missing.
    InvalidParameterValue.TextEmpty Empty text.
    InvalidParameterValue.TextNotUtf8 The text is not encoded in UTF-8.
    InvalidParameterValue.Type Invalid Type.
    InvalidParameterValue.VoiceType Invalid VoiceType. See the description of VoiceType.
    InvalidParameterValue.Volume Invalid Volume. See the description of Volume.
    LimitExceeded.AccessLimit The request frequency exceeds the limit.
    UnsupportedOperation Unsupported operation.
    UnsupportedOperation.AccountArrears Overdue payment exists.
    UnsupportedOperation.AuthorizationExpired Authentication expired.
    UnsupportedOperation.AuthorizationFailed Authentication failed.
    UnsupportedOperation.ForbiddenUse Service prohibited.
    UnsupportedOperation.NoFreeAccount Free tier is used up.
    UnsupportedOperation.PkgExhausted The resource package is used up.
    UnsupportedOperation.ServerAlreadyOpen Server opened.
    UnsupportedOperation.ServerDestoryed The service is already terminated.
    UnsupportedOperation.ServerNotOpen Service inactivated.
    UnsupportedOperation.ServerStopped Service stopped.
    UnsupportedOperation.TextTooLong The text is too long. See the description of the request parameter Text.
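
A client may want to tell throttling errors apart from errors that will not succeed on retry. The sketch below rests on two assumptions that this document does not confirm: that a failed call returns an `Error` object with a `Code` field inside `Response`, and that the two throttling-related codes from the table are safe to retry after a pause. `classify_error` is a hypothetical helper.

```python
import json

# Codes from the table above that signal throttling; treating them as
# retryable is this sketch's assumption, not documented behavior.
RETRYABLE_CODES = {"LimitExceeded.AccessLimit", "InternalError.ExceedMaxLimit"}


def classify_error(response_text):
    """Return None on success, else (code, retryable) for an error response."""
    body = json.loads(response_text)["Response"]
    error = body.get("Error")  # assumed error envelope; not shown in this document
    if error is None:
        return None
    code = error.get("Code", "")
    return code, code in RETRYABLE_CODES
```

Codes such as `InvalidParameterValue.Speed` or `UnsupportedOperation.TextTooLong` point at the request itself, so repeating the same request unchanged is unlikely to help.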