Creating Automatic Speech Recognition Template

Recent Pages

Creating Automatic Speech Recognition Template

Terakhir diperbarui:2024-06-12 15:47:04

Feature Description
This API is used to create a speech recognition template.
﻿
Request
Sample request
POST /template HTTP/1.1
Host: <BucketName-APPID>.ci.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Content-Length: <length>
Content-Type: application/xml
﻿
<body>
Note: 
Authorization: Auth String (for more information, see Request Signature).
When this feature is used by a sub-account, relevant permissions must be granted.
Request headers
This API only uses common request headers. For more information, see Common Request Headers.
Request body
This request requires the following request body:
<Request>
    <Tag>SpeechRecognition</Tag>
    <Name>TemplateName</Name>
    <SpeechRecognition>
        <EngineModelType>16k_zh</EngineModelType>
        <ResTextFormat>1</ResTextFormat>
        <FilterDirty>0</FilterDirty>
        <FilterModal>1</FilterModal>
        <ConvertNumMode>0</ConvertNumMode>
        <SpeakerDiarization>1</SpeakerDiarization>
        <SpeakerNumber>0</SpeakerNumber>
        <FilterPunc>0</FilterPunc>
        <OutputFileType>txt</OutputFileType>
    </SpeechRecognition>
</Request>
The nodes are described as follows:
Node Name (Keyword)
Parent Node
Description
Type
Required
Request
None
Request container.
Container
Yes
﻿Request
 has the following sub-nodes:
Node Name (Keyword)
Parent Node
Description
Type
Required
Constraints
Tag
Request
Template tag: SpeechRecognition.
String
Yes
No
Name
Request
Template name, which can contain letters, digits, underscores (_), hyphens (-), and asterisks (*).
String
Yes
None
SpeechRecognition
Request
Speech recognition parameter.
Container
Yes
None
﻿SpeechRecognition
 has the following sub-nodes:
Node Name (Keyword)
Parent Node
Description
Type
Required
EngineModelType
Request.Speech
Recognition
Engine model type, divided into phone call and non-phone call scenarios. 
Phone call scenarios: 
8k_zh: 8 kHz, for Mandarin in general scenarios (available for dual-channel audio). 
8k_zh_s: 8 kHz, for Mandarin with speaker separation (available for mono-channel audio only). 
8k_en: 8 kHz, for English. 
Non-phone call scenarios: 
16k_zh: 16 kHz, for Mandarin in general scenarios. 
16k_zh_video: 16 kHz, for audio/video scenarios. 
16k_en: 16 kHz, for English.
16k_ca: 16 kHz, for Cantonese. 
16k_ja: 16 kHz, for Japanese. 
16k_zh_edu: For Mandarin in education scenarios. 
16k_en_edu: For English in education scenarios. 
16k_zh_medical: For healthcare scenarios. 
16k_th: For Thai. 
16k_zh_dialect: Multi-dialect, for up to 23 dialects.
String
Yes
ChannelNum
Request.Speech
Recognition
Number of sound channels:
1: Mono. If EngineModelType is not the phone call scenario, only mono channel is supported. 
2: Dual (for the 8k_zh engine only, where the two channels correspond to the caller and callee respectively). 
Integer
Yes
ResTextFormat
Request.Speech
Recognition
Format of the returned recognition result. 
0: Recognition result text, including the list of segment timestamps. 
1: Detailed word-level recognition result, excluding punctuation marks but including the speech speed value (the list of word timestamps, generally used to generate subtitles). 
2: Detailed word-level recognition result, including punctuation marks and the speech speed value. 
Integer
Yes
FilterDirty
Request.Speech
Recognition
Whether to filter restricted words (for the Mandarin engine only). 
0: Does not filter. 
1: Filters. 
2: Replaces restricted words with *.
Default value: 0. 
Integer
No
FilterModal
Request.Speech
Recognition
Whether to filter modal particles (for the Mandarin engine only). 
0: Does not filter. 
1: Filters partially.
 2: Filters strictly. 
Default value: 0.
Integer
No
ConvertNumMode
Request.Speech
Recognition
Whether to intelligently convert Chinese numbers to Arabic numerals (for the Mandarin engine only): 
0: Directly outputs Chinese numbers. 
1: Intelligently converts based on the scenario. 
3: Enables mathematic number conversion. 
Default value: 0.
Integer
No
SpeakerDiarization
Request.Speech
Recognition
Whether to enable speaker separation: 
0: No.
1: Yes (for mono-channel audios with the 8k_zh, 16k_zh, or 16k_zh_video engine only). 
Default value: 0.
Note: In the 8 kHz phone call scenario, we recommend you use dual channels to distinguish between the caller and callee by setting ChannelNum=2, so you don't need to enable speaker separation. 
Integer
No
SpeakerNumber
Request.Speech
Recognition
Number of speakers to be separated (with speaker separation enabled). Value range: 0–10. 
0: Automatic separation (currently only for six or fewer people only). 1–10: Specified number of speakers to be separated. Default value: 0.
Integer
No
FilterPunc
Request.Speech
Recognition
Whether to filter punctuation marks (currently for the Mandarin engine only): 
0: Does not filter. 
1: Filters the punctuation mark at the end of the sentence. 
2: Filters all punctuation marks. 
Default value: 0. 
Integer
No
OutputFileType
Request.Speech
Recognition
Output file type. Valid values: txt, srt. Default value: txt.
String
No
Response
Response headers
This API only returns common response headers. For more information, see Common Response Headers.
Response body
The response body returns application/xml data. The following contains all the nodes:
<Response>
    <Template>
        <Tag>SpeechRecognition</Tag>
        <Name>TemplateName</Name>
        <State>Normal</State>
        <Tag>SpeechRecognition</Tag>
        <CreateTime></CreateTime>
        <UpdateTime></UpdateTime>
        <BucketId></BucketId>
        <Category>Custom</Category>
        <SpeechRecognition>
            <EngineModelType>16k_zh</EngineModelType>
            <ResTextFormat>1</ResTextFormat>
            <FilterDirty>0</FilterDirty>
            <FilterModal>1</FilterModal>
            <ConvertNumMode>0</ConvertNumMode>
            <SpeakerDiarization>1</SpeakerDiarization>
            <SpeakerNumber>0</SpeakerNumber>
            <FilterPunc>0</FilterPunc>
            <OutputFileType>txt</OutputFileType>
        </SpeechRecognition>
    </Template>
</Response>
The nodes are as described below:
Node Name (Keyword)
Parent Node
Description
Type
Response
None
Response container
Container
﻿Response
 has the following sub-nodes:
Node Name (Keyword)
Parent Node
Description
Type
TemplateId
Response.Template
Template ID.
String
Name
Response.Template
Template name.
String
BucketId
Response.Template
Template bucket.
String
Category
Response.Template
Template category: Custom or Official.
String
Tag
Response.Template
Template tag: SpeechRecognition.
String
UpdateTime
Response.Template
Update time.
String
CreateTime
Response.Template
Creation time.
String
SpeechRecognition
Response.Template
Same as the Request.SpeechRecognition in the request body.
Container
Error codes
There are no special error messages for this request. For common error messages, see Error Codes.
Samples
Request
POST /template HTTP/1.1
Authorization: q-sign-algorithm=sha1&q-ak=AKIDZfbOAo7cllgPvF9cXFrJD0a1ICvR****&q-sign-time=1497530202;1497610202&q-key-time=1497530202;1497610202&q-header-list=&q-url-param-list=&q-signature=28e9a4986df11bed0255e97ff90500557e0e****
Host: test-1234567890.ci.ap-chongqing.myqcloud.com
Content-Length: 1666
Content-Type: application/xml
﻿
<Request>
    <Tag>SpeechRecognition</Tag>
    <Name>TemplateName</Name>
    <SpeechRecognition>
        <EngineModelType>16k_zh</EngineModelType>
        <ResTextFormat>1</ResTextFormat>
        <FilterDirty>0</FilterDirty>
        <FilterModal>1</FilterModal>
        <ConvertNumMode>0</ConvertNumMode>
        <SpeakerDiarization>1</SpeakerDiarization>
        <SpeakerNumber>0</SpeakerNumber>
        <FilterPunc>0</FilterPunc>
        <OutputFileType>txt</OutputFileType>
    </SpeechRecognition>
</Request>
Response
HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 100
Connection: keep-alive
Date: Thu, 14 Jul 2022 12:37:29 GMT
Server: tencent-ci
x-ci-request-id: NTk0MjdmODlfMjQ4OGY3XzYzYzhf****
﻿
<Response>
    <Template>
        <TemplateId>t1460606b9752148c4ab182f55163ba7cd</TemplateId>
        <Name>TemplateName</Name>
        <State>Normal</State>
        <Tag>SpeechRecognition</Tag>
        <CreateTime>2020-08-05T11:35:24+0800</CreateTime>
        <UpdateTime>2020-08-31T16:15:20+0800</UpdateTime>
        <BucketId>test-1234567890</BucketId>
        <Category>Custom</Category>
        <SpeechRecognition>
            <EngineModelType>16k_zh</EngineModelType>
            <ChannelNum>1</ChannelNum>
            <ResTextFormat>0</ResTextFormat>
            <FilterDirty>1</FilterDirty>
            <FilterModal>0</FilterModal>
            <ConvertNumMode>1</ConvertNumMode>
            <SpeakerDiarization>0</SpeakerDiarization>
            <SpeakerNumber>0</SpeakerNumber>
            <FilterPunc>0</FilterPunc>
        </SpeechRecognition>
    </Template>
</Response>
﻿

Hubungi Kami

Hubungi tim penjualan atau penasihat bisnis kami untuk membantu bisnis Anda.

Dukungan Teknis

Buka tiket jika Anda mencari bantuan lebih lanjut. Tiket kami tersedia 7x24.

Dukungan Telepon 7x24

tencent cloud

Recent Pages

Creating Automatic Speech Recognition Template

Feature Description

Request

Sample request

Request headers

Request body

Response

Response headers

Response body

Error codes

Samples

Request

Response

Apakah halaman ini membantu?

Apakah halaman ini membantu?

Node Name (Keyword)	Parent Node	Description	Type	Required
Request	None	Request container.	Container	Yes

tencent cloud

Daftar

Login

Recent Pages

Creating Automatic Speech Recognition Template

Feature Description

Request

Sample request

Request headers

Request body

Response

Response headers

Response body

Error codes

Samples

Request

Response

Apakah halaman ini membantu?

Apakah halaman ini membantu?