Submitting Speech Recognition Job
Last updated: 2024-02-18 15:49:30

Feature Description

This API (CreateSpeechJobs) is used to submit a speech recognition job.

Request

Sample request

POST /asr_jobs HTTP/1.1
Host: <BucketName-APPID>.ci.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Content-Length: <length>
Content-Type: application/xml

<body>
Note:
Authorization: Auth String (for more information, see Request Signature).
When this feature is used by a sub-account, relevant permissions must be granted. For more information, see Authorization Granularity.
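
The sample request above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name `build_asr_request`, the `https` scheme, and the bucket and region values in the usage note are assumptions, and the `auth` argument stands in for a signature computed as described in Request Signature. The request is only constructed here, not sent.

```python
import urllib.request
from email.utils import formatdate


def build_asr_request(bucket_appid: str, region: str, body: bytes, auth: str) -> urllib.request.Request:
    """Assemble (but do not send) the POST /asr_jobs request."""
    # Host follows the pattern <BucketName-APPID>.ci.<Region>.myqcloud.com
    host = f"{bucket_appid}.ci.{region}.myqcloud.com"
    # Assumption: the endpoint is reached over HTTPS.
    url = f"https://{host}/asr_jobs"
    headers = {
        "Host": host,
        "Date": formatdate(usegmt=True),  # GMT date in HTTP format
        "Authorization": auth,            # placeholder for the real signature
        "Content-Length": str(len(body)),
        "Content-Type": "application/xml",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

For example, `build_asr_request("examplebucket-1250000000", "ap-guangzhou", b"<Request/>", "sig")` yields a request object that could be passed to `urllib.request.urlopen` once a valid signature and request body are supplied.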

Request headers

This API only uses common request headers. For more information, see Common Request Headers.

Request body

This request requires the following request body:
<Request>
  <Tag>SpeechRecognition</Tag>
  <Input>
    <Object></Object>
  </Input>
  <Operation>
    <SpeechRecognition></SpeechRecognition>
    <Output>
      <Region></Region>
      <Bucket></Bucket>
      <Object></Object>
    </Output>
  </Operation>
  <QueueId></QueueId>
</Request>
The nodes are as described below:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Request | None | Request container | Container | Yes |

Request has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Tag | Request | Job type, which currently can only be `SpeechRecognition`. | String | Yes |
| Input | Request | Speech file to be processed | Container | Yes |
| Operation | Request | Operation rule | Container | Yes |
| QueueId | Request | ID of the queue that the job is in | String | Yes |

Input has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Object | Request.Input | Speech file key in COS. `Bucket` is specified by `Host`. | String | Yes |

Operation has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| SpeechRecognition | Request.Operation | Job type parameter, which takes effect only if `Tag` is `SpeechRecognition`. | Container | No |
| Output | Request.Operation | Result output address | Container | Yes |

SpeechRecognition has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| EngineModelType | Request.Operation.SpeechRecognition | Engine model type. Phone call scenarios: `8k_zh` (8 kHz, for Mandarin in general scenarios; available for dual-channel audio); `8k_zh_s` (8 kHz, for Mandarin with speaker separation; available for mono-channel audio only). Non-phone call scenarios: `16k_zh` (16 kHz, for Mandarin in general scenarios); `16k_zh_video` (16 kHz, for Mandarin in audio/video scenarios); `16k_en` (16 kHz, for English); `16k_ca` (16 kHz, for Cantonese). | String | Yes |
| ChannelNum | Request.Operation.SpeechRecognition | Number of speech sound channels. 1: mono; 2: dual (for the `8k_zh` engine only). | Integer | Yes |
| ResTextFormat | Request.Operation.SpeechRecognition | Format of the returned recognition result. 0: recognition result text, including the list of segment timestamps; 1: recognition result details, including the list of word timestamps (generally used to generate subtitles and for the 16k Mandarin engine only). | Integer | Yes |
| FilterDirty | Request.Operation.SpeechRecognition | Whether to filter restricted words (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: replaces restricted words with "*". | Integer | No |
| FilterModal | Request.Operation.SpeechRecognition | Whether to filter interjections (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: filters strictly. | Integer | No |
| ConvertNumMode | Request.Operation.SpeechRecognition | Whether to intelligently convert Chinese numbers to Arabic numerals (for the Mandarin engine only). 0: directly outputs Chinese numbers; 1 (default value): intelligently converts based on the scenario. | Integer | No |
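
The value constraints listed for the SpeechRecognition sub-nodes can be checked on the client before a job is submitted. The sketch below is one possible pre-flight check, not part of the API: the function name `check_speech_params` and the dictionary layout are hypothetical. It covers only the rules stated explicitly above and does not attempt to enforce the "Mandarin engine only" or "16k Mandarin engine only" restrictions, since the exact engine lists for those are not spelled out here.

```python
# Engine model types listed in the EngineModelType description.
ENGINES = {"8k_zh", "8k_zh_s", "16k_zh", "16k_zh_video", "16k_en", "16k_ca"}

# Allowed integer values for each numeric node.
ALLOWED = {
    "ChannelNum": (1, 2),
    "ResTextFormat": (0, 1),
    "FilterDirty": (0, 1, 2),
    "FilterModal": (0, 1, 2),
    "ConvertNumMode": (0, 1),
}

REQUIRED = ("EngineModelType", "ChannelNum", "ResTextFormat")


def check_speech_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    errors = []
    for name in REQUIRED:
        if name not in params:
            errors.append(f"missing required node: {name}")

    engine = params.get("EngineModelType")
    if engine is not None and engine not in ENGINES:
        errors.append(f"unknown EngineModelType: {engine}")

    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            errors.append(f"{name} must be one of {allowed}")

    # Dual-channel audio is documented for the 8k_zh engine only.
    if params.get("ChannelNum") == 2 and engine != "8k_zh":
        errors.append("ChannelNum=2 is available for the 8k_zh engine only")
    return errors
```

A call such as `check_speech_params({"EngineModelType": "16k_en", "ChannelNum": 2, "ResTextFormat": 0})` reports the channel mismatch, while a mono `16k_zh` configuration passes without findings.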

Output has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Region | Request.Operation.Output | Bucket region | String | Yes |
| Bucket | Request.Operation.Output | Result storage bucket | String | Yes |
| Object | Request.Operation.Output | Result filename | String | Yes |
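
Putting the request nodes together, the request body could be generated with Python's standard `xml.etree.ElementTree` module. This is a sketch under assumptions: the function name `build_request_body` and its parameter names are hypothetical, and the object keys, queue ID, and bucket name in the usage note are placeholders rather than real values.

```python
import xml.etree.ElementTree as ET


def build_request_body(input_key: str, queue_id: str, speech: dict,
                       out_region: str, out_bucket: str, out_key: str) -> bytes:
    """Serialize a <Request> document in the node order shown in the request body sample."""
    root = ET.Element("Request")
    ET.SubElement(root, "Tag").text = "SpeechRecognition"

    # Input.Object is the speech file key; the bucket comes from the Host header.
    ET.SubElement(ET.SubElement(root, "Input"), "Object").text = input_key

    operation = ET.SubElement(root, "Operation")
    recognition = ET.SubElement(operation, "SpeechRecognition")
    for name, value in speech.items():
        # e.g. EngineModelType, ChannelNum, ResTextFormat
        ET.SubElement(recognition, name).text = str(value)

    output = ET.SubElement(operation, "Output")
    ET.SubElement(output, "Region").text = out_region
    ET.SubElement(output, "Bucket").text = out_bucket
    ET.SubElement(output, "Object").text = out_key

    ET.SubElement(root, "QueueId").text = queue_id
    # Returns UTF-8 bytes, prefixed with an XML declaration.
    return ET.tostring(root, encoding="utf-8")
```

For instance, `build_request_body("input/a.mp3", "queue-id", {"EngineModelType": "16k_zh", "ChannelNum": 1, "ResTextFormat": 0}, "ap-guangzhou", "examplebucket-1250000000", "output/asr.txt")` produces bytes suitable for the body of the POST request.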

Response

Response headers

This API only returns common response headers. For more information, see Common Response Headers.

Response body

The response body is returned as application/xml data. The following sample contains all the nodes:
<Response>
  <JobsDetail>
    <Code></Code>
    <Message></Message>
    <JobId></JobId>
    <State></State>
    <CreationTime></CreationTime>
    <QueueId></QueueId>
    <Tag></Tag>
    <Input>
      <Object></Object>
    </Input>
    <Operation>
      <SpeechRecognition></SpeechRecognition>
      <Output>
        <Region></Region>
        <Bucket></Bucket>
        <Object></Object>
      </Output>
      <MediaInfo>
      </MediaInfo>
    </Operation>
  </JobsDetail>
</Response>
The nodes are as described below:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Response | None | Response container | Container |

Response has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| JobsDetail | Response | Job details | Container |

JobsDetail has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Code | Response.JobsDetail | Error code, which is meaningful only if `State` is `Failed` | String |
| Message | Response.JobsDetail | Error description, which is meaningful only if `State` is `Failed` | String |
| JobId | Response.JobsDetail | Job ID | String |
| Tag | Response.JobsDetail | Job type: `SpeechRecognition` | String |
| State | Response.JobsDetail | Job status. Valid values: `Submitted`, `Running`, `Success`, `Failed`, `Pause`, `Cancel` | String |
| CreationTime | Response.JobsDetail | Job creation time | String |
| QueueId | Response.JobsDetail | ID of the queue that the job is in | String |
| Input | Response.JobsDetail | Input resource address of the job | Container |
| Operation | Response.JobsDetail | Operation rule | Container |

Input has the following sub-nodes: Same as the Request.Input node in the request.

Operation has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| TemplateId | Response.JobsDetail.Operation | Job template ID | String |
| Output | Response.JobsDetail.Operation | File output address | Container |
| MediaInfo | Response.JobsDetail.Operation | Transcoding output video information. This node will not be returned if there is no output video. | Container |

Output has the following sub-nodes: Same as the Request.Operation.Output node in the request.
SpeechRecognition has the following sub-nodes: Same as the Request.Operation.SpeechRecognition node in the request.
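
As an illustration of how the response nodes might be consumed, the following sketch extracts the main JobsDetail fields with `xml.etree.ElementTree`. The function name `parse_job_response` and the shape of the returned dictionary are assumptions made for this example; `Code` and `Message` are read only when `State` is `Failed`, since they are documented as meaningful only in that case.

```python
import xml.etree.ElementTree as ET


def parse_job_response(xml_text: str) -> dict:
    """Extract the top-level JobsDetail fields from a response body."""
    root = ET.fromstring(xml_text)
    detail = root.find("JobsDetail")
    if detail is None:
        raise ValueError("response has no JobsDetail node")

    # findtext returns None for nodes that are absent from the response.
    fields = ("JobId", "State", "Tag", "CreationTime", "QueueId")
    job = {name: detail.findtext(name) for name in fields}

    # Error details are meaningful only for failed jobs.
    if job["State"] == "Failed":
        job["Code"] = detail.findtext("Code")
        job["Message"] = detail.findtext("Message")
    return job
```

A caller could, for example, store the returned `JobId` and inspect `State` to decide whether the job was accepted or needs attention.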

Error codes

There are no special error messages for this request. For common error messages, see Error Codes.