Submitting Speech Recognition Job

Feature Description
This API (CreateSpeechJobs) is used to submit a speech recognition job.
Request
Sample request
POST /asr_jobs HTTP/1.1
Host: <BucketName-APPID>.ci.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Content-Length: <length>
Content-Type: application/xml

<body>
Note: 
Authorization: Auth String (for more information, see Request Signature).
When this feature is used by a sub-account, relevant permissions must be granted. For more information, see Authorization Granularity.
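As a minimal sketch, the sample request above could be assembled with Python's standard library as follows. The function name `build_asr_job_request`, the `https` scheme, and the bucket and region values are assumptions made for illustration. The `Authorization` value must be computed as described in Request Signature and is only passed through here. The request is built but not sent.

```python
from email.utils import formatdate
from urllib.request import Request


def build_asr_job_request(bucket_appid: str, region: str,
                          auth_string: str, body: bytes) -> Request:
    """Assemble (but do not send) the POST /asr_jobs request."""
    # Host pattern taken from the sample request; the https scheme is an assumption.
    host = f"{bucket_appid}.ci.{region}.myqcloud.com"
    headers = {
        "Date": formatdate(usegmt=True),   # current time as a GMT date string
        "Authorization": auth_string,      # computed separately, see Request Signature
        "Content-Type": "application/xml",
    }
    # Host and Content-Length are added by urllib when the request is sent.
    return Request(f"https://{host}/asr_jobs", data=body,
                   headers=headers, method="POST")


# Hypothetical bucket, region, and body, for illustration only.
req = build_asr_job_request("examplebucket-1250000000", "ap-guangzhou",
                            "<Auth String>", b"<Request/>")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. That step is left out because it needs a valid signature.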
Request headers
This API only uses common request headers. For more information, see Common Request Headers.
Request body
This request requires the following request body:
<Request>
  <Tag>SpeechRecognition</Tag>
  <Input>
    <Object></Object>
  </Input>
  <Operation>
    <SpeechRecognition></SpeechRecognition>
    <Output>
      <Region></Region>
      <Bucket></Bucket>
      <Object></Object>
    </Output>
  </Operation>
  <QueueId></QueueId>
</Request>
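The skeleton above can be filled in programmatically. The following is a minimal sketch using Python's `xml.etree.ElementTree`. The function name `build_speech_job_body` and all sample values (object keys, bucket, region, queue ID) are hypothetical. The element names and their order follow the request body shown above.

```python
import xml.etree.ElementTree as ET


def build_speech_job_body(input_key, queue_id, out_region, out_bucket,
                          out_key, recognition):
    """Return the <Request> body as UTF-8 encoded bytes."""
    root = ET.Element("Request")
    ET.SubElement(root, "Tag").text = "SpeechRecognition"
    ET.SubElement(ET.SubElement(root, "Input"), "Object").text = input_key

    operation = ET.SubElement(root, "Operation")
    speech = ET.SubElement(operation, "SpeechRecognition")
    # Each entry of `recognition` becomes a child node, e.g. <ChannelNum>1</ChannelNum>.
    for name, value in recognition.items():
        ET.SubElement(speech, name).text = str(value)

    output = ET.SubElement(operation, "Output")
    ET.SubElement(output, "Region").text = out_region
    ET.SubElement(output, "Bucket").text = out_bucket
    ET.SubElement(output, "Object").text = out_key

    ET.SubElement(root, "QueueId").text = queue_id
    return ET.tostring(root, encoding="utf-8")


# Hypothetical values, for illustration only.
body = build_speech_job_body(
    input_key="input/meeting.mp3",
    queue_id="example-queue-id",
    out_region="ap-guangzhou",
    out_bucket="examplebucket-1250000000",
    out_key="output/meeting-asr.txt",
    recognition={"EngineModelType": "16k_zh", "ChannelNum": 1, "ResTextFormat": 0},
)
print(body.decode("utf-8"))
```

Building the tree with `ElementTree` rather than string formatting means special characters in object keys are escaped automatically.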
The nodes are as described below:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Request | None | Request container | Container | Yes |

Request has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Tag | Request | Job type, which currently can only be `SpeechRecognition`. | String | Yes |
| Input | Request | Speech file to be manipulated | Container | Yes |
| Operation | Request | Operation rule | Container | Yes |
| QueueId | Request | ID of the queue which the job is in | String | Yes |

Input has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Object | Request.Input | Speech file key in COS. `Bucket` is specified by `Host`. | String | Yes |

Operation has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| SpeechRecognition | Request.Operation | Job type parameter, which takes effect only if Tag is SpeechRecognition. | Container | No |
| Output | Request.Operation | Result output address | Container | Yes |

SpeechRecognition has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| EngineModelType | Request.Operation.SpeechRecognition | Engine model type. Phone call scenarios: `8k_zh` (8 kHz, for Mandarin in general scenarios; available for dual-channel audio); `8k_zh_s` (8 kHz, for Mandarin with speaker separation; available for mono-channel audio only). Non-phone call scenarios: `16k_zh` (16 kHz, for Mandarin in general scenarios); `16k_zh_video` (16 kHz, for Mandarin in audio/video scenarios); `16k_en` (16 kHz, for English); `16k_ca` (16 kHz, for Cantonese). | String | Yes |
| ChannelNum | Request.Operation.SpeechRecognition | Number of speech sound channels. 1: mono; 2: dual (for the 8k_zh engine only). | Integer | Yes |
| ResTextFormat | Request.Operation.SpeechRecognition | Format of the returned recognition result. 0: recognition result text, including the list of segment timestamps; 1: recognition result details, including the list of word timestamps (generally used to generate subtitles and for the 16k Mandarin engine only). | Integer | Yes |
| FilterDirty | Request.Operation.SpeechRecognition | Whether to filter restricted words (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: replaces restricted words with "*". | Integer | No |
| FilterModal | Request.Operation.SpeechRecognition | Whether to filter interjections (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: filters strictly. | Integer | No |
| ConvertNumMode | Request.Operation.SpeechRecognition | Whether to intelligently convert Chinese numbers to Arabic numerals (for the Mandarin engine only). 0: directly outputs Chinese numbers; 1 (default value): intelligently converts based on the scenario. | Integer | No |

Output has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Region | Request.Operation.Output | Bucket region | String | Yes |
| Bucket | Request.Operation.Output | Result storage bucket | String | Yes |
| Object | Request.Operation.Output | Result filename | String | Yes |

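The value ranges and required flags listed for the SpeechRecognition sub-nodes can be checked on the client before a job is submitted. The sketch below is one possible way to do that. The function name `validate_recognition` and the message wording are assumptions. Only constraints that the parameter descriptions state unambiguously are enforced (allowed values, required nodes, and dual-channel audio being limited to `8k_zh`). The "Mandarin engine only" restrictions are not checked because the descriptions do not enumerate the engines they cover.

```python
ENGINES = {"8k_zh", "8k_zh_s", "16k_zh", "16k_zh_video", "16k_en", "16k_ca"}
ALLOWED = {
    "ChannelNum": {1, 2},
    "ResTextFormat": {0, 1},
    "FilterDirty": {0, 1, 2},
    "FilterModal": {0, 1, 2},
    "ConvertNumMode": {0, 1},
}
REQUIRED = ("EngineModelType", "ChannelNum", "ResTextFormat")


def validate_recognition(params: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    problems = [f"{name} is required" for name in REQUIRED if name not in params]

    engine = params.get("EngineModelType")
    if engine is not None and engine not in ENGINES:
        problems.append(f"unknown EngineModelType: {engine}")

    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")

    # Dual-channel audio is documented for the 8k_zh engine only.
    if params.get("ChannelNum") == 2 and engine != "8k_zh":
        problems.append("ChannelNum 2 is only available with 8k_zh")
    return problems


print(validate_recognition(
    {"EngineModelType": "16k_en", "ChannelNum": 2, "ResTextFormat": 0}
))
```

Returning a list instead of raising on the first error lets a caller report every problem at once.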
Response
Response headers
This API only returns common response headers. For more information, see Common Response Headers.
Response body
The response body returns application/xml data. The following contains all the nodes:
<Response>
  <JobsDetail>
    <Code></Code>
    <Message></Message>
    <JobId></JobId>
    <State></State>
    <CreationTime></CreationTime>
    <QueueId></QueueId>
    <Tag></Tag>
    <Input>
      <Object></Object>
    </Input>
    <Operation>
      <SpeechRecognition></SpeechRecognition>
      <Output>
        <Region></Region>
        <Bucket></Bucket>
        <Object></Object>
      </Output>
      <MediaInfo>
      </MediaInfo>
    </Operation>
  </JobsDetail>
</Response>
The nodes are as described below:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Response | None | Response container | Container |

Response has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| JobsDetail | Response | Job details | Container |

JobsDetail has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Code | Response.JobsDetail | Error code, which is meaningful only if State is Failed | String |
| Message | Response.JobsDetail | Error description, which is meaningful only if State is Failed | String |
| JobId | Response.JobsDetail | Job ID | String |
| Tag | Response.JobsDetail | Job type: SpeechRecognition | String |
| State | Response.JobsDetail | Job status. Valid values: Submitted, Running, Success, Failed, Pause, Cancel | String |
| CreationTime | Response.JobsDetail | Job creation time | String |
| QueueId | Response.JobsDetail | ID of the queue which the job is in | String |
| Input | Response.JobsDetail | Input resource address of the job | Container |
| Operation | Response.JobsDetail | Operation rule | Container |

Input has the following sub-nodes:
Same as the Request.Input node in the request.
Operation has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| TemplateId | Response.JobsDetail.Operation | Job template ID | String |
| Output | Response.JobsDetail.Operation | File output address | Container |
| MediaInfo | Response.JobsDetail.Operation | Transcoding output video information. This node will not be returned if there is no output video. | Container |

Output has the following sub-nodes:
Same as the Request.Operation.Output node in the request.
SpeechRecognition has the following sub-nodes:
Same as the Request.Operation.SpeechRecognition node in the request.
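The response nodes described above can be read with Python's `xml.etree.ElementTree`. The following is a minimal sketch. The function name `parse_job_response`, the keys of the returned dictionary, and every value in the sample document are made up for illustration and do not come from a real response.

```python
import xml.etree.ElementTree as ET


def parse_job_response(xml_text: str) -> dict:
    """Extract the main job fields from a <Response> document."""
    detail = ET.fromstring(xml_text).find("JobsDetail")
    if detail is None:
        raise ValueError("response has no JobsDetail node")
    fields = ("Code", "Message", "JobId", "State", "CreationTime", "QueueId", "Tag")
    # Missing or empty nodes are returned as empty strings.
    job = {name: detail.findtext(name, default="") for name in fields}
    job["InputObject"] = detail.findtext("Input/Object", default="")
    job["OutputObject"] = detail.findtext("Operation/Output/Object", default="")
    return job


# Made-up sample values, for illustration only.
sample = """<Response>
  <JobsDetail>
    <Code></Code>
    <Message></Message>
    <JobId>example-job-id</JobId>
    <State>Submitted</State>
    <QueueId>example-queue-id</QueueId>
    <Tag>SpeechRecognition</Tag>
    <Input><Object>input/meeting.mp3</Object></Input>
    <Operation>
      <Output>
        <Region>ap-guangzhou</Region>
        <Bucket>examplebucket-1250000000</Bucket>
        <Object>output/meeting-asr.txt</Object>
      </Output>
    </Operation>
  </JobsDetail>
</Response>"""

job = parse_job_response(sample)
print(job["JobId"], job["State"])
```

Because `Code` and `Message` are meaningful only when `State` is `Failed`, a caller would typically inspect them only in that case.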
Error codes
There are no special error messages for this request. For common error messages, see Error Codes.
