The real-time speech recognition SDK and demo for iOS can be downloaded here.
Add the following settings in the project's info.plist:
Set the NSAppTransportSecurity policy by adding the following content:
<key>NSAppTransportSecurity</key>
<dict>
    <key>NSExceptionDomains</key>
    <dict>
        <key>qcloud.com</key>
        <dict>
            <key>NSExceptionAllowsInsecureHTTPLoads</key>
            <true/>
            <key>NSExceptionMinimumTLSVersion</key>
            <string>TLSv1.2</string>
            <key>NSIncludesSubdomains</key>
            <true/>
            <key>NSRequiresCertificateTransparency</key>
            <false/>
        </dict>
    </dict>
</dict>
Request the system's mic permission by adding the following content:
<key>NSMicrophoneUsageDescription</key>
<string>Your mic is required to capture audio</string>
Add the dependent libraries in the project: in Build Phases > Link Binary With Libraries, add the required libraries.
The following describes the integration process and demo code for two scenarios: capturing audio for recognition with the built-in recorder, and providing the audio data yourself.
Import the QCloudSDK header file and change the filename extension of the source file from .m to .mm.
#import <QCloudSDK/QCloudSDK.h>
Create a QCloudConfig instance.
//1. Create a `QCloudConfig` instance
QCloudConfig *config = [[QCloudConfig alloc] initWithAppId:kQDAppId
secretId:kQDSecretId
secretKey:kQDSecretKey
projectId:kQDProjectId];
config.sliceTime = 600; // The length of the audio segment is 600 ms
config.enableDetectVolume = YES; // Specify whether to detect the volume
config.endRecognizeWhenDetectSilence = YES; // Specify whether to stop recognition when silence is detected
Create a QCloudRealTimeRecognizer instance.
QCloudRealTimeRecognizer *recognizer = [[QCloudRealTimeRecognizer alloc] initWithConfig:config];
Set the delegate and implement the QCloudRealTimeRecognizerDelegate methods.
recognizer.delegate = self;
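A minimal delegate implementation might look like the following sketch. Only realTimeRecognizerOnSliceRecognize: is required; the NSLog calls here stand in for your own UI updates and are illustrative assumptions, not SDK sample code.
// Sketch: the hosting class (set as `recognizer.delegate`) implements the callbacks
- (void)realTimeRecognizerOnSliceRecognize:(QCloudRealTimeRecognizer *)recognizer
                                  response:(QCloudRealTimeResponse *)response {
    // Intermediate result for the current audio segment
    NSLog(@"slice result: %@", response);
}
- (void)realTimeRecognizerDidFinish:(QCloudRealTimeRecognizer *)recognizer
                             result:(NSString *)result {
    // All text recognized in this session
    NSLog(@"final result: %@", result);
}
- (void)realTimeRecognizerDidError:(QCloudRealTimeRecognizer *)recognizer
                             error:(NSError *)error
                           voiceId:(NSString * _Nullable)voiceId {
    NSLog(@"recognition failed: %@, voiceId: %@", error, voiceId);
}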
Start recognition.
[recognizer start];
End recognition.
[recognizer stop];
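Putting the steps together, a minimal end-to-end flow with the built-in recorder looks like this sketch (constants as above; error handling omitted):
// Sketch: complete flow with the built-in recorder
QCloudConfig *config = [[QCloudConfig alloc] initWithAppId:kQDAppId
                                                  secretId:kQDSecretId
                                                 secretKey:kQDSecretKey
                                                 projectId:kQDProjectId];
QCloudRealTimeRecognizer *recognizer = [[QCloudRealTimeRecognizer alloc] initWithConfig:config];
recognizer.delegate = self;
[recognizer start];
// ... capture runs; results arrive via the delegate callbacks ...
[recognizer stop];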
Import the QCloudSDK header file and change the filename extension of the source file from .m to .mm.
#import <QCloudSDK/QCloudSDK.h>
Create a QCloudConfig instance.
//1. Create a `QCloudConfig` instance
QCloudConfig *config = [[QCloudConfig alloc] initWithAppId:kQDAppId
secretId:kQDSecretId
secretKey:kQDSecretKey
projectId:kQDProjectId];
config.sliceTime = 600; // The length of the audio segment is 600 ms
config.enableDetectVolume = YES; // Specify whether to detect the volume
config.endRecognizeWhenDetectSilence = YES; // Specify whether to stop recognition when silence is detected
Customize QCloudDemoAudioDataSource and implement the QCloudAudioDataSource protocol.
QCloudDemoAudioDataSource *dataSource = [[QCloudDemoAudioDataSource alloc] init];
Create a QCloudRealTimeRecognizer instance.
QCloudRealTimeRecognizer *recognizer = [[QCloudRealTimeRecognizer alloc] initWithConfig:config dataSource:dataSource];
Set the delegate and implement the QCloudRealTimeRecognizerDelegate methods.
recognizer.delegate = self;
Start recognition.
[recognizer start];
End recognition.
[recognizer stop];
QCloudRealTimeRecognizer is the real-time speech recognition class, which provides two initialization methods.
/**
* Initialization method where the built-in recorder is used to capture audios
* @param config Configuration parameter. For more information, see the definition of `QCloudConfig`.
*/
- (instancetype)initWithConfig:(QCloudConfig *)config;
/**
* Initialization method which will be called to pass in audio data
* @param config Configuration parameter. For more information, see the definition of `QCloudConfig`.
* @param dataSource Data source of audio data. You must implement the `QCloudAudioDataSource` protocol.
*/
- (instancetype)initWithConfig:(QCloudConfig *)config dataSource:(id<QCloudAudioDataSource>)dataSource;
QCloudConfig provides two initialization methods:
/**
* Initialization method - direct authentication
* @param appid Tencent Cloud `appId`
* @param secretId Tencent Cloud `secretId`
* @param secretKey Tencent Cloud `secretKey`
* @param projectId Tencent Cloud `projectId`
*/
- (instancetype)initWithAppId:(NSString *)appid
secretId:(NSString *)secretId
secretKey:(NSString *)secretKey
projectId:(NSString *)projectId;
/**
* Initialization method - authentication through STS temporary credentials
* @param appid Tencent Cloud `appId`
* @param secretId Tencent Cloud temporary `secretId`
* @param secretKey Tencent Cloud temporary `secretKey`
* @param token Token
*/
- (instancetype)initWithAppId:(NSString *)appid
secretId:(NSString *)secretId
secretKey:(NSString *)secretKey
token:(NSString *)token;
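For reference, creating a configuration with STS temporary credentials might look like the following sketch. The constants kQDTempSecretId, kQDTempSecretKey, and kQDToken are placeholders for temporary credentials issued by your own server, not part of the SDK.
// Sketch: authentication with STS temporary credentials
QCloudConfig *stsConfig = [[QCloudConfig alloc] initWithAppId:kQDAppId
                                                     secretId:kQDTempSecretId
                                                    secretKey:kQDTempSecretKey
                                                        token:kQDToken];
QCloudRealTimeRecognizer *stsRecognizer = [[QCloudRealTimeRecognizer alloc] initWithConfig:stsConfig];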
/**
* A single real-time recognition session is divided into multiple flows. Each flow can be understood as one sentence, and one recognition session can contain multiple sentences.
* Each flow contains multiple `seq`-numbered audio data packets, and `seq` starts from 0 within each flow.
*/
@protocol QCloudRealTimeRecognizerDelegate <NSObject>
@required
/**
* Recognition result callback for each audio segment
* @param recognizer Real-time speech recognition instance
* @param response Recognition result of the audio segment
*/
- (void)realTimeRecognizerOnSliceRecognize:(QCloudRealTimeRecognizer *)recognizer response:(QCloudRealTimeResponse *)response;
@optional
/**
* Callback for recognition success
* @param recognizer Real-time speech recognition instance
* @param result All text recognized in this recognition session
*/
- (void)realTimeRecognizerDidFinish:(QCloudRealTimeRecognizer *)recognizer result:(NSString *)result;
/**
* Callback for recognition failure
* @param recognizer Real-time speech recognition instance
* @param error Error message
* @param voiceId The `voiceId` attached to the error returned by the backend
*/
- (void)realTimeRecognizerDidError:(QCloudRealTimeRecognizer *)recognizer error:(NSError *)error voiceId:(NSString * _Nullable) voiceId;
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/**
* Callback for recording start
* @param recognizer Real-time speech recognition instance
* @param error Error message for recording start failure
*/
- (void)realTimeRecognizerDidStartRecord:(QCloudRealTimeRecognizer *)recognizer error:(NSError *)error;
/**
* Callback for recording end
* @param recognizer Real-time speech recognition instance
*/
- (void)realTimeRecognizerDidStopRecord:(QCloudRealTimeRecognizer *)recognizer;
/**
* Real-time callback for recording volume
* @param recognizer Real-time speech recognition instance
* @param volume Audio volume level in the range of -40–0
*/
- (void)realTimeRecognizerDidUpdateVolume:(QCloudRealTimeRecognizer *)recognizer volume:(float)volume;
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/**
* Audio stream recognition start
* @param recognizer Real-time speech recognition instance
* @param voiceId `voiceId` of the audio stream, which is the unique identifier
* @param seq Sequence number of the flow
*/
- (void)realTimeRecognizerOnFlowRecognizeStart:(QCloudRealTimeRecognizer *)recognizer voiceId:(NSString *)voiceId seq:(NSInteger)seq;
/**
* Audio stream recognition end
* @param recognizer Real-time speech recognition instance
* @param voiceId `voiceId` of the audio stream, which is the unique identifier
* @param seq Sequence number of the flow
*/
- (void)realTimeRecognizerOnFlowRecognizeEnd:(QCloudRealTimeRecognizer *)recognizer voiceId:(NSString *)voiceId seq:(NSInteger)seq;
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/**
* Audio stream start
* @param recognizer Real-time speech recognition instance
* @param voiceId `voiceId` of the audio stream, which is the unique identifier
* @param seq Sequence number of the flow
*/
- (void)realTimeRecognizerOnFlowStart:(QCloudRealTimeRecognizer *)recognizer voiceId:(NSString *)voiceId seq:(NSInteger)seq;
/**
* Audio stream end
* @param recognizer Real-time speech recognition instance
* @param voiceId `voiceId` of the audio stream, which is the unique identifier
* @param seq Sequence number of the flow
*/
- (void)realTimeRecognizerOnFlowEnd:(QCloudRealTimeRecognizer *)recognizer voiceId:(NSString *)voiceId seq:(NSInteger)seq;
@end
If you provide audio data yourself instead of capturing it with the SDK's built-in recorder, you need to implement all methods in this protocol, in the same way as the implementation of QDAudioDataSource in the demo project.
/**
* Data source of audio data. If you want to provide audio data on your own, you need to implement all methods in this protocol.
* Provide audio data that meets the following requirements:
* Sample rate: 16 kHz
* Audio format: PCM
* Encoding: 16-bit mono-channel
*/
@protocol QCloudAudioDataSource <NSObject>
@required
/**
* Indicates whether the data source has started working. It must be set to `YES` after `start` executes and to `NO` after `stop` executes.
*/
@property (nonatomic, assign) BOOL running;
/**
* The SDK calls the `start` method; the class implementing this protocol should initialize the data source here.
*/
- (void)start:(void(^)(BOOL didStart, NSError *error))completion;
/**
* The SDK calls the `stop` method; the class implementing this protocol should stop supplying data here.
*/
- (void)stop;
/**
* The SDK calls this method on the object implementing the protocol to read audio data. If fewer than `expectLength` bytes are available, return `nil` directly.
* @param expectLength The number of bytes expected to be read. If the returned `NSData` is shorter than `expectLength` bytes, the SDK will report an exception.
*/
- (nullable NSData *)readData:(NSInteger)expectLength;
@end
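As a reference, a minimal file-backed data source might be sketched as follows. The class name, the PCM file path, and the use of NSInputStream are illustrative assumptions; see QDAudioDataSource in the demo project for the official implementation.
// Sketch: reads 16 kHz, 16-bit, mono PCM data from a local file
@interface QDFileAudioDataSource : NSObject <QCloudAudioDataSource>
@property (nonatomic, assign) BOOL running;
@property (nonatomic, strong) NSInputStream *stream;
@end

@implementation QDFileAudioDataSource
- (void)start:(void (^)(BOOL didStart, NSError *error))completion {
    self.stream = [NSInputStream inputStreamWithFileAtPath:@"/path/to/audio.pcm"]; // placeholder path
    [self.stream open];
    self.running = YES;  // must be YES once `start` has executed
    completion(YES, nil);
}
- (void)stop {
    self.running = NO;   // must be NO once `stop` has executed
    [self.stream close];
    self.stream = nil;
}
- (nullable NSData *)readData:(NSInteger)expectLength {
    if (!self.running) {
        return nil;
    }
    NSMutableData *data = [NSMutableData dataWithLength:expectLength];
    NSInteger read = [self.stream read:(uint8_t *)data.mutableBytes maxLength:expectLength];
    if (read < expectLength) {
        return nil;  // return nil when a full `expectLength` bytes is unavailable
    }
    return data;
}
@end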