
Android & iOS & Windows & Mac

Last updated: 2024-07-05 17:57:19

    Description

    The Voice-to-Text feature can recognize voice messages that you have successfully sent or received and convert them into text.
    Note:
    Voice-to-Text is a value-added paid feature and is currently in beta. You can contact us through the Telegram Technical Support Group to enable the full feature experience.
    This feature is supported only by the Enhanced SDK v7.4 or later.
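    Before showing the conversion entry in your UI, you may want to confirm at runtime that the integrated SDK meets this version requirement. The snippet below is a minimal Android-only sketch: it assumes V2TIMManager.getInstance().getVersion() returns a dotted version string (for example "7.9.5666"), and the helper class and method names are hypothetical.
    import com.tencent.imsdk.v2.V2TIMManager;

    public class VoiceToTextSupport {
        // Hypothetical helper: returns true when the integrated IM SDK is v7.4 or later
        public static boolean isSupported() {
            String version = V2TIMManager.getInstance().getVersion(); // assumed format, e.g. "7.9.5666"
            String[] parts = version.split("\\.");
            int major = Integer.parseInt(parts[0]);
            int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            return major > 7 || (major == 7 && minor >= 4);
        }
    }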

    Display Effect

    You can use this feature to achieve the voice-to-text conversion effect shown below:
    (Screenshot of the conversion display effect)

    API Description

    Speech-to-Text

    You can call the convertVoiceToText API (Android / iOS and macOS / Windows) to convert a voice message into text.
    The API parameters are described as follows:

    Input Parameter | Meaning | Description
    language | Target language for recognition | 1. If your users mainly use Chinese and English, you can pass an empty string; in this case the default Chinese-English model is used for recognition. 2. To recognize a specific target language, set this parameter to one of the values listed in Language Support.
    callback | Recognition result callback | result is the recognized text.

    Warning:
    The voice to be recognized must be recorded at a 16 kHz sampling rate; otherwise, recognition may fail. A recording sketch follows below.
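
    Because recognition requires 16 kHz audio, make sure the voice is recorded at that sampling rate before the sound message is created and sent. The sketch below is a minimal Android illustration and is not part of the IM SDK: it uses android.media.MediaRecorder, and soundPath / durationSeconds are placeholders for your own recording path and measured duration.
    import android.media.MediaRecorder;
    import java.io.IOException;

    public class VoiceRecorder {
        private final MediaRecorder recorder = new MediaRecorder();

        // Record the voice at a 16 kHz sampling rate so that convertVoiceToText can recognize it
        public void start(String soundPath) throws IOException {
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
            recorder.setAudioSamplingRate(16000); // key point: 16 kHz sampling rate
            recorder.setOutputFile(soundPath);
            recorder.prepare();
            recorder.start();
        }

        public void stop() {
            recorder.stop();
            recorder.release();
            // After stopping, create and send the sound message with the IM SDK, for example:
            // V2TIMMessage message = V2TIMManager.getMessageManager().createSoundMessage(soundPath, durationSeconds);
        }
    }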
    
    Below is the sample code.

    Android
    // Get the V2TIMMessage object from the message list
    V2TIMMessage msg = messageList.get(0);
    if (msg.getElemType() == V2TIMMessage.V2TIM_ELEM_TYPE_SOUND) {
        // Retrieve the sound elem from V2TIMMessage
        V2TIMSoundElem soundElem = msg.getSoundElem();
        // Invoke speech-to-text conversion, using the Chinese-English recognition model by default
        soundElem.convertVoiceToText("", new V2TIMValueCallback<String>() {
            @Override
            public void onError(int code, String desc) {
                // Recognition failed
                String str = "convertVoiceToText failed, code: " + code + " desc: " + desc;
                ToastUtil.show(str, true, 1);
            }

            @Override
            public void onSuccess(String result) {
                // Recognition succeeded; 'result' is the recognized text
                String str = "convertVoiceToText succeed, result: " + result;
                ToastUtil.show(str, true, 1);
            }
        });
    }

    iOS & Mac
    // Get the V2TIMMessage object from the message list
    V2TIMMessage *msg = messageList[0];
    if (msg.elemType == V2TIM_ELEM_TYPE_SOUND) {
        // Retrieve the soundElem from V2TIMMessage
        V2TIMSoundElem *soundElem = msg.soundElem;
        // Invoke speech-to-text conversion, using the Chinese-English recognition model by default
        [soundElem convertVoiceToText:@"" completion:^(int code, NSString *desc, NSString *result) {
            // If code is 0, recognition succeeded and 'result' is the recognized text
            NSLog(@"convertVoiceToText, code: %d, desc: %@, result: %@", code, desc, result);
        }];
    }

    Windows
    template <class T>
    class ValueCallback final : public V2TIMValueCallback<T> {
    public:
        using SuccessCallback = std::function<void(const T&)>;
        using ErrorCallback = std::function<void(int, const V2TIMString&)>;

        ValueCallback() = default;
        ~ValueCallback() override = default;

        void SetCallback(SuccessCallback success_callback, ErrorCallback error_callback) {
            success_callback_ = std::move(success_callback);
            error_callback_ = std::move(error_callback);
        }

        void OnSuccess(const T& value) override {
            if (success_callback_) {
                success_callback_(value);
            }
        }

        void OnError(int error_code, const V2TIMString& error_message) override {
            if (error_callback_) {
                error_callback_(error_code, error_message);
            }
        }

    private:
        SuccessCallback success_callback_;
        ErrorCallback error_callback_;
    };

    auto callback = new ValueCallback<V2TIMString>{};
    callback->SetCallback(
        [=](const V2TIMString& result) {
            // Speech-to-text conversion succeeded; 'result' is the recognized text
            delete callback;
        },
        [=](int error_code, const V2TIMString& error_message) {
            // Speech-to-text conversion failed
            delete callback;
        });

    // Get the V2TIMMessage object from the message list
    V2TIMMessage message = messageList[0];
    // Retrieve the sound elem from V2TIMMessage
    V2TIMElem *elem = message.elemList[0];
    if (elem->elemType == V2TIM_ELEM_TYPE_SOUND) {
        V2TIMSoundElem *sound_elem = static_cast<V2TIMSoundElem *>(elem);
        // Invoke speech-to-text conversion, using the Chinese-English recognition model by default
        sound_elem->ConvertVoiceToText("", callback);
    }

    Language Support

    The currently supported target languages for recognition are as follows:

    Supported Language | Input Parameter Setting
    Mandarin Chinese | "zh (cmn-Hans-CN)"
    Cantonese Chinese | "yue-Hant-HK"
    English | "en-US"
    Japanese (Japan) | "ja-JP"
    