Voice-driven Instructions

Last updated: 2024-07-19 10:08:06

After you create the long connection channel (see Create Long Connection Channel), you can send audio over that WebSocket persistent connection to drive the digital human.
Request Parameters
Parameter name	Type	Required	Description
ReqId	String	Yes	Unique identifier for a single drive. Each segment of audio is assigned a UUID value.
SessionId	String	Yes	Unique identifier for the session.
Command	String	Yes	Set to SEND_AUDIO to send audio.
Data	Data	Yes	Data object; see the Data fields below.

Data
Name	Type	Required	Description
Audio	string	Yes	Raw audio bytes, Base64-encoded into a string. Only PCM audio is supported: 16 kHz sampling rate, 16-bit sample depth, mono.
Seq	int	Yes	Audio packet sequence number, which must start from 1.
IsFinal	bool	No	Whether this is the final packet of the audio. The default value is false.
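As an illustration of how the fields above fit together, the following minimal Python sketch builds one SEND_AUDIO request as a JSON string. The helper names `build_send_audio_message` and `new_req_id` are hypothetical, and generating ReqId as a 32-character hex UUID is an assumption based on the shape of the sample value, not a documented requirement.

```python
import base64
import json
import uuid


def new_req_id() -> str:
    # Assumption: a 32-character hex UUID, matching the shape of the sample ReqId.
    return uuid.uuid4().hex


def build_send_audio_message(session_id: str, req_id: str, pcm_chunk: bytes,
                             seq: int, is_final: bool = False) -> str:
    """Serialize one SEND_AUDIO request (hypothetical helper)."""
    message = {
        "Header": {},
        "Payload": {
            "ReqId": req_id,
            "SessionId": session_id,
            "Command": "SEND_AUDIO",
            "Data": {
                # Raw PCM bytes (16 kHz, 16-bit, mono), Base64-encoded into a string.
                "Audio": base64.b64encode(pcm_chunk).decode("ascii"),
                # Sequence numbers start from 1.
                "Seq": seq,
                "IsFinal": is_final,
            },
        },
    }
    return json.dumps(message)
```

Passing an empty `pcm_chunk` with `is_final=True` yields a request whose Audio field is an empty string.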
Note:
1. For real-time microphone audio, send a 160 ms packet (5120 bytes) as soon as each one is captured, with no additional waiting interval. For an offline audio file, send 160 ms packets (5120 bytes) with a 120 ms interval between packets.
2. Size the last audio packet to the data that actually remains; it must be less than 160 ms.
3. After all audio packets have been sent, send one more packet with IsFinal=true and an empty Audio field. This marks the end of the audio and returns the digital human to a silent state.
4. The real-time rate of sending audio (the sending interval divided by the audio duration of each packet) must be in the range [0.75, 1]. A rate below 0.75 triggers throttling, and a rate above 1 causes video stuttering. For example, with 160 ms audio packets, the sending interval must be no less than 120 ms and no more than 160 ms.
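The packet size and interval figures in these notes follow from the audio format: 16,000 samples per second × 2 bytes per sample × 1 channel × 0.160 s = 5120 bytes. The sketch below splits a PCM buffer into 160 ms packets and computes the allowed sending interval. The function names are hypothetical, and continuing the Seq numbering on the closing IsFinal packet is an assumption, since the notes do not say which Seq that packet should carry.

```python
from typing import Iterator, Tuple

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 1           # mono
PACKET_MS = 160

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
PACKET_BYTES = BYTES_PER_SECOND * PACKET_MS // 1000  # 5120


def iter_audio_packets(pcm: bytes) -> Iterator[Tuple[int, bytes, bool]]:
    """Yield (seq, chunk, is_final) tuples for a complete PCM buffer."""
    seq = 0
    for offset in range(0, len(pcm), PACKET_BYTES):
        seq += 1  # sequence numbers start from 1
        # The last slice holds whatever data remains.
        yield seq, pcm[offset:offset + PACKET_BYTES], False
    # Closing packet: empty audio with IsFinal=true.
    # Assumption: its Seq simply continues the numbering.
    yield seq + 1, b"", True


def interval_bounds_ms(packet_bytes: int) -> Tuple[float, float]:
    """Return (min, max) sending interval for a real-time rate in [0.75, 1]."""
    duration_ms = packet_bytes * 1000 / BYTES_PER_SECOND
    return 0.75 * duration_ms, duration_ms
```

For a full 5120-byte packet, `interval_bounds_ms` gives the 120 ms to 160 ms window stated in note 4.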
Request Sample
{
    "Header": {},
    "Payload": {
        "ReqId": "d7aa08da33dd4a662ad5be508c5b77cf",
        "SessionId": "m123adfafvbadsafd",
        "Command": "SEND_AUDIO",
        "Data": {
            "Audio": "The value of the audio binary data encoded in Base64",
            "Seq": 1,
            "IsFinal": false
        }
    }
}
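For offline audio, requests shaped like the sample above need to be paced. The following sketch sends pre-built JSON strings at a fixed interval (120 ms by default, the offline-file figure from the notes). The function `send_paced` and its `send` callback are hypothetical: `send` stands in for whatever your WebSocket client uses to transmit a text message, and sending one JSON request per WebSocket message is an assumption.

```python
import time
from typing import Callable, Iterable


def send_paced(messages: Iterable[str],
               send: Callable[[str], None],
               interval_s: float = 0.12,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Send each message, spacing sends interval_s apart; return the count sent."""
    sent = 0
    next_due = clock()
    for message in messages:
        # Wait until this packet's scheduled time. Scheduling against a running
        # deadline avoids drift when send() itself takes time.
        delay = next_due - clock()
        if delay > 0:
            sleep(delay)
        send(message)
        sent += 1
        next_due += interval_s
    return sent
```

The clock and sleep functions are injectable so the pacing can be checked without real waiting. Real-time microphone audio does not need this loop, because capture itself paces the packets.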