Tencent Cloud AI Digital Human

Customization API

Last updated: 2024-07-18 17:49:06
Use this API to submit customization requests. Query the stages of customization and related information through the Progress Query API.

Calling Protocol

HTTPS + JSON
POST /v2/ivh/assetmanager/customservice/make
Header Content-Type: application/json;charset=utf-8
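
As a rough illustration of the calling protocol above, the following Python sketch builds (but does not send) a POST request with the documented path and Content-Type header. The host `ivh.example.invalid` is a placeholder and not a real endpoint, and `build_make_request` is a hypothetical helper name. Any authentication or request signing the service may require is not described in this section and is left out.

```python
import json
import urllib.request

# Placeholder host: the real gateway domain is not given in this section.
BASE_URL = "https://ivh.example.invalid"
MAKE_PATH = "/v2/ivh/assetmanager/customservice/make"


def build_make_request(body: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build an HTTPS + JSON POST request for the Customization API."""
    # ensure_ascii=False keeps Chinese anchor names readable in the UTF-8 body.
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=base_url + MAKE_PATH,
        data=data,
        headers={"Content-Type": "application/json;charset=utf-8"},
        method="POST",
    )
```

Once the real host and any required credentials are known, the request object could be passed to `urllib.request.urlopen`.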

Request Parameters

Parameters
Type
Mandatory
Description
AnchorName
string
Yes
Anchor name:
1. This name is mainly used to identify the customized avatar and voice and can be customized according to actual needs.
2. Naming reference: If there is only one customization for the anchor, it can be named directly after the anchor, such as "Tom". For better identification, you can also add the name of the clothing, such as "Tom in a Blue Suit".
3. 2 to 50 characters. Only Chinese characters, letters, numbers, underscores, and hyphens are allowed.
4. Duplicate names are not allowed.
MakeType
string
Yes
Customization category:
IMAGE: Studio Avatar image customization
IMAGE_GENERAL: Instant Avatar image customization
IMAGE_4K: 4K Studio Avatar image customization
IMAGE_PHOTO: Photo Avatar image customization
VOICE: Voice Clone (basic edition)
ZERO_SHOT_VOICE: Voice Clone (ultra edition)
IdentityCosUrl
string
No
For all customization types except IMAGE_PHOTO and ZERO_SHOT_VOICE, provide IdentityCosUrl, IdentityWrittenCosUrl, or both.
Requirements for the URL address of the video format authorization letter:
1. The URL address should be the resource URL of the file uploaded to the specified path in Tencent Cloud COS, with an added idcard path, such as domain name/customer-pipeline/{digit}/{uuid}/idcard/a.mp4.
2. This format is primarily for oral authorization letters. A written authorization letter can also be submitted as a clear and complete video.
IdentityWrittenCosUrl
string
No
For all customization types except IMAGE_PHOTO and ZERO_SHOT_VOICE, provide IdentityCosUrl, IdentityWrittenCosUrl, or both.
Requirements for the URL address of the PDF format authorization letter:
1. The URL address should be the resource URL of the file uploaded to the specified path in Tencent Cloud COS, with an added idcard path, such as domain name/customer-pipeline/{digit}/{uuid}/idcard/b.pdf.
2. This format is primarily for written authorization letters, submitted as clear and complete scanned copies.
MaterialCosUrl
string
No
Required for all customization types except ZERO_SHOT_VOICE.
Requirements for the URL address of image customization materials:
1. The URL address should be the resource URL of the material uploaded to the specified path in Tencent Cloud COS, with an added video path, such as /customer-pipeline/{digit}/{uuid}/video/c.mp4.
2. The video size should not exceed 5 GB; for 4K videos, the size should not exceed 10 GB.
3. Video duration: 2-10 minutes for Studio Avatar, 1-10 minutes for Instant Avatar, and 2-10 minutes for 4K Studio Avatar.
4. Video resolution: 1080P or 4K (3840*2160); for high-precision version customization, it must be 4K.
5. Video aspect ratio: 16:9 (or 9:16)
6. Video frame rate: Not less than 25 fps and not more than 60 fps.
7. Video format: MP4 and MOV

Requirements for the URL address of Voice Clone materials:
1. The URL address should be the resource URL of the material uploaded to the specified path in Tencent Cloud COS, with an added audio path, such as /customer-pipeline/{digit}/{uuid}/audio/c.zip.
2. Zip file format: .zip format; a single zip file is used to customize one voice. Do not create new folders when compressing; select all the wav files directly and compress them.
3. Requirements for the audio files within a single zip file:
① Audio quantity: Each zip file can contain one or more wav format audio files, with a total of no more than 10 files.
② Audio size: The total size of the audio files in each zip file should not exceed 1 GB.
③ Audio format: Each audio file must be in wav format. Other audio formats should be converted to wav before compression into a zip file.
④ Audio sample rate: The sample rate should be 24 kHz or higher, with 24 kHz or 36 kHz recommended.
⑤ Audio naming: Names should not contain spaces or special characters, and the file extension should be in lowercase ".wav".

Requirements for the URL address of Photo Avatar materials:
1. The URL address should be the resource URL of the material uploaded to the specified path in Tencent Cloud COS, with an added photo path, such as /customer-pipeline/{digit}/{uuid}/photo/example.png.
2. Image requirements:
Image name: Not fewer than 2 characters; only Chinese characters, letters, numbers, underscores, and hyphens are allowed.
Image format: jpg, jpeg, png, or webp.
Image size: No larger than 16 MB.
Image aspect ratio: 1:1, 9:16, 16:9, or 4:3.
3. The photo should be a clear front view of the person, with the face centered, a natural expression, and the mouth closed.
IsHaveBackground
bool
No
Applies to image customization types. Whether the trained avatar retains the original background. The default is false: the original background is not retained, and the background can be changed as needed during application.
SexType
string
Yes
Gender:
MALE: Male
FEMALE: Female
Notes
string
No
Custom remarks, up to 100 characters.
TextDriver
string
No
Text content used to generate the driving demo, 4-1000 characters allowed (including SSML tags, and each Chinese character is considered one character).
VoiceDriverCosFile
string
No
Requirements for the audio file path for generating the driving demo:
1. The URL address should be the resource URL of the audio file uploaded to the specified path in Tencent Cloud COS, with an added audio path, such as /customer-pipeline/{digit}/{uuid}/audio/example.wav.
2. The audio file size should not exceed 10 MB, and the supported formats are WAV, MP3, WMA, M4A, and AAC.
AudioId
string
No
Required for the ZERO_SHOT_VOICE customization type. Fill in the AudioId returned by Query Audio Quality Inspection Task Progress after the audio passes inspection.
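
The conditional rules in the parameter table can be checked on the client before a request is submitted. The sketch below is one possible reading of those rules, not the service's own validation. `validate_payload` is a hypothetical helper, "Chinese characters" is approximated by the CJK Unified Ideographs block, rule 3 for AnchorName is applied literally (so names containing spaces are rejected), and the authorization rule is read as "IdentityCosUrl or IdentityWrittenCosUrl, or both".

```python
import re

MAKE_TYPES = {"IMAGE", "IMAGE_GENERAL", "IMAGE_4K", "IMAGE_PHOTO", "VOICE", "ZERO_SHOT_VOICE"}
# Assumption: "Chinese characters" means U+4E00..U+9FFF; length is 2 to 50.
ANCHOR_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{2,50}")


def validate_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    name = payload.get("AnchorName")
    if not isinstance(name, str) or not ANCHOR_NAME_RE.fullmatch(name):
        problems.append("AnchorName must be 2-50 allowed characters")
    make_type = payload.get("MakeType")
    if make_type not in MAKE_TYPES:
        problems.append("MakeType is not a documented value")
    if payload.get("SexType") not in {"MALE", "FEMALE"}:
        problems.append("SexType must be MALE or FEMALE")
    if make_type in MAKE_TYPES:
        if make_type == "ZERO_SHOT_VOICE":
            if not payload.get("AudioId"):
                problems.append("AudioId is required for ZERO_SHOT_VOICE")
        elif not payload.get("MaterialCosUrl"):
            problems.append("MaterialCosUrl is required for " + make_type)
        # Assumed reading: at least one authorization letter URL is needed
        # unless the type is IMAGE_PHOTO or ZERO_SHOT_VOICE.
        if make_type not in {"IMAGE_PHOTO", "ZERO_SHOT_VOICE"}:
            if not (payload.get("IdentityCosUrl") or payload.get("IdentityWrittenCosUrl")):
                problems.append("an authorization letter URL is required for " + make_type)
    if len(payload.get("Notes") or "") > 100:
        problems.append("Notes must be within 100 characters")
    text = payload.get("TextDriver")
    if text is not None and not 4 <= len(text) <= 1000:
        problems.append("TextDriver must be 4-1000 characters")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem at once.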

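The Voice Clone material requirements listed under MaterialCosUrl (zip contents, file count, total size, sample rate, and naming) can also be checked locally before upload. The following is a minimal sketch using only the Python standard library. `check_voice_zip` is a hypothetical helper, 1 GB is read as 1 GiB, and "no spaces or special characters" is interpreted as letters, digits, underscores, and hyphens, since the exact allowed set is not stated.

```python
import re
import wave
import zipfile

MAX_FILES = 10
MAX_TOTAL_BYTES = 1024 ** 3  # assumption: 1 GB read as 1 GiB
MIN_SAMPLE_RATE = 24_000
# Assumed naming rule: letters, digits, underscores, hyphens, lowercase ".wav".
NAME_RE = re.compile(r"[A-Za-z0-9_-]+\.wav")


def check_voice_zip(path) -> list:
    """Return a list of problems found in a Voice Clone zip file."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        if not 1 <= len(infos) <= MAX_FILES:
            problems.append("zip must contain 1 to 10 files")
        if sum(info.file_size for info in infos) > MAX_TOTAL_BYTES:
            problems.append("total audio size exceeds 1 GB")
        for info in infos:
            if info.is_dir() or "/" in info.filename:
                problems.append(f"{info.filename}: folders are not allowed")
                continue
            if not NAME_RE.fullmatch(info.filename):
                problems.append(f"{info.filename}: invalid file name")
                continue
            try:
                # The stdlib wave reader handles PCM WAV; other encodings fail here.
                with zf.open(info) as member, wave.open(member, "rb") as wav:
                    rate = wav.getframerate()
            except (wave.Error, EOFError):
                problems.append(f"{info.filename}: not a readable WAV file")
                continue
            if rate < MIN_SAMPLE_RATE:
                problems.append(f"{info.filename}: sample rate {rate} Hz is below 24 kHz")
    return problems
```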

Response Parameters

Parameters
Type
Mandatory
Description
TaskId
string
Yes
ID of the production task. Call the Progress Query API with this TaskId to obtain the production progress and results.

Request Sample

{
  "Header": {},
  "Payload": {
    "AnchorName": "Jingxuan in a green dress, sitting pose",
    "MakeType": "IMAGE",
    "IdentityCosUrl": "XXXX",
    "MaterialCosUrl": "YYYY",
    "IsRemoveBackground": true
  }
}
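
The sample wraps the request parameters in a Header/Payload envelope. A small sketch of producing that envelope in Python follows. `make_body` is a hypothetical helper, and dropping parameters whose value is None is a design choice of this sketch, not documented behavior.

```python
import json


def make_body(**params) -> str:
    """Wrap the given parameters in the Header/Payload envelope shown in the sample."""
    # Optional parameters passed as None are left out of the payload.
    payload = {key: value for key, value in params.items() if value is not None}
    return json.dumps({"Header": {}, "Payload": payload}, ensure_ascii=False, indent=2)
```

For example, `make_body(AnchorName="Tom", MakeType="IMAGE", SexType="MALE", IdentityCosUrl="XXXX", MaterialCosUrl="YYYY")` yields a body with the same shape as the sample.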

Response Sample

{
  "Header": {
    "Code": 0,
    "DialogID": "",
    "Message": "",
    "RequestID": "123"
  },
  "Payload": {
    "TaskId": "666"
  }
}
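
A caller typically needs only the TaskId from this response. The sketch below extracts it and treats any Header.Code other than 0 as a failure. That interpretation is an assumption based on the sample, since error codes are not listed in this section, and `extract_task_id` and `CustomizationError` are hypothetical names.

```python
import json


class CustomizationError(Exception):
    """Raised when the response does not look like a successful submission."""


def extract_task_id(response_text: str) -> str:
    """Return Payload.TaskId from a response body, or raise CustomizationError."""
    body = json.loads(response_text)
    header = body.get("Header") or {}
    # Assumption: Code 0 means success, as in the sample response.
    if header.get("Code") != 0:
        raise CustomizationError(
            f"Code={header.get('Code')} Message={header.get('Message')!r} "
            f"RequestID={header.get('RequestID')}"
        )
    task_id = (body.get("Payload") or {}).get("TaskId")
    if not task_id:
        raise CustomizationError("response has no TaskId")
    return task_id
```

The returned TaskId is the value to pass to the Progress Query API.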
 
