Scene Overview
Scene Introduction
1V1 Audio and Video Call is a frequently used scene similar to WeChat calls. TRTC (Tencent Real-Time Communication) delivers audio call latency below 300 ms, tolerates packet loss of over 80%, and withstands network jitter of over 1000 ms, keeping audio calls smooth and stable even on weak networks. Video calls support 720p, 1080p, 2K, and (on specific devices) higher-than-2K resolutions. Combined with the call signaling management APIs provided by Chat, it adapts easily to various use cases. In addition, we offer scene-based Audio/Video Call components that can be reused directly, significantly reducing development costs. For details, see Component Introduction.
Scene Approach
The 1V1 Audio and Video Call feature covers the basic functionality of a WeChat-like calling application and can also be extended to a wide range of use cases. A few common scenes are introduced below.
Game Socializing
In the gaming field, Audio/Video Call facilitates real-time interaction among players, enhancing the overall gaming experience. Players can hold voice or video chats with friends within the game, share gaming experiences and techniques, or collaborate on strategy. Audio and video calls are now widely used in game socializing features such as team voice chat.
Online Customer Service
1V1 Audio and Video Call enables customers to communicate with customer service representatives in real-time, resulting in more effective problem-solving. Compared to traditional text-based customer service, audio and video calls allow customers to describe their issues more vividly and enable service personnel to understand customer needs more clearly, thus improving the efficiency of problem resolution. For instance, dispute resolution and insurance consulting are excellent use cases for this type of communication.
Online Consultation
In the healthcare field, 1V1 Audio and Video Call enables patients to consult with doctors remotely. Patients can describe their symptoms via Audio/Video Call, and doctors can make preliminary diagnoses based on the descriptions. This method not only saves time and energy for patients but also allows doctors to serve more patients, improving the utilization of medical resources.
Financial Review
In the financial field, 1V1 Audio and Video Call can be utilized for identity verification and risk assessment. When performing online financial management, account opening, or face-to-face signing, in accordance with national regulatory requirements, audio and video recording services must be provided to create transaction record videos for archiving and reference. Audio/Video Calls are extensively used in the financial review sector, not only enhancing the efficiency of reviews but also mitigating the risk of fraud.
Implementation Scheme
Typically, implementing a basic 1V1 Audio and Video Call scene involves multiple feature modules. We can divide the implementation scheme into three parts: Call Signaling Control, Audio/Video Call, and Call Feature Control. The key actions and features of each part are shown in the table below:
Module | Key Actions and Features |
Call Signaling Control | Call, Answer, Decline, Hang up |
Audio/Video Call | Voice Call, Video Call |
Call Feature Control | Enable/Disable Microphone/Camera/Speaker, Earpiece/Hands-free Switching, Camera Switching, Window Size Switching, Network Status Prompt, Call Duration Statistics |
The complete implementation of Audio/Video Call scenes often relies on the combined capabilities of real-time audio and video and instant messaging. The real-time audio and video module is responsible for audio and video communication and device status control, while the instant messaging module handles signaling transmission and message push. The main architecture of Audio/Video Call scene is shown below:
Call Signaling Control
Following a complete call process, call signaling can be divided into Call, Answer, Decline, and Hang up. Taking Chat as an example, the following describes the implementation logic of call signaling control after the Log-in Operation has completed.
Call
Call signaling can be subdivided into initiating a call, canceling a call, and call timeout, and their invocation sequence is shown below:
Initiating a call: The caller sends a call invitation to the callee, displays the call page, and plays the ringtone; the callee receives the invitation, displays the call page, and plays the ringtone.
Canceling a call: The caller can cancel the call invitation midway, destroy the call page, and stop the ringtone; the callee receives the cancellation notification, terminates the call page, and stops the ringtone.
Call Timeout: If there is no response within the invitation's predefined timeout period, both the caller and callee receive a timeout notification, terminate the call page, and stop the ringtone.
Answer
Upon receiving a call invitation from the caller, the callee can choose to answer the call, initiating the Audio/Video Call.
After answering the call, both parties start interactive audio and video communication. For more details on implementation logic, see Audio/Video Call.
Decline
The decline signaling can be subdivided into proactive rejection and busy line rejection, and their call sequence is shown below:
Proactive Rejection: The callee rejects the call invitation upon receipt, terminates the call page, and stops the ringtone; the caller receives the rejection notice, then also terminates the call page and stops the ringtone.
Busy Line Rejection: If a call is already in progress when the invitation arrives, the callee rejects it directly; the caller receives the rejection notice, terminates the call page, and stops the ringtone.
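Both rejection paths above end in the same kind of rejection notice, so the caller can only tell them apart if the reason travels with it. The following is a minimal sketch of that idea, not the Chat SDK API: the `RejectReason` enum, the `reject_reason=` wire format, and the `IncomingCallPolicy` and `RejectPrompts` helpers are hypothetical names introduced for illustration. The string returned by `toCustomData` is assumed to be passed as the custom data of whatever reject call your signaling layer provides.

```java
import java.util.Locale;

/** Why the callee rejected; carried in the custom data of the reject signal. */
enum RejectReason {
    DECLINED, // the callee chose to decline
    BUSY;     // the callee was already in another call

    /** Encodes the reason as custom data (hypothetical wire format). */
    String toCustomData() {
        return "reject_reason=" + name().toLowerCase(Locale.ROOT);
    }

    /** Decodes custom data; missing or unknown data is treated as a plain decline. */
    static RejectReason fromCustomData(String data) {
        if (data != null && data.equals(BUSY.toCustomData())) {
            return BUSY;
        }
        return DECLINED;
    }
}

/** Callee side: reject automatically when a call is already in progress. */
final class IncomingCallPolicy {
    private boolean inCall;

    void setInCall(boolean inCall) {
        this.inCall = inCall;
    }

    /** Returns BUSY to reject immediately, or null to ring and let the user decide. */
    RejectReason autoRejectReason() {
        return inCall ? RejectReason.BUSY : null;
    }
}

/** Caller side: choose the prompt shown after a rejection notice arrives. */
final class RejectPrompts {
    private RejectPrompts() {}

    static String promptFor(RejectReason reason) {
        return reason == RejectReason.BUSY
                ? "The other party is busy"
                : "The other party declined the call";
    }
}
```

Treating unknown custom data as a plain decline keeps older clients, which send no reason at all, working with newer ones.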
Note:
Both proactive and busy line rejections are implemented with the reject signal, so they must be distinguished through the custom data field in the signaling.
Hang up
During a call, either the caller or the callee can opt to hang up at any time, thus ending the audio or video call.
Taking the caller hanging up as an example: The caller performs the exit operation, the callee receives a remote exit notification, also performs the exit operation, and the call between both parties ends.
Note:
The hangup operation does not use the IM signaling notification but is implemented through the TRTC (Tencent Real-Time Communication) remote user exit callback notification.
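The call, answer, decline, and hang-up flows above can be summarized as a small state machine that each client keeps locally. The sketch below is an illustration under stated assumptions, not the Chat or TRTC API: `CallState`, `CallEvent`, and `CallStateMachine` are hypothetical names, and the events are assumed to be fed in from the signaling callbacks (invitation received, accepted, rejected, cancelled, timed out) and, for hang-up, from the TRTC remote user exit callback.

```java
/** Local view of the call, kept by both the caller and the callee. */
enum CallState { IDLE, CALLING, RINGING, IN_CALL }

/** Signaling and room events that move the call between states. */
enum CallEvent { INVITE_SENT, INVITE_RECEIVED, ACCEPTED, REJECTED, CANCELLED, TIMED_OUT, HUNG_UP }

final class CallStateMachine {
    private CallStateMachine() {}

    /** Returns the state after the event; events that do not apply leave the state unchanged. */
    static CallState next(CallState state, CallEvent event) {
        switch (state) {
            case IDLE:
                if (event == CallEvent.INVITE_SENT) return CallState.CALLING;     // caller side
                if (event == CallEvent.INVITE_RECEIVED) return CallState.RINGING; // callee side
                return state;
            case CALLING:
            case RINGING:
                // A pending invitation is either answered or ends without a call.
                if (event == CallEvent.ACCEPTED) return CallState.IN_CALL;
                if (event == CallEvent.REJECTED
                        || event == CallEvent.CANCELLED
                        || event == CallEvent.TIMED_OUT) return CallState.IDLE;
                return state;
            case IN_CALL:
                // Either party leaving the room ends the call for both.
                if (event == CallEvent.HUNG_UP) return CallState.IDLE;
                return state;
            default:
                return state;
        }
    }

    /** The call page and ringtone are active only while an invitation is pending. */
    static boolean shouldPlayRingtone(CallState state) {
        return state == CallState.CALLING || state == CallState.RINGING;
    }
}
```

An `INVITE_RECEIVED` event that arrives in any state other than `IDLE` leaves the state untouched, which is the point where the callee would send a busy line rejection.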
Audio/Video Call
Audio Call
After connecting, both parties need to enter the same TRTC (Tencent Real-Time Communication) room, start local audio capture and streaming, and mutually pull each other's audio stream to achieve a voice call.
The call sequence of the audio- and video-related APIs for starting and ending a call is shown in the figure below:
Note:
startLocalAudio starts local audio capture and also lets you set the audio quality parameter. For voice calls, TRTC_AUDIO_QUALITY_SPEECH is recommended.
Under the SDK's default automatic subscription mode, a user who enters a room immediately receives the audio streams in that room, which are decoded and played automatically without manual pulling.
Video Call
During the calling phase, both parties must set video encoding parameters and start the local video preview. After connecting, both parties need to enter the same TRTC room, start local audio capture and streaming, and pull each other's audio and video streams to achieve a video call.
The call sequence of the audio- and video-related APIs for initiating, starting, and ending a call is shown in the figure below:
Note:
Calling startLocalPreview before entering the room only starts the camera preview; the SDK waits until you call enterRoom to start streaming.
startLocalAudio starts local audio capture and also lets you set the audio quality parameter. For video calls, TRTC_AUDIO_QUALITY_SPEECH is recommended.
In the SDK's default automatic subscription mode, audio is decoded and played back automatically, while video requires a manual call to startRemoteView to pull and render the remote video stream.
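The ordering described above (preview during the calling phase, room entry and audio capture after the call connects, remote video pulled on demand) can be sketched without the SDK. In the sketch below, `MediaEngine` is a hypothetical stand-in whose method names mirror the SDK calls named in the text but whose parameters are omitted; it is not the real TRTCCloud class. `VideoCallSession` and `RecordingEngine` are likewise illustrative names, and the `onUserVideoAvailable` hook is assumed to be driven by the SDK callback that reports a remote user's video becoming available.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the SDK calls named in the text; parameters are omitted on purpose. */
interface MediaEngine {
    void setVideoEncoderParam();
    void startLocalPreview();
    void enterRoom();
    void startLocalAudio();
    void startRemoteView(String userId);
    void stopRemoteView(String userId);
    void stopLocalAudio();
    void stopLocalPreview();
    void exitRoom();
}

/** Drives the engine in the order described for a video call. */
final class VideoCallSession {
    private final MediaEngine engine;

    VideoCallSession(MediaEngine engine) {
        this.engine = engine;
    }

    /** Calling phase: configure encoding and show the local preview; nothing is streamed yet. */
    void onCalling() {
        engine.setVideoEncoderParam();
        engine.startLocalPreview();
    }

    /** The call was answered: enter the room (streaming starts) and capture the microphone. */
    void onConnected() {
        engine.enterRoom();
        engine.startLocalAudio();
    }

    /** Remote audio plays automatically; remote video has to be pulled explicitly. */
    void onUserVideoAvailable(String userId, boolean available) {
        if (available) {
            engine.startRemoteView(userId);
        } else {
            engine.stopRemoteView(userId);
        }
    }

    /** The call ended: release capture devices, then leave the room. */
    void onEnded() {
        engine.stopLocalAudio();
        engine.stopLocalPreview();
        engine.exitRoom();
    }
}

/** Test double that records call names so the order can be inspected. */
final class RecordingEngine implements MediaEngine {
    final List<String> calls = new ArrayList<>();

    public void setVideoEncoderParam() { calls.add("setVideoEncoderParam"); }
    public void startLocalPreview() { calls.add("startLocalPreview"); }
    public void enterRoom() { calls.add("enterRoom"); }
    public void startLocalAudio() { calls.add("startLocalAudio"); }
    public void startRemoteView(String userId) { calls.add("startRemoteView:" + userId); }
    public void stopRemoteView(String userId) { calls.add("stopRemoteView:" + userId); }
    public void stopLocalAudio() { calls.add("stopLocalAudio"); }
    public void stopLocalPreview() { calls.add("stopLocalPreview"); }
    public void exitRoom() { calls.add("exitRoom"); }
}
```

In a real app, an adapter implementing `MediaEngine` would forward each call to the SDK with the appropriate parameters; an audio-only call would follow the same outline without the preview and remote view steps.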
Call Feature Control
During an Audio/Video Call, various feature controls may be involved, such as turning the microphone, speaker, or camera on and off, hands-free/earpiece switching, camera switching, window size switching, network status prompts, and call duration statistics. Most of these feature controls and status prompts are implemented through the TRTC SDK. Their implementations are introduced one by one below.
Turn on/off Microphone
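Android (Java):
mTRTCCloud.muteLocalAudio(false); // Turn on the microphone
mTRTCCloud.muteLocalAudio(true);  // Turn off the microphone
iOS (Objective-C):
[self.trtcCloud muteLocalAudio:NO];  // Turn on the microphone
[self.trtcCloud muteLocalAudio:YES]; // Turn off the microphone
Turn on/off Speaker
Android (Java):
mTRTCCloud.muteAllRemoteAudio(false); // Turn on the speaker (play remote audio)
mTRTCCloud.muteAllRemoteAudio(true);  // Turn off the speaker (mute all remote audio)
iOS (Objective-C):
[self.trtcCloud muteAllRemoteAudio:NO];  // Turn on the speaker (play remote audio)
[self.trtcCloud muteAllRemoteAudio:YES]; // Turn off the speaker (mute all remote audio)
Turn on/off Camera
Android (Java):
mTRTCCloud.startLocalPreview(isFrontCamera, videoView); // Turn on the camera and render the preview in videoView
mTRTCCloud.stopLocalPreview();                          // Turn off the camera
iOS (Objective-C):
[self.trtcCloud startLocalPreview:self.isFrontCamera view:self.videoView]; // Turn on the camera and render the preview in videoView
[self.trtcCloud stopLocalPreview];                                         // Turn off the camera
Hands-free/Earpiece Switching
Android (Java):
mTRTCCloud.getDeviceManager().setAudioRoute(TXDeviceManager.TXAudioRoute.TXAudioRouteEarpiece);     // Switch to the earpiece
mTRTCCloud.getDeviceManager().setAudioRoute(TXDeviceManager.TXAudioRoute.TXAudioRouteSpeakerphone); // Switch to the speaker (hands-free)
iOS (Objective-C):
[[self.trtcCloud getDeviceManager] setAudioRoute:TXAudioRouteEarpiece];     // Switch to the earpiece
[[self.trtcCloud getDeviceManager] setAudioRoute:TXAudioRouteSpeakerphone]; // Switch to the speaker (hands-free)
Camera Switching
Android (Java):
boolean isFrontCamera = mTRTCCloud.getDeviceManager().isFrontCamera(); // Check whether the front camera is in use
mTRTCCloud.getDeviceManager().switchCamera(!isFrontCamera);            // Switch to the other camera
iOS (Objective-C):
BOOL isFrontCamera = [[self.trtcCloud getDeviceManager] isFrontCamera]; // Check whether the front camera is in use
[[self.trtcCloud getDeviceManager] switchCamera:!isFrontCamera];        // Switch to the other camera
Window Size Switching
Android (Java):
mTRTCCloud.updateLocalView(previewView);                                                    // Render the local preview in another view
mTRTCCloud.updateRemoteView(userId, TRTCCloudDef.TRTC_VIDEO_STREAM_TYPE_BIG, previewView); // Render the remote user's video in another view
iOS (Objective-C):
[self.trtcCloud updateLocalView:self.previewView];                                                        // Render the local preview in another view
[self.trtcCloud updateRemoteView:self.previewView streamType:TRTCVideoStreamTypeBig forUser:self.userId]; // Render the remote user's video in another view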
Network Status Prompt
Android (Java):
@Override
public void onNetworkQuality(TRTCCloudDef.TRTCQuality localQuality, ArrayList<TRTCCloudDef.TRTCQuality> remoteQuality) {
    if (remoteQuality.size() > 0) {
        switch (remoteQuality.get(0).quality) {
            case TRTCCloudDef.TRTC_QUALITY_Excellent:
                Log.i(TAG, "The other party's network is very good");
                break;
            case TRTCCloudDef.TRTC_QUALITY_Good:
                Log.i(TAG, "The other party's network is quite good");
                break;
            case TRTCCloudDef.TRTC_QUALITY_Poor:
                Log.i(TAG, "The other party's network is average");
                break;
            case TRTCCloudDef.TRTC_QUALITY_Bad:
                Log.i(TAG, "The other party's network is poor");
                break;
            case TRTCCloudDef.TRTC_QUALITY_Vbad:
                Log.i(TAG, "The other party's network is very poor");
                break;
            case TRTCCloudDef.TRTC_QUALITY_Down:
                Log.i(TAG, "The other party's network is extremely poor");
                break;
            default:
                Log.i(TAG, "Undefined");
                break;
        }
    }
}
iOS (Objective-C):
#pragma mark - TRTCCloudDelegate
- (void)onNetworkQuality:(TRTCQualityInfo *)localQuality remoteQuality:(NSArray<TRTCQualityInfo *> *)remoteQuality {
    if (remoteQuality.count > 0) {
        switch (remoteQuality[0].quality) {
            case TRTCQuality_Unknown:
                NSLog(@"Undefined");
                break;
            case TRTCQuality_Excellent:
                NSLog(@"The other party's network is very good");
                break;
            case TRTCQuality_Good:
                NSLog(@"The other party's network is quite good");
                break;
            case TRTCQuality_Poor:
                NSLog(@"The other party's network is average");
                break;
            case TRTCQuality_Bad:
                NSLog(@"The other party's network is relatively poor");
                break;
            case TRTCQuality_Vbad:
                NSLog(@"The other party's network is very poor");
                break;
            case TRTCQuality_Down:
                NSLog(@"The other party's network is extremely poor");
                break;
            default:
                break;
        }
    }
}
Note:
The userId field of localQuality is empty; localQuality represents the network quality assessment of the local user.
remoteQuality represents the network quality assessment of the remote user, which is influenced by factors on both the remote and local sides.
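Because the remote assessment is also affected by the local side, a prompt that blames the other party is more trustworthy when the local network is checked first. The sketch below shows one possible rule. The `NetQuality` enum mirrors the quality levels handled in the callbacks above, but it, the `NetworkPrompt` class, the prompt wording, and the choice of `BAD` as the threshold are all assumptions made for illustration rather than SDK definitions.

```java
/** Quality levels in the same order as the callback cases, from best to worst after UNKNOWN. */
enum NetQuality { UNKNOWN, EXCELLENT, GOOD, POOR, BAD, VERY_BAD, DOWN }

final class NetworkPrompt {
    /** Levels at or beyond this one trigger a prompt (threshold chosen for illustration). */
    private static final NetQuality THRESHOLD = NetQuality.BAD;

    private NetworkPrompt() {}

    /** UNKNOWN sorts before every real level, so it never counts as weak. */
    static boolean isWeak(NetQuality quality) {
        return quality.compareTo(THRESHOLD) >= 0;
    }

    /**
     * Returns the prompt to show, or null when no prompt is needed.
     * The local side is checked first because a weak local network also
     * degrades the remote assessment.
     */
    static String promptFor(NetQuality local, NetQuality remote) {
        if (isWeak(local)) {
            return "Your network is poor";
        }
        if (isWeak(remote)) {
            return "The other party's network is poor";
        }
        return null;
    }
}
```

Inside the network quality callback, the SDK's quality values would first be mapped to `NetQuality`, and the returned prompt shown only when it is not null.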
Call duration statistics
It is recommended to use the time when a remote user joins the TRTC (Tencent Real-Time Communication) room as the start time for calculating call duration, and the time when the local user exits the room as the end time for calculating call duration.
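Android (Java):
// Timestamps are in milliseconds; callDuration is in seconds.
long callStartTime = 0;
long callFinishTime = 0;
long callDuration = 0;
@Override
public void onRemoteUserEnterRoom(String userId) {
    // The call starts when the remote user joins the room.
    callStartTime = System.currentTimeMillis();
}
@Override
public void onExitRoom(int reason) {
    // The call ends when the local user exits the room.
    callFinishTime = System.currentTimeMillis();
    // If the remote user never joined, callStartTime is still 0 and the duration is 0.
    callDuration = callStartTime > 0 ? (callFinishTime - callStartTime) / 1000 : 0;
}
iOS (Objective-C):
// Timestamps are in seconds since 1970; callDuration is in whole seconds.
@property (nonatomic, assign) NSTimeInterval callStartTime;
@property (nonatomic, assign) NSTimeInterval callFinishTime;
@property (nonatomic, assign) NSInteger callDuration;
- (void)onRemoteUserEnterRoom:(NSString *)userId {
    // The call starts when the remote user joins the room.
    self.callStartTime = [[NSDate date] timeIntervalSince1970];
}
- (void)onExitRoom:(NSInteger)reason {
    // The call ends when the local user exits the room.
    self.callFinishTime = [[NSDate date] timeIntervalSince1970];
    // If the remote user never joined, callStartTime is still 0 and the duration is 0.
    self.callDuration = self.callStartTime > 0 ? (NSInteger)(self.callFinishTime - self.callStartTime) : 0;
}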
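The same bookkeeping can be isolated in a small class so that it is testable without the SDK. This is a sketch under stated assumptions rather than part of TRTC: `CallTimer` and its methods are hypothetical names, the clock is injected so a test can control time, and a call in which the remote user never joined is reported as zero seconds.

```java
import java.util.Locale;
import java.util.function.LongSupplier;

/** Measures call duration from the remote user's entry to the local user's exit. */
final class CallTimer {
    private final LongSupplier clockMillis; // injected time source, in milliseconds
    private boolean started;
    private long startMillis;

    CallTimer(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Call from the remote-user-entered callback; only the first entry starts the timer. */
    void onRemoteUserEnterRoom() {
        if (!started) {
            started = true;
            startMillis = clockMillis.getAsLong();
        }
    }

    /** Call from the exit-room callback; returns whole seconds, or 0 if the call never connected. */
    long onExitRoom() {
        if (!started) {
            return 0;
        }
        started = false; // ready for the next call
        long elapsedMillis = clockMillis.getAsLong() - startMillis;
        return Math.max(0L, elapsedMillis / 1000);
    }

    /** Formats a duration in seconds as mm:ss for display. */
    static String format(long seconds) {
        return String.format(Locale.ROOT, "%02d:%02d", seconds / 60, seconds % 60);
    }
}
```

In an app, the timer could be created with `new CallTimer(System::currentTimeMillis)`; on Android, a monotonic source such as `SystemClock.elapsedRealtime()` is less sensitive to wall-clock changes during a call.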
Note:
If the app is force-closed or the network disconnects, the client may not be able to record these times. In that case, use Server-side Event Callback to monitor room entry and exit events and calculate the call duration.
Advanced Features
On-Cloud Recording
In many 1V1 Audio and Video Call scenes, call content must be recorded and stored for archiving and later analysis. TRTC's upgraded on-cloud recording no longer relies on CSS (Cloud Streaming Services) and does not require relaying streams to CSS; it records audio and video on TRTC's internal real-time recording cluster, offering a more complete and unified recording experience.
Single Stream Recording: Record the audio and video streams of each party in the room into separate files.
Mixed Stream Recording: Record all the audio and video media streams in the same room into one file.
Note:
For a detailed introduction and activation guide to TRTC On-Cloud Recording, see On-Cloud Recording.
Video Beauty Effects
In video call scenes, beauty effects are a frequently used feature. They enhance the user's appearance and also make the call more engaging through various sticker effects. TRTC supports the integration of Tencent Beauty Special Effects as well as mainstream third-party beauty products such as Volcano Beauty and Xiangxin Beauty.
Beauty Enhancement Integration Process
API Call Sequence
Comparison of Beauty Enhancement Products
|
| The basic effect is good, advanced effect for big eyes/slim faces is significant. | Moderately Low | Moderate | Supported | Android/iOS/PC/Flutter/Web/Mini Program |
| The basic effect is good, advanced effects like big eyes/slim faces are average. | Moderately High | Moderate | Supported | Android/iOS/PC/Unity |
| The basic effect is good, advanced effects like big eyes/slim faces are relatively good. | Moderately High | Relatively High | Supported | Android/iOS/PC/Linux |
Offline Message Push
In Audio/Video Call scenes, the offline message push feature is usually necessary, allowing the called user's App to receive new incoming call messages even when it's not online.
Android:
1. Register your application with vendor push platforms.
2. Configure the IM console.
3. Configure the redirected-to page for offline push.
4. Configure vendor push rules.
5. Integrate the vendor push SDK.
6. Sync frontend and backend status.
7. Send offline push messages.
8. Parse offline push messages.
iOS:
1. Apply for an APNs/VoIP Push certificate.
2. Upload the certificate to the IM console.
3. The app requests a token from Apple's backend.
4. Log in to the IM SDK and then upload the token to Tencent Cloud.
5. Send offline push messages.
6. Parse offline push messages.
Supporting Products for the Solution
|
Access Layer | | Provides low-latency, high-quality real-time audio and video interaction solutions, which are the basic infrastructure capabilities for Audio/Video Call scenes. |
Access Layer | | Provides reliable and stable signaling transmission, custom message sending and receiving, to implement call signaling control in Audio/Video Call scenes. |
Access Layer | | Provides real-time effects processing capabilities such as beauty, filtering, makeup, fun stickers, emojis, and virtual avatars. |
Cloud Services | | Aimed at audio, video, and images, it provides an all-in-one high-quality media service including production upload, storage, transcoding, MPS (Media Processing Service), media AI, accelerated distribution and playback, and copyright protection. |
Data Storage | | Provides storage services for audio and video recording files, as well as audio and video slicing files. |