Cross-Region Disaster Recovery Practices
Last updated: 2024-12-02 17:25:44

Cross-Region Disaster Recovery

Message middleware is a vital component in the technical architecture of business systems. TDMQ for Apache Pulsar already supports disaster recovery across multiple availability zones. To address region-level disasters, it also provides a cross-region disaster recovery solution, which enables customers to quickly migrate their business operations to another region and maintain business continuity.
This document provides an overview of the cross-region disaster recovery solution.

Under normal circumstances, business applications in Region A access the Pulsar cluster in Region A. To prepare for disaster recovery, users need to complete two main actions:
1. Establish cross-region network connectivity using Cloud Connect Network (CCN) to enable communication between the VPCs in the two regions.
2. Synchronize metadata between the two regions via the Pulsar console, including namespaces, Topics, subscriptions, and roles.
When an exception occurs in Region A, the TDMQ for Apache Pulsar console provides a domain name resolution switch feature, which redirects the domain name originally used for the Region A cluster to the disaster recovery cluster in Region B. Clients do not need to modify their access point addresses, so they can access the Region B cluster seamlessly and business continuity is preserved.
Once the exception in Region A is resolved, users need to decide whether to write the messages generated in Region B back to Region A to ensure message integrity. If a write-back is needed, contact our after-sales team for assistance. Afterward, users can switch the access point domain name resolution from the Region B cluster back to the Region A cluster. Once the switch is complete, clients resume normal access to Region A.

Operation Guide

Configuring Disaster Recovery Features

1. In the backup region, create a professional cluster. On the cluster purchase page, enable the Cross-region Replication switch and select the cluster to be backed up.
2. Configure the cluster metadata synchronization linkage through the console:
   Replication linkage name: Define a name for the synchronization linkage.
   Linkage type: Select metadata.
   Source cluster selection: Choose the Pulsar cluster to be backed up.
   Target cluster selection: Select the pre-created disaster recovery cluster in a different region. Only clusters with the same cluster ID will be displayed.
   Replication level: Choose between cluster-level and namespace-level replication.
      Cluster level: Suitable for cold backups of an entire cluster.
      Namespace level: Suitable for scenarios where the clusters in both regions are actively used, with different namespaces distributed across regions. The two regions act as primary and backup for each other.

Establishing CCN

Use Cloud Connect Network to link the production region and the backup region, creating a network access channel between them. This ensures that, in the event of a disaster, clients in the production region can access the backup cluster across regions.
For detailed configuration steps, see the CCN Operation Guide.

When Disaster Occurs

Users can decide to switch client access to the backup region:
1. If the console is available: Initiate a domain name resolution switch via the console.
2. If the console is unavailable: Contact the after-sales architect to request a switch, which will then be initiated by the TDMQ service.

After Disaster Recovery

Users can decide to switch client access back to the original region cluster:
1. Evaluate whether messages need to be written back to the original region. If write-back is required, contact our after-sales team for assistance.
2. Initiate a domain name switch-back via the console to restore normal client access to the original region.

Notes

1. Supported Scope

This feature is supported only in professional clusters.

2. Message Write-Back

Whether to write back messages must be assessed before switching traffic back to the original region. Write-back aims to prevent data loss and ensure data integrity. Be sure to decide whether to perform a write-back before initiating the domain name switch-back.
User-provided information:
The scope of Topics to be migrated, specified by cluster ID, namespace, or an explicit list of Topics.
The start and end time. Messages whose publishTime field in the message header falls within this time range will be identified as data to be migrated.
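To help estimate which messages a given time range would cover, the selection rule above can be sketched as follows. This is only an illustration: the migration itself is performed by the service, the `MessageRecord` type and function names are hypothetical, and treating both boundaries as inclusive is an assumption that the source does not confirm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class MessageRecord:
    """Hypothetical stand-in for a message and its header fields."""
    topic: str
    publish_time_ms: int  # publishTime from the message header, epoch milliseconds
    payload: bytes


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def in_write_back_window(msg: MessageRecord, start: datetime, end: datetime) -> bool:
    """Return True if publishTime falls inside [start, end].

    Inclusive boundaries are an assumption made for this sketch.
    """
    return to_epoch_ms(start) <= msg.publish_time_ms <= to_epoch_ms(end)
```

When choosing the range, it is probably safer to start slightly before the moment traffic was switched to the backup region and end slightly after the switch-back, since any duplicates this introduces can be absorbed by idempotent consumers.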
Impacts of message write-back:
A large number of duplicate messages may occur. The server does not track the complex state machine of offset synchronization between the source and target clusters, so every migrated message is treated as a new, separate message, even if an identical message already exists in the historical data. If duplicate messages impact your business, it is recommended to implement idempotent processing on the client side.
A small number of messages may arrive out of order.
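A minimal sketch of client-side idempotent processing is shown below. It assumes that the producer attaches a stable business key to each message (for example, an order ID carried in the message key or a property). Because migrated messages are treated as new messages, this sketch does not rely on broker-assigned message IDs. The class and parameter names are hypothetical, and the in-memory store is for illustration only; a production system would more likely use a durable store such as a database unique constraint.

```python
from collections import OrderedDict
from typing import Callable


class IdempotentProcessor:
    """Skips payloads whose business key has already been handled.

    Hypothetical helper: keeps a bounded, in-memory record of recent keys.
    """

    def __init__(self, handler: Callable[[bytes], None], max_keys: int = 100_000):
        self._handler = handler
        self._max_keys = max_keys
        self._seen: "OrderedDict[str, None]" = OrderedDict()

    def process(self, business_key: str, payload: bytes) -> bool:
        """Run the handler once per key; return False for duplicates."""
        if business_key in self._seen:
            self._seen.move_to_end(business_key)  # refresh recency
            return False
        self._handler(payload)
        # Record the key only after the handler succeeds, so that a failed
        # attempt can be retried on redelivery.
        self._seen[business_key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the oldest key
        return True
```

In a consumer loop, `process` would be called with the key extracted from each received message, and the message would be acknowledged regardless of whether it was a duplicate.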

3. About Roles

The source cluster should have at least one Role, which does not need to be bound to a namespace. This ensures that during synchronization, the Role and Token remain consistent with the disaster recovery cluster.

4. CCN Configuration

When you configure CCN, the VPC CIDRs of the two regions should not overlap. For example, use 10.0.0.0/16 for Guangzhou and 10.1.0.0/16 for Shanghai. This ensures that CCN can link the two VPCs without IP conflicts.
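The non-overlap requirement can be checked before linking the VPCs. The following sketch uses Python's standard `ipaddress` module; the function names and the region labels are illustrative assumptions, and the CIDRs are the ones from the example above.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share any address."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return net_a.overlaps(net_b)


def find_overlaps(vpc_cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of VPC labels whose CIDRs overlap."""
    return [
        (name_a, name_b)
        for (name_a, cidr_a), (name_b, cidr_b) in combinations(vpc_cidrs.items(), 2)
        if cidrs_overlap(cidr_a, cidr_b)
    ]


if __name__ == "__main__":
    # Hypothetical labels; an empty result means the VPCs can be linked
    # without IP conflicts.
    conflicts = find_overlaps({"guangzhou": "10.0.0.0/16", "shanghai": "10.1.0.0/16"})
    print(conflicts)
```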

5. Domain Name Switch Effectiveness Time

The domain name switch takes approximately 5 seconds to 5 minutes to take effect. This duration consists of two parts: the domain name resolution switch itself, and the time clients need to disconnect and reconnect to the new cluster's Broker.
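During a drill, it can be useful to observe when the first part (the resolution change) becomes visible from a client host. The sketch below polls DNS until the access point domain no longer resolves to any of the old cluster's addresses. It is an assumption-laden illustration: the function names are hypothetical, the default timeout simply mirrors the 5-minute upper bound above, and local DNS caching or the client's own reconnect behavior may cause the observed time to differ.

```python
import socket
import time
from typing import Callable, Optional, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname to its set of IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def wait_for_switch(
    hostname: str,
    old_ips: Set[str],
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Set[str]]:
    """Poll until the hostname resolves only to addresses outside old_ips.

    Returns the new address set, or None if the timeout elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            current = resolver(hostname)
        except OSError:
            current = set()  # treat a resolution failure as "not switched yet"
        if current and current.isdisjoint(old_ips):
            return current
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

The resolver, sleep, and clock parameters are injectable so the function can be exercised without real DNS traffic or real waiting.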

6. Post-Switch Actions During a Disaster

After traffic is switched to the disaster recovery cluster during a disaster, avoid making metadata changes on the backup cluster, such as modifying namespace attributes or creating Topics.
