TDSQL-A for PostgreSQL ensures cluster disaster recovery capabilities in multiple dimensions:
Strong Sync Replication
TDSQL-A for PostgreSQL supports strong sync replication. Ensuring that the primary and standby nodes hold identical data is the basis of the entire disaster recovery system: if the primary node fails, the database service can be switched to the standby node without any data loss. The strong sync replication mechanism requires that success be returned to the client only after the user request has been executed and the log has been written to the standby node, guaranteeing that the data on the primary and standby nodes is always consistent.
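Since TDSQL-A for PostgreSQL is PostgreSQL-compatible, the effect of strong sync replication can be illustrated with the stock PostgreSQL parameters. The following is a minimal sketch, assuming a psycopg2 connection and the standard `synchronous_standby_names` / `synchronous_commit` settings; TDSQL-A may manage these internally, and the connection details are placeholders:

```python
# Minimal sketch: inspecting strong (synchronous) replication settings
# over a PostgreSQL-compatible connection. Assumes the stock PostgreSQL
# parameters; TDSQL-A may manage these internally, and the connection
# details below are placeholders.
import psycopg2

conn = psycopg2.connect(host="10.0.0.1", port=5432,
                        dbname="postgres", user="dba", password="***")
with conn.cursor() as cur:
    # Which standbys must confirm a commit before success is returned.
    cur.execute("SHOW synchronous_standby_names")
    print("sync standbys:", cur.fetchone()[0])

    # 'on' (or stricter) means a commit returns only after the log
    # record has also been flushed on the synchronous standby.
    cur.execute("SHOW synchronous_commit")
    print("synchronous_commit:", cur.fetchone()[0])
conn.close()
```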
Primary/Standby High Availability
The primary/standby high availability solution of TDSQL-A for PostgreSQL mainly uses multi-replica redundancy within each node group to ensure that service interruptions are avoided entirely or last only a moment. If the primary node in a group fails and cannot be recovered, a new primary node is automatically selected from the corresponding standby nodes to continue providing service. Based on primary/standby high availability, TDSQL-A for PostgreSQL supports the following features:
1. Automated failover: if the primary node in the cluster fails, the system automatically selects a new primary node from the corresponding standby nodes and isolates the failed node. The strong sync replication policy ensures complete primary/standby data consistency during a failover, fully meeting finance-grade data consistency requirements.
2. Failure recovery: if a standby node loses data due to a disk failure, the database admin (DBA) can recover it by rebuilding the standby node or by adding a standby server on a new physical node, restoring the primary/standby relationship and thus ensuring system reliability.
3. Replica switch: each node in the primary/standby architecture (which can contain one primary node and multiple standby nodes) holds a complete data replica, and the DBA can switch service to any of them as needed.
4. Do-Not-Switch configuration: the DBA can specify a period of time during which failover will not be performed.
5. Cross-AZ deployment: even if the primary and standby nodes are in different data centers, data can be replicated in real time through Direct Connect. If the local node is the primary and the remote node is the standby, the local node is accessed first; if it fails or becomes unreachable, the remote standby node is promoted to primary to continue providing service (see the client-side sketch after this list).
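To illustrate automated failover and cross-AZ deployment from the client side, here is a hypothetical sketch. It assumes a PostgreSQL-compatible multi-host connection string (a stock libpq/psycopg2 feature in libpq 10+); the host names are placeholders, and since TDSQL-A normally exposes a single access IP, this is only one possible client pattern:

```python
# Hypothetical failover-aware client: libpq (and psycopg2 on top of it)
# accepts multiple hosts and, with target_session_attrs=read-write,
# skips any node that cannot accept writes, so the session lands on
# whichever node is currently the primary. Host names are placeholders.
import psycopg2

conn = psycopg2.connect(
    "host=node-az1,node-az2 port=5432,5432 "
    "dbname=postgres user=app password=*** "
    "target_session_attrs=read-write"
)
with conn.cursor() as cur:
    cur.execute("SELECT pg_is_in_recovery()")  # False on the primary
    print("connected to primary:", not cur.fetchone()[0])
conn.close()
```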
TDSQL-A for PostgreSQL supports a high availability solution based on strong sync replication. If the primary node fails, the system immediately and automatically selects the optimal standby node to take over. The switch process is imperceptible to users, and the access IP remains unchanged. TDSQL-A for PostgreSQL monitors system components continuously (24/7); if a failure occurs, it automatically restarts or isolates the failed node and selects a new primary node from the standby nodes to continue providing service.
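Because the access IP survives a failover, a client only needs to reconnect after the brief interruption. A minimal sketch of such a reconnect-and-retry loop, assuming psycopg2 and placeholder connection parameters:

```python
# Minimal reconnect-and-retry sketch: because the access IP survives a
# failover, the client reopens the same connection after a brief
# interruption. All connection parameters are placeholders.
import time
import psycopg2

DSN = "host=10.0.0.100 port=5432 dbname=postgres user=app password=***"

def run_with_retry(sql, attempts=5, delay=2.0):
    for _ in range(attempts):
        try:
            conn = psycopg2.connect(DSN)
            try:
                with conn.cursor() as cur:
                    cur.execute(sql)
                    return cur.fetchall()
            finally:
                conn.close()
        except psycopg2.OperationalError:
            # The primary may be mid-failover; wait, then reconnect to
            # the same IP, which now routes to the new primary.
            time.sleep(delay)
    raise RuntimeError("database still unavailable after retries")

print(run_with_retry("SELECT now()"))
```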