Data Subscription Issues

Last updated: 2024-09-09 21:34:48

    MySQL Exception Handling

    Configuration and Startup Errors

    The following situations can cause the connector to fail to start, log errors or exceptions, and then stop running:
    The connector configuration is invalid.
    The connector cannot connect to the MySQL server with the provided configuration parameters.
    The connector attempts to resume from the point of failure after a restart, but the MySQL binlog has been purged and the required historical records are no longer available.
    When any of these situations occurs, the error message gives details of the failure and may suggest troubleshooting steps. After resolving the problem, you can restart the connector service.
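    As an illustration of the kind of configuration these checks apply to, a minimal Debezium MySQL connector configuration might look like the sketch below. Every host name, credential, and identifier here is a placeholder, not a value from this document; check the property names against your Debezium version's documentation.

```python
import json

# Hypothetical Debezium MySQL connector configuration; every value below
# is a placeholder and must be replaced with your environment's settings.
connector_config = {
    "name": "mysql-connector",  # connector name (placeholder)
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql.example.com",  # assumed host
        "database.port": "3306",
        "database.user": "debezium",               # assumed user
        "database.password": "secret",             # assumed password
        "database.server.id": "184054",            # must be unique in the cluster
        "database.server.name": "myserver",        # logical name, prefixes topics
        "database.include.list": "inventory",      # databases to capture
    },
}

# A quick sanity check can catch the "invalid configuration" startup
# failures described above before the connector is ever submitted.
required = ["connector.class", "database.hostname", "database.port",
            "database.user", "database.password"]
missing = [k for k in required if not connector_config["config"].get(k)]
print(json.dumps(connector_config, indent=2))
print("missing keys:", missing)  # an empty list means the basics are present
```

    Validating required keys before submitting the configuration turns a connector startup failure into an earlier, clearer error.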

    MySQL Becomes Unavailable

    If the MySQL service becomes unavailable, the connector stops working; restart the connector service once MySQL is available again. If your MySQL cluster uses GTIDs, you can restart the connector service immediately; it will connect to another server in the cluster and continue reading the binlog from the last committed transaction.

    Kafka Connect Graceful Shutdown

    If Kafka Connect is running in distributed mode and a Kafka Connect process stops gracefully, Kafka Connect migrates that process's connector tasks to another Kafka Connect process in the group before shutting down. The new connector tasks continue processing exactly where the previous tasks stopped. Processing pauses briefly while the connector tasks stop and restart on the new process.

    Kafka Connect Crash

    When Kafka Connect crashes, the process stops immediately without committing the most recently processed offsets. In a distributed deployment, Kafka Connect restarts the connector tasks on another process, but the new process cannot obtain the latest offsets from the crashed process and resumes from the last committed offsets, so some of the same events may be delivered again.
    However, each change event message includes the connector's metadata, which consumers can use to identify duplicate deliveries.
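    The metadata-based duplicate check described above can be sketched in a few lines. Here events are plain dicts whose source block carries the binlog file name and position, as in a typical Debezium MySQL event envelope; the field names are assumptions, not guaranteed by this document.

```python
# Each Debezium change event carries a "source" block identifying where in
# the binlog it originated. A consumer can use (file, position) as a
# de-duplication key to drop events redelivered after a Kafka Connect crash.

def dedupe(events):
    """Yield only events whose binlog (file, pos) has not been seen before."""
    seen = set()
    for event in events:
        src = event["source"]
        key = (src["file"], src["pos"])  # binlog file name and offset
        if key in seen:
            continue                     # duplicate redelivery: skip it
        seen.add(key)
        yield event

# Simulated redelivery: the second event arrives twice after a crash.
events = [
    {"source": {"file": "mysql-bin.000003", "pos": 154}, "op": "c"},
    {"source": {"file": "mysql-bin.000003", "pos": 310}, "op": "u"},
    {"source": {"file": "mysql-bin.000003", "pos": 310}, "op": "u"},  # duplicate
]
unique = list(dedupe(events))
print(len(unique))  # 2
```

    In a long-running consumer you would bound the `seen` set, for example by evicting keys older than the last committed consumer offset.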

    Kafka Service Becomes Unavailable

    When the Kafka service becomes unavailable, the Debezium MySQL connector will pause until it re-establishes connection with the Kafka service.

    MySQL Binlog Cleanup

    If the Debezium MySQL connector stops for too long, the MySQL server may purge binlog files, including the file containing the connector's last-read position. When the connector restarts and finds that its saved position has been purged, it attempts to re-initialize a snapshot. If snapshots are disabled, the connector terminates with an error.
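    Whether the connector re-snapshots or fails in this situation is governed by its snapshot mode. The toy model below is a sketch of that decision, not connector code; `snapshot.mode` with values such as `when_needed` and `never` is a standard Debezium MySQL property, but verify the supported values for your connector version.

```python
# Sketch of how snapshot.mode affects recovery when the saved binlog
# position has been purged from the server:
#   "when_needed": take a new snapshot when the saved position is gone.
#   "never":       never snapshot; the connector fails instead.
snapshot_settings = {
    "snapshot.mode": "when_needed",  # tolerant choice for long connector downtime
}

def on_missing_binlog_position(mode):
    """Model the connector's decision when its saved position was purged."""
    if mode == "when_needed":
        return "re-snapshot"
    return "fail"

print(on_missing_binlog_position(snapshot_settings["snapshot.mode"]))  # re-snapshot
```

    If re-snapshotting a large database is unacceptable, the alternative is to keep binlog retention comfortably longer than any expected connector downtime.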

    PostgreSQL Exception Handling

    Configuration and Startup Errors

    The following situations can cause the connector to fail to start, log errors or exceptions, and then stop running:
    The connector configuration is invalid.
    The connector cannot connect to the PostgreSQL server with the provided configuration parameters.
    The connector attempts to resume reading from the offset recorded at the time of the failure when it restarts, but PostgreSQL has already removed the required WAL segments.
    When any of these situations occurs, the error message gives details of the failure and may suggest troubleshooting steps. After resolving the problem, you can restart the connector service.

    PostgreSQL Becomes Unavailable

    If the PostgreSQL service becomes unavailable, the connector will stop working and will need to be restarted once PostgreSQL is available.

    Cluster Failures

    As of PostgreSQL 12, logical replication slots can be created only on the primary server, so the Debezium PostgreSQL connector can only point to the primary server of a database cluster. Replication slots are also not propagated to replicas. If the primary server fails, a new primary must be promoted. The new primary must have the logical decoding plug-in installed and a replication slot configured for use by the plug-in, and the database whose changes you want to capture must be available. Only then can you point the connector at the new server and restart it.
    There are some important caveats during failover: you should pause the Debezium service and restart it only after verifying that an intact replication slot exists and that no data has been lost. After failover:
    Before applications are allowed to write to the new primary, there should be a process that recreates the Debezium replication slot; otherwise change events might be missed.
    You may need to verify that Debezium was able to read all changes from the slot before the old primary was terminated.
    A reliable way to recover and to verify whether any change events were lost is to restore a backup of the failed primary to the point immediately before the failure. Although this can be difficult to carry out, it lets you check whether unconsumed changes remain in the replication slot.
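    One concrete way to run that verification is to compare PostgreSQL LSNs numerically. PostgreSQL prints LSNs in the `X/Y` hexadecimal form; the helper below is a sketch for making them comparable, and the two LSN values are hypothetical, not taken from this document.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN string like '0/16B3748' to a comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

# Hypothetical values: the replication slot's confirmed flush position taken
# from the restored backup, versus the LSN of the last event Debezium
# actually delivered to Kafka.
slot_confirmed = parse_lsn("0/16B3748")
last_delivered = parse_lsn("0/16B3748")

# If Debezium delivered everything up to the slot's confirmed position,
# no change events were lost during the failover.
print(last_delivered >= slot_confirmed)  # True
```

    Comparing the parsed integers avoids the pitfalls of comparing LSN strings lexically, where for example '0/F' would sort after '0/10'.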

    Kafka Connect Graceful Shutdown

    If Kafka Connect is running in distributed mode and a Kafka Connect process stops gracefully, Kafka Connect migrates that process's connector tasks to another Kafka Connect process in the group before shutting down. The new connector tasks continue processing exactly where the previous tasks stopped. Processing pauses briefly while the connector tasks stop and restart on the new process.

    Kafka Connect Crash

    If the Kafka Connect process stops unexpectedly, all connector tasks running in it terminate without recording their most recently processed offsets. When Kafka Connect runs in distributed mode, it restarts those connector tasks on other processes. However, the PostgreSQL connector resumes from the last offset recorded by the earlier process, so the replacement tasks may regenerate some of the change events processed just before the crash. The number of duplicate events depends on the offset flush interval and the volume of data changes before the crash.
    With each change event record, the Debezium connector logs metadata about the event's origin, including the PostgreSQL server time and the ID of the server transaction in which the event occurred. Consumers can track this information, especially the LSN, to determine whether an event is a duplicate.
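    Because LSNs increase monotonically, the consumer-side duplicate check described above can be as simple as remembering the highest LSN already processed. The sketch below assumes an integer `lsn` field in the event's source block; the exact field name and encoding depend on the Debezium version, so treat it as an illustration.

```python
# Skip any event whose LSN does not advance past the highest LSN already
# processed. Events redelivered after a crash carry LSNs at or below the
# consumer's high-water mark, so they are rejected.

class LsnDeduplicator:
    def __init__(self):
        self.last_lsn = -1  # high-water mark of processed LSNs

    def accept(self, event) -> bool:
        """Return True if the event is new; False if it is a redelivery."""
        lsn = event["source"]["lsn"]
        if lsn <= self.last_lsn:
            return False
        self.last_lsn = lsn
        return True

dedup = LsnDeduplicator()
events = [{"source": {"lsn": 100}}, {"source": {"lsn": 180}},
          {"source": {"lsn": 180}},  # redelivered after a crash
          {"source": {"lsn": 250}}]
accepted = [e for e in events if dedup.accept(e)]
print(len(accepted))  # 3
```

    A single integer high-water mark is cheaper than the set-based approach needed for MySQL, but it relies on the consumer seeing events in LSN order, which holds within a single Kafka partition.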

    Kafka Service Becomes Unavailable

    When the Kafka service becomes unavailable, the Debezium PostgreSQL connector pauses and retries until it re-establishes a connection with the Kafka service.

    Connector Stopped for a Period of Time

    If the connector is stopped gracefully, the database can continue to be used, and any changes are recorded in the PostgreSQL WAL. When the connector restarts, it resumes reading where it left off and generates change event records for all database changes made while it was stopped.
    A properly configured Kafka cluster can handle very high throughput, and Kafka Connect, which follows Kafka best practices, can process a large number of database change events given sufficient resources. Therefore, after being stopped for a period of time, a restarted Debezium connector is very likely to catch up on the changes that occurred in the database during the downtime.