Thursday, September 12, 2024

How Can You Test and Revert a Failover in Azure Cache for Redis Premium with Geo-Replication?

If you're using Azure Cache for Redis Premium with geo-replication across regions, you need to know how to test a failover and how to revert it, so that you can trust the setup when a real regional outage occurs. This article walks through a manual failover test and the revert in the Azure portal, so that both cache instances end up in their original roles.

Why Are Failover Testing and Reverting Important?

Azure Cache for Redis Premium supports geo-replication, which replicates data from a primary cache to a secondary cache in a different Azure region. The secondary does not take over on its own when the primary's region goes offline: on the Premium tier the failover has to be initiated by you. Testing it lets you rehearse that disaster recovery step before you need it. Reverting afterwards restores the original roles once the primary region is available again.

In our case, we have two Redis caches:

  • Primary cache located in Central US.
  • Secondary cache located in East US 2.

Both caches use Private Link endpoints for secure, isolated connectivity, so the redis-cli commands in this article must be run from a machine that can reach those private endpoints.

Step 1: Testing the Failover in Azure Cache for Redis

Testing failover allows you to simulate a disaster recovery scenario where the secondary cache becomes the primary. Here’s how to do this through the Azure Portal:

1.1 Navigate to the Primary Redis Cache

  • Log in to the Azure portal.
  • Search for Azure Cache for Redis and select your primary Redis cache (in Central US).

1.2 Access Geo-Replication Settings

  • Under the Settings section of the Redis cache, select Geo-replication.
  • You will see your secondary Redis cache (in East US 2) listed as a replica.
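The same link can also be inspected from the command line. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. The resource group and cache names are placeholders, and the AZ variable is only a hook added here so the command can be swapped out for a dry run.

```shell
#!/usr/bin/env bash
# Sketch: list the geo-replication links of a cache with the Azure CLI.
# Resource group and cache names are placeholders, not values from this article.

AZ="${AZ:-az}"   # override hook for dry runs; defaults to the real az CLI

# Usage: list_geo_links <resource-group> <cache-name>
list_geo_links() {
  "$AZ" redis server-link list \
    --resource-group "$1" \
    --name "$2"
}

# Example (hypothetical names):
# list_geo_links my-resource-group my-central-us-cache
```

Run against the primary cache, this should return one entry for the linked cache in East US 2.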

1.3 Initiate the Failover

  • Click on the secondary Redis instance to open the details.
  • At the top of the page, click Initiate Failover.

Once the failover is initiated, the secondary cache in East US 2 will become the new primary. This process typically takes a few minutes, and during that time, the new primary starts accepting traffic while the original primary goes into a secondary state.
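Because the switch takes a few minutes, it can help to poll the East US 2 cache until it accepts writes instead of guessing when it is ready. The following is a minimal sketch under a few assumptions: redis-cli 6.0 or later built with TLS support, the access key exported as REDISCLI_AUTH, and the TLS port 6380. The function name, the probe key, and the REDIS_CLI hook are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: poll a cache until it accepts writes, i.e. until it acts as primary.
# Assumes the access key is exported as REDISCLI_AUTH (read by redis-cli).

REDIS_CLI="${REDIS_CLI:-redis-cli}"   # override hook for dry runs

# Usage: wait_until_writable <host> [max-attempts] [sleep-seconds]
wait_until_writable() {
  local host="$1" attempts="${2:-30}" delay="${3:-10}"
  local i=1 reply
  while [ "$i" -le "$attempts" ]; do
    # SET answers OK on a primary; a geo-replication secondary rejects writes.
    reply="$("$REDIS_CLI" -h "$host" -p 6380 --tls SET "failover:probe" "$i" 2>/dev/null)" || reply=""
    if [ "$reply" = "OK" ]; then
      echo "writable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not writable after $attempts attempts" >&2
  return 1
}

# Example (placeholder host):
# wait_until_writable <new-primary-private-endpoint> 30 10
```

The loop compares the reply text with OK rather than relying on the exit status of redis-cli, since a rejected write still produces a normal reply from the server.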

1.4 Testing the New Primary

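After the failover, test the new primary cache (East US 2) by writing and reading a key. The commands below use the TLS port 6380, because the non-TLS port 6379 is disabled by default on Azure Cache for Redis. They assume redis-cli 6.0 or later built with TLS support:

redis-cli -h <new-primary-private-endpoint> -p 6380 --tls -a <new-primary-access-key> SET "test:key" "Data after failover"
redis-cli -h <new-primary-private-endpoint> -p 6380 --tls -a <new-primary-access-key> GET "test:key"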

If SET returns OK and GET returns the stored value, the new primary cache accepts writes and serves reads.


Step 2: Reverting the Failover to the Original Primary

Once the failover test is complete, you'll want to revert the setup to its original configuration, with Central US as the primary cache. Here’s how you can do that:

2.1 Break the Existing Geo-Replication

  • Go to the new primary Redis cache (in East US 2).
  • Navigate to Geo-replication under the Settings section.
  • Click on the original primary cache (Central US) that is now secondary and select Unlink to break the geo-replication.
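The unlink can also be scripted. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. The resource group and cache names are placeholders, and the AZ variable is a hook added here for dry runs.

```shell
#!/usr/bin/env bash
# Sketch: remove the geo-replication link with the Azure CLI.
# All names passed in are placeholders, not values from this article.

AZ="${AZ:-az}"   # override hook for dry runs; defaults to the real az CLI

# Usage: unlink_geo_replica <resource-group> <current-primary-cache> <linked-cache>
unlink_geo_replica() {
  "$AZ" redis server-link delete \
    --resource-group "$1" \
    --name "$2" \
    --linked-server-name "$3"
}

# Example (hypothetical names): East US 2 is the primary during the test.
# unlink_geo_replica my-resource-group my-east-us-2-cache my-central-us-cache
```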

2.2 Re-establish the Original Setup

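  • Go to the original primary Redis cache (in Central US), which is a standalone cache again after the unlink.
  • Select Geo-replication and click Add Replication.
  • Choose the East US 2 Redis cache (the primary during the test) as the replica, and click Create.

Because the link is created from the Central US cache, that cache becomes the primary again and the East US 2 cache becomes the secondary. Creating the link replaces the data in the secondary with the primary's data, so make sure nothing was written only to East US 2 after the unlink.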
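The link creation can be scripted as well. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. The names are placeholders and the AZ variable is a hook added here for dry runs. If the two caches live in different resource groups, pass the full resource ID of the East US 2 cache instead of its name.

```shell
#!/usr/bin/env bash
# Sketch: create the geo-replication link from the intended primary.
# All names passed in are placeholders, not values from this article.

AZ="${AZ:-az}"   # override hook for dry runs; defaults to the real az CLI

# Usage: link_geo_replica <resource-group> <primary-cache> <secondary-name-or-id>
link_geo_replica() {
  # --replication-role describes the role of the cache being linked,
  # so the cache given in --name ends up as the primary.
  "$AZ" redis server-link create \
    --resource-group "$1" \
    --name "$2" \
    --server-to-link "$3" \
    --replication-role Secondary
}

# Example (hypothetical names): Central US becomes primary again.
# link_geo_replica my-resource-group my-central-us-cache my-east-us-2-cache
```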

2.3 Verify and Test

Once geo-replication is re-established, write and read data on the Central US cache (the original primary) to confirm that the revert worked. As before, the commands use the TLS port 6380 and assume redis-cli 6.0 or later built with TLS support:

redis-cli -h <central-us-private-endpoint> -p 6380 --tls -a <central-us-access-key> SET "revert-test:key" "Reverted to original primary"
redis-cli -h <central-us-private-endpoint> -p 6380 --tls -a <central-us-access-key> GET "revert-test:key"

Also read the same key from the secondary cache (East US 2) to confirm that replication is working. Replication is asynchronous, so the key may take a moment to appear:

redis-cli -h <east-us-2-private-endpoint> -p 6380 --tls -a <east-us-2-access-key> GET "revert-test:key"
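Since the secondary can lag behind the primary, a single GET right after the write may come back empty even when replication is healthy. The following is a minimal sketch of a retry loop, assuming redis-cli 6.0 or later built with TLS support, the secondary's access key exported as REDISCLI_AUTH, and the TLS port 6380. The function name and the REDIS_CLI hook are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: wait until a key written on the primary is readable on the secondary.
# Assumes the access key is exported as REDISCLI_AUTH (read by redis-cli).

REDIS_CLI="${REDIS_CLI:-redis-cli}"   # override hook for dry runs

# Usage: wait_for_key <host> <key> <expected-value> [max-attempts] [sleep-seconds]
wait_for_key() {
  local host="$1" key="$2" expected="$3" attempts="${4:-12}" delay="${5:-5}"
  local i=1 value
  while [ "$i" -le "$attempts" ]; do
    # When its output is captured, redis-cli prints the raw value without quotes.
    value="$("$REDIS_CLI" -h "$host" -p 6380 --tls GET "$key" 2>/dev/null)" || value=""
    if [ "$value" = "$expected" ]; then
      echo "replicated after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "key $key not replicated after $attempts attempts" >&2
  return 1
}

# Example (placeholder host):
# wait_for_key <east-us-2-private-endpoint> "revert-test:key" "Reverted to original primary"
```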

Conclusion

Testing and reverting failover in Azure Cache for Redis Premium is a critical part of maintaining a high-availability, disaster-ready infrastructure. By following the steps above, you can test a failover and return to the original configuration entirely within the Azure portal.

Rehearsing both the failover and the revert helps your application stay resilient to regional outages and gives you confidence that data remains available in both regions. Monitor the health and performance of both Redis instances during the failover and the revert so that problems surface while you are still testing.
