Designing High-Availability Systems with .NET and Redis

Tapesh Mehta | Published on: Feb 20, 2026 | Est. reading time: 10 minutes

Modern production systems cannot afford downtime. Whether you are running an e-commerce platform, a financial service, or a SaaS application, your users expect 99.9% or better uptime. Designing high-availability systems with .NET and Redis is one of the most practical approaches to achieving this goal. Redis provides lightning-fast, in-memory data access, distributed locking, and pub/sub capabilities — all of which are essential building blocks for resilient architectures. In this article, we will walk through the core patterns and code required to design a genuinely fault-tolerant .NET backend powered by Redis.

Why High Availability Requires More Than Just a Good Server

High availability (HA) is not about buying the biggest server — it is about designing your system so that no single component failure causes the entire application to go down. This means eliminating single points of failure at every layer: application servers, databases, caches, and message brokers. Teams building high-availability systems with .NET and Redis typically focus on three pillars: redundancy, fast failover, and graceful degradation.
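
To put the uptime target from the introduction in perspective, the following sketch converts an availability figure into a yearly downtime budget and estimates what redundancy buys you. The class and method names are illustrative, and the redundancy formula assumes that replicas fail independently, which real deployments only approximate.

```csharp
using System;

public static class AvailabilityMath
{
    // Downtime allowed per (365-day) year for a given availability, e.g. 0.999.
    public static TimeSpan DowntimePerYear(double availability) =>
        TimeSpan.FromHours((1 - availability) * 365 * 24);

    // Combined availability of N redundant instances where any one of them
    // can serve traffic. Assumes failures are independent (a simplification).
    public static double WithRedundancy(double singleInstance, int instances) =>
        1 - Math.Pow(1 - singleInstance, instances);
}
```

Under these assumptions, a 99.9% target leaves roughly 8.76 hours of downtime per year, and two independent instances that are each 99% available combine to about 99.99%.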

If you are working on scaling up a large backend, you may also find value in understanding lessons from handling millions of users with .NET, which covers real-world scaling decisions that complement the caching strategies we will discuss here.

The Role of Redis in High-Availability Architecture

Redis is not just a cache — in HA systems, it serves as a distributed coordination layer. It handles distributed locks (preventing race conditions across multiple app instances), session storage (so any app node can serve any user), rate limiting, pub/sub messaging, and leader election. When configured with Redis Sentinel or Redis Cluster, it also eliminates the cache tier as a single point of failure. According to the official Redis Sentinel documentation, Sentinel provides automatic failover, monitoring, and notification capabilities that make Redis suitable for production HA deployments.

Setting Up Redis with StackExchange.Redis in .NET

The most widely used Redis client for .NET is StackExchange.Redis. It supports connection multiplexing, reconnection logic, and Redis Cluster natively. Here is how to register a properly configured Redis connection in an ASP.NET Core application:


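// Program.cs (requires the StackExchange.Redis NuGet package)
using StackExchange.Redis;

var builder = WebApplication.CreateBuilder(args);
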
builder.Services.AddSingleton<IConnectionMultiplexer>(sp =>
{
    var config = ConfigurationOptions.Parse(
        builder.Configuration.GetConnectionString("Redis")!);

    config.AbortOnConnectFail = false;   // Don't throw on startup if Redis is down
    config.ConnectRetry = 5;
    config.ReconnectRetryPolicy = new ExponentialRetry(5000);
    config.SyncTimeout = 5000;
    config.AsyncTimeout = 5000;

    return ConnectionMultiplexer.Connect(config);
});

builder.Services.AddScoped<IDatabase>(sp =>
    sp.GetRequiredService<IConnectionMultiplexer>().GetDatabase());

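Setting AbortOnConnectFail = false is critical in HA deployments: your application should start even if Redis is momentarily unavailable and reconnect in the background. Commands issued while Redis is unreachable will still fail or time out, so callers need a fallback path. ExponentialRetry spaces reconnect attempts further apart as an outage continues, which helps avoid a thundering herd of reconnects when Redis becomes available again.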

Redis Connection String for Sentinel

When connecting to a Sentinel-managed Redis deployment, your connection string needs to reference the Sentinel endpoints, not the Redis master directly. This way, StackExchange.Redis asks Sentinel for the current master and follows the newly promoted replica after a failover:


{
  "ConnectionStrings": {
    "Redis": "sentinel1:26379,sentinel2:26379,sentinel3:26379,serviceName=mymaster,password=yourpassword"
  }
}

Implementing Distributed Caching with Fallback

One of the most important patterns in high-availability systems with .NET and Redis is the cache-aside pattern combined with a fallback to the primary data source. If Redis becomes unavailable, your system should degrade gracefully rather than throwing errors to users.


using System.Text.Json;
using Microsoft.Extensions.Logging;
using StackExchange.Redis;

public class ResilientCacheService
{
    private readonly IDatabase _redis;
    private readonly ILogger<ResilientCacheService> _logger;
    private static readonly TimeSpan DefaultTtl = TimeSpan.FromMinutes(10);

    public ResilientCacheService(IDatabase redis, ILogger<ResilientCacheService> logger)
    {
        _redis = redis;
        _logger = logger;
    }

    public async Task<T?> GetOrSetAsync<T>(
        string key,
        Func<Task<T>> factory,
        TimeSpan? ttl = null) where T : class
    {
        try
        {
            var cached = await _redis.StringGetAsync(key);
            if (cached.HasValue)
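                return JsonSerializer.Deserialize<T>((string)cached!);
        }
        // RedisTimeoutException derives from TimeoutException, not RedisException
        catch (Exception ex) when (ex is RedisException or RedisTimeoutException)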
        {
            _logger.LogWarning(ex, "Redis unavailable for key {Key}. Falling back to data source.", key);
        }

        // Cache miss or Redis unavailable — hit the source
        var result = await factory();

        try
        {
            if (result != null)
            {
                await _redis.StringSetAsync(
                    key,
                    JsonSerializer.Serialize(result),
                    ttl ?? DefaultTtl);
            }
        }
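        // Timeouts are not RedisException subclasses, so match them explicitly
        catch (Exception ex) when (ex is RedisException or RedisTimeoutException)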
        {
            _logger.LogWarning(ex, "Failed to write key {Key} to Redis.", key);
        }

        return result;
    }
}

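Wrapping each Redis operation in a try/catch ensures that a cache failure never surfaces as an unhandled exception to your users. Note that RedisTimeoutException derives from TimeoutException rather than RedisException, so timeouts need to be caught explicitly as well. This pattern is at the heart of resilient, high-availability systems with .NET and Redis.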

Distributed Locking to Prevent Race Conditions

When multiple instances of your .NET application run in parallel (as they should in any HA deployment), race conditions become a real risk. Distributed locking with Redis ensures that only one instance performs a critical operation at a time — such as processing a payment, regenerating a large report, or seeding a cache.


using StackExchange.Redis;

public class RedisDistributedLock
{
    private readonly IDatabase _redis;

    public RedisDistributedLock(IDatabase redis)
    {
        _redis = redis;
    }

    public async Task<bool> AcquireLockAsync(
        string resource,
        string lockToken,
        TimeSpan expiry)
    {
        // SET NX EX atomically — only set if not exists
        return await _redis.StringSetAsync(
            $"lock:{resource}",
            lockToken,
            expiry,
            When.NotExists);
    }

    public async Task ReleaseLockAsync(string resource, string lockToken)
    {
        // Only release if we own the lock (Lua script for atomicity)
        const string script = @"
            if redis.call('get', KEYS[1]) == ARGV[1] then
                return redis.call('del', KEYS[1])
            else
                return 0
            end";

        await _redis.ScriptEvaluateAsync(
            script,
            new RedisKey[] { $"lock:{resource}" },
            new RedisValue[] { lockToken });
    }
}

The Lua script for releasing the lock is critical: it ensures that a lock is only released by the instance that acquired it. Without the token check, an instance whose lock had already expired could delete a lock that another instance now holds. For production use, consider the RedLock.net library, which implements the Redlock algorithm across multiple independent Redis nodes for stronger guarantees.
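
A minimal sketch of how a caller might use such a lock follows. The IDistributedLock interface and the TryRunExclusiveAsync helper are hypothetical names introduced for illustration; the interface simply mirrors the two method signatures of RedisDistributedLock above, which could implement it unchanged.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical abstraction mirroring the RedisDistributedLock methods above.
public interface IDistributedLock
{
    Task<bool> AcquireLockAsync(string resource, string lockToken, TimeSpan expiry);
    Task ReleaseLockAsync(string resource, string lockToken);
}

public static class DistributedLockExtensions
{
    // Runs `work` only if the lock was acquired. Returns false when another
    // instance already holds the lock, so the caller can skip or retry later.
    public static async Task<bool> TryRunExclusiveAsync(
        this IDistributedLock locks,
        string resource,
        TimeSpan expiry,
        Func<Task> work)
    {
        // A fresh token per attempt identifies this holder at release time.
        var lockToken = Guid.NewGuid().ToString("N");

        if (!await locks.AcquireLockAsync(resource, lockToken, expiry))
            return false;

        try
        {
            await work();
        }
        finally
        {
            // Release even if `work` throws. The token check on the Redis side
            // makes this harmless if the lock expired and someone else took it.
            await locks.ReleaseLockAsync(resource, lockToken);
        }

        return true;
    }
}
```

The expiry should comfortably exceed the expected duration of the work. If the work outlives the lock, a second instance can acquire it and both will run concurrently.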

Integrating Health Checks for Redis

No high-availability system is complete without proper health checks. ASP.NET Core's health check middleware can monitor your Redis connection and report its status to load balancers and orchestration platforms like Kubernetes, so that unhealthy instances are automatically removed from rotation. For background, see health checks in ASP.NET Core and load balancer integration.


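// Program.cs
// AddRedis comes from the AspNetCore.HealthChecks.Redis package;
// UIResponseWriter comes from AspNetCore.HealthChecks.UI.Client.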
builder.Services.AddHealthChecks()
    .AddRedis(
        builder.Configuration.GetConnectionString("Redis")!,
        name: "redis",
        failureStatus: HealthStatus.Degraded,
        tags: new[] { "cache", "infrastructure" });

// In app pipeline:
app.MapHealthChecks("/health", new HealthCheckOptions
{
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("infrastructure")
});

Using HealthStatus.Degraded instead of Unhealthy for Redis means the app continues to serve traffic even when Redis is down, reflecting our graceful degradation strategy. By default, the health check middleware returns HTTP 200 for both Healthy and Degraded and HTTP 503 for Unhealthy, so the load balancer only removes an instance from rotation when it reports Unhealthy.
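
If you prefer not to rely on the defaults, the status-to-HTTP mapping can be written out explicitly. The sketch below is a trimmed-down Program.cs that only shows the readiness endpoint; the Redis check registration is assumed to be the one shown above, and the values simply restate the framework defaults.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Register the Redis check here as shown above; omitted to keep the sketch small.
builder.Services.AddHealthChecks();

var app = builder.Build();

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("infrastructure"),
    // These values match the framework defaults; spelling them out documents
    // that a Degraded Redis check must not take the instance out of rotation.
    ResultStatusCodes =
    {
        [HealthStatus.Healthy] = StatusCodes.Status200OK,
        [HealthStatus.Degraded] = StatusCodes.Status200OK,
        [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
    }
});

app.Run();
```

Making the mapping explicit also protects the graceful degradation behaviour if someone later changes the failure status of the Redis check.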

Session Affinity vs. Centralized Session with Redis

In single-server deployments, ASP.NET Core stores sessions in-process. In HA deployments with multiple nodes behind a load balancer, you have two choices: sticky sessions (session affinity) or centralized session storage. Sticky sessions couple a user to a specific server — if that server goes down, the session is lost. Centralized session storage using Redis decouples sessions from any individual node, making it the correct choice for genuine high availability.


// Program.cs
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
    options.InstanceName = "MyApp_Session_";
});

builder.Services.AddSession(options =>
{
    options.IdleTimeout = TimeSpan.FromMinutes(30);
    options.Cookie.HttpOnly = true;
    options.Cookie.IsEssential = true;
    options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
});

// Add middleware
app.UseSession();

With this configuration, any node in your cluster can serve any user’s request without needing to know which node that user connected to previously. This is a foundational requirement for high-availability systems with .NET and Redis.

Event-Driven Resilience with Redis Pub/Sub

For real-time cache invalidation and event broadcasting across nodes, Redis pub/sub is a lightweight solution that avoids the operational overhead of a full message broker for internal communication. When a record is updated in your database, you can publish an invalidation event so all nodes drop that cached entry simultaneously.


// Publisher (on data update)
public async Task InvalidateCacheAsync(string entityKey)
{
    var subscriber = _redis.Multiplexer.GetSubscriber();
    await subscriber.PublishAsync("cache:invalidate", entityKey);
}

// Subscriber setup (on startup)
var subscriber = connectionMultiplexer.GetSubscriber();
await subscriber.SubscribeAsync("cache:invalidate", async (channel, key) =>
{
    await _redis.KeyDeleteAsync((string)key!);
    _logger.LogInformation("Cache key {Key} invalidated via pub/sub.", key);
});

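This pattern complements more complex event-driven architectures. Keep in mind that Redis pub/sub is fire-and-forget: a node that is disconnected when a message is published never receives it, so TTLs remain the safety net for stale entries. If you need durable messaging beyond what Redis pub/sub provides, consider reading about event-driven architecture with .NET and Azure Service Bus for guaranteed delivery semantics.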

Applying the Circuit Breaker Pattern

Even with graceful fallbacks at the code level, you need a circuit breaker to avoid cascading failures when Redis is consistently unavailable. The Polly library is a widely used way to implement this in .NET. Polly has no Redis-specific integration; you define a policy once and run your Redis calls through it.


// Register a circuit breaker for Redis operations (Polly v7-style policy API;
// requires `using Polly;` and `using StackExchange.Redis;`)
builder.Services.AddSingleton<IAsyncPolicy>(sp =>
    Policy
        .Handle<RedisException>()
        .Or<RedisTimeoutException>()
        .CircuitBreakerAsync(
            exceptionsAllowedBeforeBreaking: 3,
            durationOfBreak: TimeSpan.FromSeconds(30),
            onBreak: (ex, breakDuration) =>
            {
                var logger = sp.GetRequiredService<ILogger<Program>>();
                logger.LogError(ex,
                    "Redis circuit breaker OPEN for {Duration}s",
                    breakDuration.TotalSeconds);
            },
            onReset: () =>
            {
                var logger = sp.GetRequiredService<ILogger<Program>>();
                logger.LogInformation("Redis circuit breaker CLOSED — connection restored.");
            }));

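When the circuit breaker is open, calls to Redis are short-circuited immediately with a BrokenCircuitException instead of waiting for a timeout. This keeps requests from piling up behind a dead dependency and keeps your application responsive. After 30 seconds, the circuit moves to half-open: the next call is let through as a trial, and its outcome either closes the circuit or opens it for another 30 seconds.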
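
The registration above only creates the policy; calls still have to be routed through it. The following is one possible way to do that, assuming the Polly v7-style IAsyncPolicy registered above. ExecuteWithFallbackAsync is a hypothetical helper name, not part of Polly or StackExchange.Redis.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
using StackExchange.Redis;

public static class RedisCircuitBreakerExtensions
{
    // Runs a Redis call through the breaker and falls back to another source
    // when the call fails or the circuit is open.
    public static async Task<T> ExecuteWithFallbackAsync<T>(
        this IAsyncPolicy breaker,
        Func<Task<T>> redisCall,
        Func<Task<T>> fallback)
    {
        try
        {
            return await breaker.ExecuteAsync<T>(redisCall);
        }
        catch (BrokenCircuitException)
        {
            // Circuit is open: Polly rejected the call without touching Redis.
            return await fallback();
        }
        catch (Exception ex) when (ex is RedisException or RedisTimeoutException)
        {
            // The call itself failed; the breaker has already counted it.
            return await fallback();
        }
    }
}
```

The delegate passed as redisCall should be the raw Redis operation. If it wraps a method that already swallows Redis exceptions, the breaker never observes a failure and the circuit never opens.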

Designing Scalable Multi-Tenant Caching Strategies

If you are building a multi-tenant SaaS product, your caching strategy must account for tenant isolation. Each tenant’s cached data should be namespaced to prevent data leakage and allow per-tenant cache invalidation. Building on patterns for scalable multi-tenant SaaS in .NET will help you design the broader application structure that this caching layer sits within.


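using System.Linq;
using System.Text.Json;
using Microsoft.AspNetCore.Http;
using StackExchange.Redis;

public class TenantCacheService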
{
    private readonly IDatabase _redis;
    private readonly IHttpContextAccessor _httpContextAccessor;

    public TenantCacheService(IDatabase redis, IHttpContextAccessor httpContextAccessor)
    {
        _redis = redis;
        _httpContextAccessor = httpContextAccessor;
    }

    private string TenantKey(string key)
    {
        var tenantId = _httpContextAccessor.HttpContext?
            .User.FindFirst("tenant_id")?.Value ?? "global";
        return $"tenant:{tenantId}:{key}";
    }

    public async Task SetAsync<T>(string key, T value, TimeSpan ttl)
        where T : class
    {
        var fullKey = TenantKey(key);
        await _redis.StringSetAsync(fullKey, JsonSerializer.Serialize(value), ttl);
    }

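    public async Task InvalidateTenantCacheAsync(string tenantId)
    {
        // Keys() iterates with SCAN against one server only. With Redis Cluster,
        // repeat this for every primary endpoint rather than just the first one.
        var server = _redis.Multiplexer.GetServer(
            _redis.Multiplexer.GetEndPoints().First());

        // ToArray() buffers every matching key in memory; for tenants with
        // many keys, consider deleting in batches instead.
        var keys = server.Keys(pattern: $"tenant:{tenantId}:*").ToArray();
        if (keys.Length > 0)
            await _redis.KeyDeleteAsync(keys);
    }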
}

Monitoring and Cost Awareness in HA Redis Deployments

High-availability Redis deployments, especially on managed services like Azure Cache for Redis or AWS ElastiCache, carry real cost implications. Memory usage, the number of connections, and replication bandwidth all contribute to your monthly bill. Setting appropriate TTLs and choosing a suitable eviction policy (such as volatile-lru or allkeys-lru) so that stale data is evicted under memory pressure can significantly reduce memory consumption without sacrificing the benefits of caching. For more context on balancing performance and cost in .NET cloud deployments, see reducing cloud costs for .NET apps without sacrificing performance.

Always configure maxmemory and an eviction policy in your Redis instance to prevent out-of-memory crashes in production:


# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
save ""
appendonly no

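Disabling persistence (save "" and appendonly no) is reasonable when Redis is used purely as a cache and your application can rebuild it from the primary data source, since persistence adds write overhead. Two caveats apply. Sessions and locks kept in the same instance are also lost on restart. And in a replicated setup, the Redis replication documentation warns that a primary with persistence disabled should not be configured to restart automatically, because it can come back empty and cause its replicas to discard their data too.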

Conclusion

Designing high-availability systems with .NET and Redis requires deliberate decisions at every layer. Here are the key takeaways from this article:

  • Configure StackExchange.Redis with AbortOnConnectFail = false and ExponentialRetry to survive transient Redis outages.
  • Always wrap Redis calls in try/catch for RedisException and RedisTimeoutException, and fall back to your primary data source.
  • Use Redis Sentinel or Redis Cluster to eliminate the cache tier as a single point of failure.
  • Implement distributed locking using atomic SET NX EX and Lua scripts for safe lock release.
  • Centralize sessions in Redis to enable true stateless, horizontally-scalable app nodes.
  • Add Redis health checks so load balancers can route traffic away from unhealthy instances.
  • Apply circuit breakers via Polly to prevent cascading failures when Redis is consistently unavailable.

The WireFuture team specialises in building resilient, enterprise-grade .NET backends. If you are evaluating your architecture or need expert hands-on implementation, explore our .NET and ASP.NET development services or reach out via cloud and DevOps consulting to discuss your high-availability requirements.
