Caching is the cheapest performance win in most APIs and one of the easiest to get subtly wrong. The hard part is rarely storing a value; it is choosing the right layer, picking sensible expirations, and invalidating stale data without serving someone else's. .NET 8 gives you three distinct caching tools, each suited to a different job.
By the end of this post you will understand when to use in-memory caching, distributed caching with Redis, and HTTP response caching, and you will have correct, idiomatic code for all three, including the cache-stampede protection most tutorials leave out.
Before starting, make sure you have:
- .NET 8 SDK and an existing ASP.NET Core Web API project
- A Redis instance for the distributed caching section (a local Docker container works)
- The Microsoft.Extensions.Caching.StackExchangeRedis package for that same section
IMemoryCache stores objects in the process's own memory. It is the fastest option because there is no network hop and no serialization, which makes it ideal for small, frequently read, expensive-to-compute data on a single instance.
The naive version and its flaw
A first attempt usually looks like a get-or-create. It works, but under load it has a hidden problem.
// Cache-aside: check cache, fall back to source, store result
public async Task<Product?> GetProductAsync(int id)
{
    return await _cache.GetOrCreateAsync($"product:{id}", async entry =>
    {
        // Hard upper bound of 10 minutes; evicted sooner if idle for 2 minutes.
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
        entry.SlidingExpiration = TimeSpan.FromMinutes(2);
        return await _db.Products.FindAsync(id);
    });
}
The flaw is the cache stampede: when a popular key expires, many concurrent requests all miss at once and all hit the database simultaneously. On a hot key under traffic, that thundering herd can knock over the very database the cache was meant to protect.
Protecting against the stampede
A per-key lock ensures only one request rebuilds the value while the rest wait for it.
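// Requires: using System.Collections.Concurrent;
// One gate per cache key, so rebuilding one product does not block requests for others.
// Note: gates are never removed here, so this suits a bounded key space.
private static readonly ConcurrentDictionary<string, SemaphoreSlim> _gates = new();

public async Task<Product?> GetProductSafeAsync(int id)
{
    var key = $"product:{id}";
    if (_cache.TryGetValue(key, out Product? cached))
        return cached;

    var gate = _gates.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
    await gate.WaitAsync();
    try
    {
        // Double-check: another request may have populated it while we waited.
        if (_cache.TryGetValue(key, out cached))
            return cached;

        var product = await _db.Products.FindAsync(id);
        _cache.Set(key, product, TimeSpan.FromMinutes(10));
        return product;
    }
    finally
    {
        gate.Release();
    }
}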
The double-checked pattern inside the lock is what makes this correct: by the time a waiting thread acquires the gate, the value is usually already cached, so it returns immediately without a redundant query.
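To see the effect of the gate plus the double-check, here is a minimal, self-contained sketch that uses only the base class library. The `CoalescingLoader` class, its dictionary-backed cache, and the `Task.Delay` standing in for a slow query are all hypothetical substitutes for `IMemoryCache` and the database, chosen so the example runs without any packages.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the cache + database pair; names are illustrative only.
public sealed class CoalescingLoader
{
    private readonly ConcurrentDictionary<string, string> _cache = new();
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _gates = new();
    private int _sourceCalls;

    // Number of times the slow source was actually hit.
    public int SourceCalls => Volatile.Read(ref _sourceCalls);

    public async Task<string> GetAsync(string key)
    {
        // Fast path: no locking when the value is already cached.
        if (_cache.TryGetValue(key, out var cached))
            return cached;

        var gate = _gates.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Double-check after acquiring the gate: an earlier caller may have filled it.
            if (_cache.TryGetValue(key, out cached))
                return cached;

            Interlocked.Increment(ref _sourceCalls);
            await Task.Delay(50); // simulate a slow database query
            var value = $"value-for-{key}";
            _cache[key] = value;
            return value;
        }
        finally
        {
            gate.Release();
        }
    }

    // Fires `callers` concurrent requests for one key and returns the source-call count.
    public static async Task<int> RunDemoAsync(int callers)
    {
        var loader = new CoalescingLoader();
        await Task.WhenAll(
            Enumerable.Range(0, callers).Select(_ => loader.GetAsync("product:1")));
        return loader.SourceCalls;
    }
}
```

Calling `await CoalescingLoader.RunDemoAsync(50)` should report a single source call: the first caller rebuilds the value while the other 49 wait on the gate and then return from the double-check. Removing the second `TryGetValue` would make every waiter query the source in turn, which is the stampede in slow motion.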
In-memory caching breaks down the moment you scale to more than one instance: each server has its own cache, so hit rates drop and invalidation becomes inconsistent. A distributed cache like Redis gives every instance a single shared store.
// Program.cs
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
    options.InstanceName = "yourapp:"; // prefix added to every key this app writes
});
Because Redis stores bytes, you serialize on the way in and deserialize on the way out. Wrapping that in a small helper keeps call sites clean.
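// Requires: using System.Text.Json; using Microsoft.Extensions.Caching.Distributed;
public async Task<T?> GetOrSetAsync<T>(
    string key, Func<Task<T>> factory, TimeSpan ttl)
{
    // Hit: deserialize the stored JSON and return it.
    var cached = await _distributedCache.GetStringAsync(key);
    if (cached is not null)
        return JsonSerializer.Deserialize<T>(cached);

    // Miss: load from the source, then store the serialized value with a TTL.
    var value = await factory();
    await _distributedCache.SetStringAsync(
        key,
        JsonSerializer.Serialize(value),
        new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = ttl
        });
    return value;
}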
The tradeoff is real: Redis adds a network round trip and serialization cost, so it is slower per call than in-memory. You accept that latency in exchange for a cache that is consistent across every instance and survives an individual app restart.
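Because Redis sits across the network, it can also be slow or unreachable. One hedged way to degrade gracefully is to treat any cache failure as a miss, so an outage costs performance rather than availability. The sketch below is an assumption-laden variant of the helper above: the `ResilientCache` class name and the log messages are hypothetical, and it catches `Exception` broadly so the example does not depend on StackExchange.Redis exception types (a corrupt cached payload is therefore also treated as a miss).

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Logging;

// Hypothetical wrapper; the class name and log messages are illustrative.
public sealed class ResilientCache
{
    private readonly IDistributedCache _cache;
    private readonly ILogger<ResilientCache> _logger;

    public ResilientCache(IDistributedCache cache, ILogger<ResilientCache> logger)
    {
        _cache = cache;
        _logger = logger;
    }

    public async Task<T?> GetOrSetAsync<T>(string key, Func<Task<T>> factory, TimeSpan ttl)
    {
        try
        {
            var cached = await _cache.GetStringAsync(key);
            if (cached is not null)
                return JsonSerializer.Deserialize<T>(cached);
        }
        catch (Exception ex)
        {
            // Cache read failed (for example, Redis unreachable): treat it as a miss.
            _logger.LogWarning(ex, "Cache read failed for {Key}; falling back to source", key);
        }

        // The factory runs outside the try blocks so source errors still surface to the caller.
        var value = await factory();

        try
        {
            await _cache.SetStringAsync(
                key,
                JsonSerializer.Serialize(value),
                new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = ttl });
        }
        catch (Exception ex)
        {
            // A failed write only costs a future cache miss, so log it and move on.
            _logger.LogWarning(ex, "Cache write failed for {Key}", key);
        }

        return value;
    }
}
```

The design choice here is that only cache calls are guarded. If the database itself fails, the exception propagates as usual, which keeps real errors visible instead of hiding them behind the cache layer.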
The two layers above cache data inside your code. Response caching works at the HTTP level, instructing clients, proxies, and the framework to reuse a whole response. It is the most efficient option when an entire endpoint's output is cacheable, because a cache hit can skip your action method entirely.
// Program.cs
builder.Services.AddResponseCaching();
var app = builder.Build();
app.UseResponseCaching();

// Controller action
[HttpGet("categories")]
[ResponseCache(Duration = 300, Location = ResponseCacheLocation.Any, VaryByQueryKeys = ["page"])]
public IActionResult GetCategories(int page = 1)
{
    return Ok(_catalog.GetCategories(page));
}
VaryByQueryKeys is essential here: without it, the cache could serve page 1's data for a page 2 request. By default, response caching applies only to GET and HEAD requests that return a 200 (OK) status and carry no Authorization header, which is exactly right for public, anonymous, read-heavy endpoints.
ASP.NET Core 7 and later also include output caching (AddOutputCache), a more powerful server-side cache with tag-based invalidation. Response caching is the standards-based HTTP option; output caching gives you finer control. For new projects, evaluate output caching for server-controlled scenarios.
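As a rough illustration of tag-based invalidation with output caching, here is a minimal sketch written as a standalone minimal-API Program.cs. The policy name "Categories", the tag "categories", the /categories/purge route, and the hard-coded payload are all assumptions made for the example; it also assumes a web project with implicit usings enabled.

```csharp
using Microsoft.AspNetCore.OutputCaching;

var builder = WebApplication.CreateBuilder(args);

// Register output caching with a named policy: 5-minute lifetime, tagged for eviction.
builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("Categories", policy =>
        policy.Expire(TimeSpan.FromMinutes(5)).Tag("categories"));
});

var app = builder.Build();
app.UseOutputCache();

// Cached on the server; placeholder data stands in for a real catalog lookup.
app.MapGet("/categories", (int? page) =>
        Results.Ok(new { page = page ?? 1, items = new[] { "books", "games" } }))
   .CacheOutput("Categories");

// Hypothetical purge endpoint: evicts every cached response carrying the tag.
app.MapPost("/categories/purge", async (IOutputCacheStore store, CancellationToken ct) =>
{
    await store.EvictByTagAsync("categories", ct);
    return Results.NoContent();
});

app.Run();
```

In a real API the eviction call would more likely live in whatever code path writes categories, so cached responses are dropped the moment the underlying data changes rather than waiting for the expiration.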
The decision comes down to scope and freshness. Use in-memory for single-instance, ultra-hot, small data. Use Redis when you run multiple instances and need a shared, consistent cache. Use response or output caching when an entire endpoint response is reusable and you want to skip the work altogether.
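// A quick way to prove caching works: log and time the source call
// Requires: using System.Diagnostics;
public async Task<Product?> GetProductAsync(int id)
{
    return await GetOrSetAsync($"product:{id}", async () =>
    {
        // This lambda only runs on a cache miss.
        var stopwatch = Stopwatch.StartNew();
        var product = await _db.Products.FindAsync(id);
        _logger.LogInformation(
            "CACHE MISS for product {Id} ({ElapsedMs} ms)", id, stopwatch.ElapsedMilliseconds);
        return product;
    }, TimeSpan.FromMinutes(10));
}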
Call the endpoint twice and you should see exactly one "CACHE MISS" log line, with the second response served from cache and noticeably faster. If you see the miss on every call, your key is varying when it should not (a common cause is including a timestamp or request-specific value in the key).
You now have three caching layers and a clear rule for each: in-memory for raw speed on one box, Redis for consistency across many, and response or output caching to skip work entirely. Just as importantly, you have stampede protection and graceful degradation, the details that separate a cache that helps from one that becomes a liability.
The final post in this series turns to observability, so you can actually see what your cached, authenticated, job-running API is doing in production: structured logging with Serilog and a PostgreSQL sink.