From a Slow Monolith to Account Statements in Under 5 Seconds
A trading platform's account statements — viewed on screen by tab, and exported as full PDF and Excel reports — ran on an ageing monolith with slow, legacy report generation, over data spread across several databases and services. We rebuilt the statement engine into .NET microservices that read every source in parallel, cache results in Azure Blob, and generate exports with modern libraries — returning most statements within five seconds.
The Problem
An account statement is the document a trading client relies on to understand what they hold, what they've earned, and what they've realised. On this platform it is consumed two ways: viewed on screen a tab at a time — summary, open positions, closed positions, dividends — and exported as a full PDF or Excel report covering every tab. The report export is the heart of the feature; it is what most users ultimately download.
The whole thing ran inside a monolith. A single codebase resolved the user, applied their regulatory context, pulled together data that did not all live in one place, and rendered the statement — and the PDF and Excel files were produced by old, slow export libraries. As the firm added instruments and clients year over year, generation times drifted upward, and a full report over a wide date range had become a genuinely long wait.
The data layout made it harder. The core figures — positions, dividends — live in the main database, but a complete statement also needs additional data held in other databases, plus reference data owned by other services. The monolith pulled these together one source after another, so the slowest set the pace for the whole statement.
The trigger for the work was a client-experience redesign, and it came with a hard requirement: most statements back within five seconds. The existing monolith could not meet that target — not for a full report, and increasingly not even for narrower ranges — and tuning it in place had long since reached the point of diminishing returns.
The Constraints
- A five-second response target. The client set a hard SLA: the large majority of statement requests had to return within five seconds, with a graceful path for the ones that could not — never a hung screen and never a timeout.
- Data spread across many sources. A statement composes the core Azure SQL database (positions and dividends — and, via a linked server, additional data in other databases) together with reference data from other microservices reached over their APIs. Reading everything sequentially was a primary cause of the slowness.
- Full reports versus single tabs. An on-screen tab reads only its own section and is cheap; a full PDF or Excel export reads every section and is an order of magnitude heavier. The design had to keep the common, light case fast without letting the heavy case block anything.
- Large cached payloads. A composed section — and certainly a full report — is a sizeable object. Redis is optimised for small values; these payloads were large enough that the cache needed a store suited to bigger objects.
- Unbounded historical ranges. The period selector runs from "this month" to a custom range covering a user's entire history — ten years or more. A full export over such a range cannot be guaranteed within five seconds.
The Architecture
Every request resolves the user's context — user ID, regulation, period, and whether it wants a single tab or a full report — and then attempts the Azure Blob cache before touching any source. On a miss, the statement service composes the data by reading in parallel from the core Azure SQL database — whose tuned stored procedures read the main tables and, through a linked server, additional data in the linked databases — and from the other microservices over their APIs. The composed result is written back to Blob, which doubles as the file the user downloads.
Two decisions did most of the work. The cache key is composed from user ID — never the username — regulation, period and tab, so identical requests share a blob and the second reader never recomputes what the first already paid for. And the cache lives in Azure Blob rather than Redis because a composed statement is a large object: Redis shines on small values, while these payloads were big enough that Blob was the better store — with the bonus that the same blob is the file the user downloads.
Implementation Highlights
Read the Cache by Trying, Not by Checking
The obvious way to read a cache is to check whether the entry exists and then download it. But against Blob Storage, an existence check costs the same round-trip as the download itself — so checking first makes every cache hit pay twice. Instead we just download by key and treat a "not found" exception as a miss. A hit is one round-trip; a miss costs an exception we were going to pay for anyway.
public sealed class StatementCache
{
private readonly BlobContainerClient _blobs;
public StatementCache(BlobContainerClient blobs) => _blobs = blobs;
// Key is the user ID (never the username), regulation, period and tab —
// everything that changes the output, so identical requests share a blob.
private static string KeyFor(StatementRequest r) =>
$"{r.UserId}/{r.Regulation}/{r.Period}/{r.Tab}.bin";
public async Task<StatementPart> GetOrAddAsync(
StatementRequest request,
Func<StatementRequest, Task<StatementPart>> compose,
CancellationToken ct)
{
var blob = _blobs.GetBlobClient(KeyFor(request));
try
{
// No ExistsAsync() first: that check is a whole round-trip, the
// same cost as the download — so on a hit we'd pay for two.
var hit = await blob.DownloadContentAsync(ct);
return hit.Value.Content.ToObjectFromJson<StatementPart>();
}
catch (RequestFailedException ex) when (ex.Status == 404)
{
// Miss — compose once, then persist for the next reader.
var fresh = await compose(request);
await blob.UploadAsync(
BinaryData.FromObjectAsJson(fresh), overwrite: true, ct);
return fresh;
}
}
}
Compose Every Source in Parallel
A statement is stitched together from the SQL database — whose stored procedures read the main tables and, via a linked server, the linked databases — and from other services. The monolith read these in sequence; we read them concurrently, so a section is only as slow as its slowest source rather than the sum of all of them. A full PDF or Excel report runs its sections in parallel too — and any section already cached from on-screen browsing is reused, so an export after a browse only queries what is genuinely missing.
// One section, composed in parallel from two app-level sources: the SQL
// stored procedure (which itself reads the main tables AND, via a linked
// server, the linked databases) and the other microservices' APIs.
public async Task<StatementPart> ComposeAsync(
StatementRequest r, CancellationToken ct)
{
var sql = _sql.LoadSectionAsync(r, ct); // main DB + linked server
var service = _api.GetReferenceDataAsync(r, ct); // other microservices
await Task.WhenAll(sql, service);
return StatementPart.Merge(sql.Result, service.Result);
}
// A full report is every tab — so the tabs run in parallel, and cached
// tabs (e.g. from on-screen browsing) cost nothing to reuse.
public Task<StatementPart[]> ComposeReportAsync(
StatementRequest r, CancellationToken ct) =>
Task.WhenAll(AllTabs.Select(tab =>
_cache.GetOrAddAsync(r with { Tab = tab }, ComposeAsync, ct)));
Generate in the Background, Race the Response Budget
Rather than decide up front whether a request is "big", every request is generated in the background and raced against the response budget — five seconds here, but read from central config (CCM) rather than hardcoded, so it can be retuned per environment without a redeploy. Beat the budget and the response is returned inline; miss it and the work simply keeps going, while the user is told the statement will arrive by email. The caller never blocks on a slow read, and nothing is thrown away — the same generation that missed the budget produces the emailed file.
public sealed class StatementResponder
{
private readonly IGenerator _generator;
private readonly IEmailSender _email;
private readonly ICcmSettings _ccm;
public StatementResponder(
IGenerator generator, IEmailSender email, ICcmSettings ccm)
{
_generator = generator;
_email = email;
_ccm = ccm;
}
public async Task<StatementResult> RespondAsync(
StatementRequest r, CancellationToken ct)
{
// The budget is an operational knob, not a constant — read it from CCM
// (central config) so it can be retuned per environment without a deploy.
var budget = TimeSpan.FromSeconds(
_ccm.GetDouble("Statement.ResponseBudgetSeconds"));
var work = _generator.GenerateAsync(r, ct);
// Beat the budget → return inline. Miss it → keep generating in the
// background and email a link, so the caller never waits on a big read.
if (await Task.WhenAny(work, Task.Delay(budget, ct)) == work)
return StatementResult.Ready(await work);
_ = work.ContinueWith(
t => _email.SendDownloadLinkAsync(r, t.Result),
TaskScheduler.Default);
return StatementResult.Preparing(
"Your statement is being prepared — we'll email a download link.");
}
}
The Outcome
The headline number — 95% of requests inside five seconds — was the client's requirement, but the change users felt was the everyday one. Reading every source in parallel instead of in sequence, serving repeat views from the Blob cache, and generating exports with a modern library together turned the common cases — a single tab, or a report for a period already browsed — into something that simply appeared rather than something you waited for.
The honest limit is still the data. It keeps growing as the firm lists more instruments and onboards more clients, and a full export over a multi-year custom range remains genuinely expensive to read. The five-second race turns that from a failure into a predictable experience — the user always gets their statement, by email if not on screen, and never sits in front of a frozen page — but the cost of the largest reads is the frontier any future work has to keep pushing back.
What We'd Do Differently
The Blob cache is reactive — it fills on the first request for a given user, period and tab. For predictable, common periods (this month, last quarter, year-to-date) we'd lean further into pre-generating those statements on a schedule, so they are always a cache hit rather than making the first user of each period pay the full composition.
We'd also formalise cache invalidation earlier. Keying by period and regulation is clean, but a late-arriving transaction or a correction against an already-closed period can leave a stale blob behind. We handled this case by case; with hindsight we'd define an explicit invalidation signal — when a transaction touches a period, evict the affected blobs — from the start.
The deepest constraint was always reading across the main database, the linked databases it reaches over a linked server, and the other services for every statement. Parallelism hid most of the latency, but the fan-out itself is fragile — any one source being slow drags the whole statement, and a linked-server hop has its own failure modes. A consolidated read model that projects the data a statement needs into one place would take the cross-source dependency off the hot path entirely, instead of racing it.
If You're Solving This Today
In 2026 we'd design that read model up front — a single projection a statement can be built from, kept current from the underlying sources — and reserve the cross-source parallel reads for the rare cases the projection doesn't cover. We'd keep Azure Blob as the cache and the download artifact, since it suited the large payloads and the email path reuses the same blobs, and we'd add scheduled pre-generation of common periods so the everyday statement is never composed on demand at all.
Related Case Studies
Questions about account statement performance.
- What response time is achievable for generating statements over a large database?
- In this engagement, about 95% of statement requests returned within the client's five-second target. Every request runs in the background against that budget; anything over it — typically a full PDF or Excel export across a large or full-history range — keeps generating and is delivered to the user by email as a downloadable file, so the page never hangs.
- Why cache statements in Azure Blob Storage instead of Redis?
- A composed statement section is a relatively large object, and Redis is optimised for small values. These payloads were large enough that Blob Storage was the better fit — and the same blobs serve directly as the PDF and Excel downloads, so the cache and the deliverable are one and the same.
- How do you generate a statement whose data spans several databases and services?
- The statement is composed by reading the core Azure SQL database in parallel with the other microservices. The database's stored procedures read the main tables and, through a SQL linked server, additional data in the linked databases — so one SQL call spans both. That call and the API calls are awaited concurrently, so a section is only as slow as its slowest source.
- Why is a full PDF or Excel export slower than viewing a single tab?
- A single on-screen tab reads only its own section, so it stays within the five-second target almost regardless of the date range. A full report contains every section, so an uncached export reads all of them — though sections already cached from on-screen browsing are reused, and any still missing are queried in parallel.
Slow Reports Frustrating Your Users?
The fastest way to find out if we can fix it is a two-week Discovery Sprint — a fixed-price diagnosis of your backend performance, no long-term commitment required.