Latency problems are sneaky. They rarely announce themselves. They accumulate quietly until someone notices the p95 response time has crept from 120ms to 340ms.
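We profiled three weeks of production traces and found that 60% of request time was spent in sequential database calls that could have been parallelised. It looked like the classic N+1 query problem dressed up in middleware, though strictly it was a waterfall of independent queries awaited one after another rather than one query per list item.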
The fix was conceptually simple: identify all independent data fetches in each request lifecycle and run them with Promise.all. The hard part was untangling implicit dependencies that assumed a specific execution order.
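// Before: three sequential round trips; each await blocks the next call
const user = await getUser(id);
const prefs = await getPrefs(user.id);
const feed = await getFeed(user.id);

// After: all three start together. This assumes user.id === id,
// so getPrefs and getFeed never needed the loaded user.
const [user, prefs, feed] = await Promise.all([
  getUser(id),
  getPrefs(id),
  getFeed(id),
]);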
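
Not every fetch untangles that cleanly. When one call really does need a field from an earlier result, it can still overlap with the independent ones by chaining off the promise instead of the awaited value. The following is a minimal sketch of that pattern, not the code we shipped: the `delay` helper, the `accountId` field, the `loadPage` wrapper and the stubbed fetchers are all hypothetical stand-ins invented for illustration.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds,
// standing in for a database round trip.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Stubbed fetchers (assumptions, not the real data layer).
const getUser = (id) => delay(20, { id, accountId: `acct-${id}` });
const getFeed = (id) => delay(20, [`post-for-${id}`]);
// Assumed for illustration: prefs are keyed by accountId, which is only
// known once the user has loaded, so this fetch depends on getUser.
const getPrefs = (accountId) => delay(20, { accountId, theme: "dark" });

async function loadPage(id) {
  // Start the user fetch and keep the promise rather than awaiting it.
  const userPromise = getUser(id);
  // Dependent fetch: begins as soon as the user resolves.
  const prefsPromise = userPromise.then((user) => getPrefs(user.accountId));
  // Independent fetch: runs alongside the other two.
  const feedPromise = getFeed(id);

  const [user, prefs, feed] = await Promise.all([
    userPromise,
    prefsPromise,
    feedPromise,
  ]);
  return { user, prefs, feed };
}
```

With these stubs the request costs roughly two round trips (user, then prefs) instead of three, because the feed fetch overlaps with both. Note that Promise.all rejects as soon as any input rejects, so error handling for the whole group lives in one place.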
The p50 dropped from 210ms to 90ms and the p95 from 340ms to 140ms, with zero breaking changes to the public API surface.