Lessons from production.
Field notes from real systems — debugging stories, architecture trade-offs, anti-patterns we've paid for in outages. No tutorials, no hot takes. Just what actually held at 3 AM.
Anonymity as an architectural invariant: why "we won't contact you" isn't enough
Promises like "you will never be contacted unsolicited" aren't enforceable on B2B platforms — vendors route around them offline. What works is a technical invariant: contact data lives in a separated sphere, the server only releases it on a mutual match. A property of the code, not a TOS clause.
Static export instead of a webshop: when the shop is physical, the cart is the problem
A boutique with a physical store doesn't need a webshop — it needs a conversion path into the store. We dropped cart, checkout and inventory sync — and shipped a faster, cheaper, more robust product.
Schema.org LocalBusiness: the underrated lever for regional SERPs
A small tradesman site doesn't rank by backlinks — it ranks by structured data signals. Schema.org LocalBusiness with geo coordinates, `areaServed` and `serviceArea` is a direct lever for local queries — more effective than any meta tag, because Google reads it as fact, not claim.
Symptom ≠ Root Cause: How the auto-healer became the real problem
A PostgreSQL primary at 91 % CPU. The auto-healer kills the noisiest query. An hour later: 91 % again. The lesson: quick fixes can lock themselves into an infinite loop if nobody asks which pattern is actually repeating.
Cache without a lock is thundering herd: 14 endpoints, 8 workers, one dead database
Cache expiry is the one moment when parallel workers all get expensive at the same time. Without a per-key lock, every refresh dumps your full load down the slowest path.
Expression index ignored: Why COALESCE in the index didn't match the ORDER BY — 29 500× speedup
A functional index on COALESCE(column, 0) had zero effect. The planner ignored it because the ORDER BY used a subtly different expression. Lesson: expression identity is not a suggestion, it's a precondition.
UPDATE with subquery and LIMIT: When the daemon spins in place
A simple UPDATE pattern that looks correct on small datasets and silently stagnates in production. The cause: a filter in the wrong place kills forward progress.
Commit before async I/O: how a single enricher idled the entire PgBouncer pool
A transaction waiting on an HTTP reply is invisible in the connection pool — but it holds the slot. With twelve parallel daemons that's enough to push a whole backend to 502.
Batch finalisation per container: why the monitor showed nothing for 83 minutes
A pipeline with N parallel sub-jobs finalises its status at the batch level — every worker is running, but the monitor reports standstill until the last container finishes. The fix: finalise per container, not per batch.
Behind every lesson sits a real Case.
Take a look at the systems these insights came out of — or bring your own problem straight to the table.