A few decades ago, I lost marks on a college assignment¹. The brief called for an evaluation section: what went wrong, what I learned, what I’d do differently. My submission said, essentially, “it all went fine.” I’d been doing this stuff in my spare time, and even semi-professionally, for years by that point. There wasn’t a struggle to document in the way there would have been if it had all been new.
The marking scheme wanted more of a narrative arc, and I turned in a bit of a non-event.
I still think about that assignment every time I see my team complete a big deployment or migration that goes well.
The best deployments are completely unnoticed. No errors, no downtime, no panicked Teams messages. Just… business as usual. Tuesday becomes Wednesday and nobody knows you moved the entire database to a new provider. This is what good looks like. And it’s sometimes near-impossible to get credit for.
The deployment that gets discussed at the all-hands meeting is the one that went wrong. There’s a story there: the alert that fired at 2am, the rollback that didn’t work, the heroic debugging session, the RCA with its numbered lessons. It’s an exciting story, it has characters and tension and a resolution. Unfortunately, that means people remember it.
The flawless migration has no story. “We switched DNS providers last month” is barely an anecdote.
But the work was real. The weeks of preparation, the rollback plans tested against production-like data, the feature flags ready to disable the new path, the load testing proving the new system could handle the traffic. The careful sequencing so that every step was reversible. The monitoring dashboards watched by people who hoped they’d stay “boring”. It all ends up producing what looks like a non-event – which is exactly what we want. It just doesn’t look as impressive as round-the-clock firefighting.
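To make that concrete: a lot of the “invisible” preparation is just code that exists so the risky path can be abandoned instantly. Here’s a minimal sketch of what a flag-guarded read path might look like during a database migration – everything in it (the `READ_FROM_NEW_DB` flag, the `Database` stub, `fetch_user`) is a hypothetical illustration, not a description of any particular system:

```python
import os

class Database:
    """Stand-in for a database client; purely illustrative."""
    def __init__(self, label: str):
        self.label = label

    def get_user(self, user_id: str) -> dict:
        return {"id": user_id, "source": self.label}

old_db = Database("old-provider")
new_db = Database("new-provider")

def flag_enabled(name: str) -> bool:
    # The flag lives in config or the environment, so the new path can
    # be switched off instantly, without a redeploy: every step reversible.
    return os.environ.get(name, "off") == "on"

def fetch_user(user_id: str) -> dict:
    # Read from the new provider only while the flag is on, and fall
    # back to the proven path on any failure.
    if flag_enabled("READ_FROM_NEW_DB"):
        try:
            return new_db.get_user(user_id)
        except Exception:
            pass  # quietly fall back; the old provider still works
    return old_db.get_user(user_id)

print(fetch_user("42"))  # {'id': '42', 'source': 'old-provider'} until the flag flips
```

And the unglamorous part is that if everything goes well, the fallback branch never runs, and nobody ever finds out it was there.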
How do you write a post-mortem for “nothing happened”? What goes in the retrospective? “We planned it well, then we did it, then it worked” can sound a little silly, even when the work involved was substantial.
There’s a version of tech culture that celebrates firefighting. The person who spends years fixing one outage after another is visible. The person who spent three months making the system resilient enough that there’s no outage is not.
Visibility isn’t the same as recognition. I once spent several evenings finishing a project to avoid missing a client deadline – work that was necessary because I’d spotted the risk early enough to do something about it. When I overslept and arrived late one morning, my manager pulled me aside. Nobody had seen the late nights or the near-miss. “They just see you slacking in the morning when they’re on time.”
So: the work that prevents the problem is invisible (a story only works when it has an audience). The work that delivers it quietly is barely more visible. But the one morning you’re late? Everyone notices.
I don’t have a solution to this in general. But within my own teams I try to make a point of it – calling out the deployment that went flawlessly, flagging when we deliver ahead of schedule and without incident, naming the preparation that made it look easy. It doesn’t fix the wider visibility problem, but it at least means the people doing the quiet work know it’s been seen.
The best deployments are the ones nobody notices. The best migrations are the ones with no war stories. The most competent work often leaves no trace.
Just like my college assignment all those years ago – teams are still getting “marked down” for not having enough problems.
1. I’m totally over it, I say, as I sit here blogging about it two and a half decades later…