• Beau Steward

In Response To Stupid Developers Version 1.1.1.Over 9000

You know, I understand the complaint in the article. It's a stupid problem and it's probably easily solvable. But the article speeds right past the real problem. First, the article:


Yes, there is an engineering problem. And it's possible the engineering group is a bit inept. But this support person who also does operations doesn't get to escape some finger-pointing here.

It's in the beginning. The exchange points out the manager is completely unaware there is a production problem because the problem is masked by support. The manager even asks where the tickets are, and the support person says that's irrelevant.

That's 100% relevant. If you're not communicating the problem to other teams, those other teams are 100% unaware there's a fucking problem.

One of the companies we acquired some time ago had a support team that was given database access and access to other production things they really shouldn't have had. When customers brought up problems, support fixed them in production on their own. Sometimes engineering would learn of these problems and solve them. Often, though, they did not. One of those problems came to light when a support person needed to update 2 rows in a table of millions of rows, and ended up updating all of them instead. This created an outage.

The DBA team was not aware support had this access because it was a legacy thing that escaped access audits. Engineering was also unaware that these frequent changes were occurring. The DBA team demanded that all further production access go through them. Then the complaints and pushback started. When a customer required something done, it was expected to be completed within the hour, which is why support just made the changes and moved on. Going through the DBA team would likely lengthen the response time.

But it would also help avoid mistakes.

Additionally, by having the DBA team handle these things, when something became recurring, there were more eyes on the problem. A repeated task needs a tool built for it. The result of this meeting was a ban on support having direct access to the database, and engineering had some new tooling to build for support.
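To make "a tool built for it" concrete, here is a minimal sketch of the kind of guard that would have caught the 2-rows-versus-all-rows mistake. This is not the tooling that was actually built. The names `guarded_update` and `RowCountMismatch` and the `accounts` table are hypothetical, and SQLite stands in for whatever the production database was. The idea is simply: say how many rows you expect to change, and roll back if the database disagrees.

```python
import re
import sqlite3


class RowCountMismatch(Exception):
    """Raised when an UPDATE touches a different number of rows than expected."""


def guarded_update(conn, sql, params, expected_rows):
    """Run one UPDATE and commit only if it changed exactly expected_rows rows.

    Hypothetical helper for illustration; not a real library API.
    """
    # Refuse statements with no WHERE clause at all.
    if not re.search(r"\bwhere\b", sql, re.IGNORECASE):
        raise ValueError("UPDATE without a WHERE clause is not allowed")

    cur = conn.cursor()
    try:
        # sqlite3 opens a transaction implicitly before the UPDATE runs,
        # so nothing is permanent until commit() is called.
        cur.execute(sql, params)
        if cur.rowcount != expected_rows:
            raise RowCountMismatch(
                f"expected {expected_rows} rows, statement touched {cur.rowcount}"
            )
        conn.commit()
    except Exception:
        # Undo the change on any failure, including a row-count mismatch.
        conn.rollback()
        raise
    return cur.rowcount


if __name__ == "__main__":
    # Throwaway in-memory table standing in for the real one.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, plan TEXT)")
    conn.executemany(
        "INSERT INTO accounts (id, plan) VALUES (?, ?)",
        [(i, "free") for i in range(1, 6)],
    )
    conn.commit()

    # Intended change: exactly two rows.
    guarded_update(
        conn, "UPDATE accounts SET plan = ? WHERE id IN (?, ?)", ("pro", 1, 2), 2
    )

    # Sloppy predicate that matches every row: rejected and rolled back.
    try:
        guarded_update(conn, "UPDATE accounts SET plan = ? WHERE id > ?", ("pro", 0), 2)
    except RowCountMismatch as exc:
        print("blocked:", exc)
```

A guard like this doesn't replace a second set of eyes, but it turns "oops, that was every row" into an error message instead of an outage.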

Had support communicated early on that this was going on, we could have avoided an outage caused by a bad data update. This was a preventable problem that occurred because nobody told the other teams there was a problem.

Back to this article, though. The manager asks where the tickets are and the support person says that's not important. Why should the manager invest any time and resources in a problem that's not important when there are other measurable priorities?
