April 18, 2017
When Jesus said, “Let he among you who is without sin cast the first stone”, what he meant was, “Hey, nobody’s perfect. Well, except me of course”. And don’t we all know that this is true? Strive as we might for perfection, we all make mistakes. Some are just a tad bigger than others. History is replete with examples of this sad fact of life. In 1788, the Austrian army attacked itself and lost 10,000 men; whoever did the soil analysis for the Leaning Tower of Pisa obviously made a few miscalculations; some executive at Decca Records decided the Beatles just weren’t “sellable”; and just how sorry was the Trojan who said, “Hey, there’s a big wooden horse out here. C’mon guys, help me haul this thing inside”? No one wants these types of things to happen, but sometimes they just do. In acknowledging our all-too-human predilection for making some really big boo-boos, I think we should all offer our sympathy to the AWS engineer who fat-fingered a command and took down a few more servers than he intended the other day.
Think about it. The day probably started out like any other. Our culprit got up, showered, ate a little breakfast and then fought traffic all the way to work. We’ve all been there, right? Next, our soon-to-be-infamous engineer stops by the break room, grabs a cup of coffee since the line at Starbucks was too long, asks the random person there, “Did you see that Lakers game last night?”, gets to the workstation and sees a Post-it note on the PC that says, “Pat, please debug the billing system for storage service S3 today”. Because Pat is nothing if not responsive, Pat spends the next hour or so debugging some code, makes one itty-bitty typo, hits “send” and proceeds to take down a bunch of the company’s most popular web services. Sure, Pat’s sorry, but this is the type of thing that you know is going to be brought up at review time.
I think there are a few lessons that we can draw from this unfortunate event. Certainly, “always double-check your work” is near the top of the list, and maybe “take your time”, but I think this episode demonstrates that we all walk a very fine line. Although we endeavor to do things correctly, or make the right choice, our margin of error is often only the width of a keyboard key or the length of a line of code. I believe that we inherently understand this fragility of action, as illustrated by how fast our reactions go from “What a schmuck” to “Poor bastard” and ultimately, “Thank God that wasn’t me”. We will probably never know what happened to this anonymous unfortunate, although it is a safe bet that a promotion to AWS’s VP of Software Engineering is not in the cards, but let it serve as a reminder to us all that, “To err is human; to forgive, divine”.