The only bug-free code is the code that doesn’t exist. As a software engineer, you read this and smile, and I know why. First, you know such code doesn’t solve many problems. Second, writing code still pays better than not writing it. As a side effect, we create bugs. And nothing’s more frustrating than the alerts when something blows up in production. On weekends. We’ve all been through this, and such debugging sessions left some of us with lifelong memories.
Thanks for the mention, I'm glad you found the article on interviews useful!
I've loved postmortems ever since I wrote one for an issue I caused.
1. It's a document where the writing matters (at least in Amazon culture). This is definitely what got me started in writing.
2. You focus on putting mechanisms in place to work better next time. It's not only about this occurrence of an issue but about how to prevent, detect, and mitigate future occurrences better.
I don't think operations can be just a training you do once and that's it.
You wouldn't rely on self-service training alone to become a firefighter or an emergency room doctor. Their training has to include drills on how they would react to those emergencies.
I think we have a lot to improve on the training side to make sure everyone has the working principles you described: visibility into your work, prioritizing mitigation over root-causing...
Loved the broken commits screenshot, definitely guilty of that too 😂
I would flip the order and start with prioritizing. I think the biggest mistake engineers make is not consulting anyone before trying to fix a bug. I have often seen cases where a hotfix caused a much bigger issue than the bug it fixed.
Once I've identified the scope of the problem, I would first talk to my team leader to prioritize.