Anyone who has had to sit through a meeting to determine why some IT system went down is probably groaning at this title. For IT managers and engineers alike, it is the least desired activity following a system failure of any kind. Once the dust settles and the system is restored to working condition, business and product owners outside of IT are waiting to have two primary questions answered:
- Why did the system go down in the first place?
- What is IT going to do to make sure this doesn’t happen again?
In the first article, I outlined the business context of the root cause analysis exercise and the difficulty of arriving clearly and logically at a true root cause for a system outage, given the interconnected players involved. In the previous article, I outlined a particular approach an IT engineer can take to participating in the root cause analysis process, entitled “Surprised and Confused.” This article introduces “Openly Be the Hero”:
IT Engineering Participatory Approach B = Openly Be the Hero
“I know what happened, the temporary storage volume …..”
This approach, which is diametrically opposed to the surprised and confused approach, comes with some different risks. Standing up and sharing every technical fact you can get your hands on to point out what is really going on can backfire in exactly the opposite way from the surprised and confused option. People tend to latch on to the one spouting off all the undeniable facts, and suddenly the masses see the one with all of the answers as the one who was in a position to have avoided the problem altogether. As for your management, if they aren’t on board, you’ve made it difficult for them to be supportive if the tide turns toward the root cause being the hero’s perceived lack of involvement. Your peers, fearing their jobs might be in jeopardy, will most likely slink down in their chairs, stay quiet, and allow you to stand tall and take the proverbial daggers of blame.
Now, if you are one who has put in the extra energy to understand how the system or systems were constructed, and the “why” behind the seemingly architecturally backwards ways certain business processes are completed, you may struggle to avoid the hero trap. You may be thinking: “The facts that I possess clearly indicate without compromise that what I know to be the root cause is the root cause. Why can’t everyone just go with the facts and be done with it?” Not everyone is comfortable accepting the facts, even when they are the facts. What if the facts suggest that a particular individual or group has been linked to the last five system outages? Maybe these five outages are legit and the individual or group is trying desperately to improve their system management activities. The last thing they need is another problem piled on top of their previous problems, putting further pressure on management to take some action. In an effort to save their jobs and buy more time to get out from underneath their pile of problems, they can redirect the masses to focus on the hero’s involvement and thus take the heat off themselves.
“Let me understand, the Hero knew that this problem was going to happen but didn’t do anything to stop it? Why is the Hero hiding knowledge that would help the company? This is yet another example of the Hero not sharing and not partnering. How can the Hero just sit idly by and allow this to happen? Something needs to be done about the Hero …”
And this “something that needs to be done” … and get ready, this is going to make any logical-thinking IT engineer’s head spin … could be as severe as disciplinary action against the Hero. Why would the individual who amassed the knowledge needed to assemble all the puzzle pieces of the problem become the target of disciplinary action? The answer has more to do with organizational hierarchy than with conventional logic. If the individual uttering those statements about the Hero is significantly high on the organizational chart, then the layers below, who have been focusing on all sorts of other fires, are caught without a good story as to why this situation occurred and why the Hero is not the root of all evil. Without a story that shields the Hero, the management layers in between are somewhat constrained, and thus the blame lands on the Hero. For more on the management side of the Hero’s plight, see the articles that cover this in the management section.
Sure, your peers might find you after meetings and give you kudos for standing up for the facts, but is being technically “right” worth the cost of being put through this ancillary pain?
The next article introduces the hybrid approach, which I’ve entitled “Play it Safe.”