Over Architected?

First off, I am by no means an “Agile” (nor even “agile” with a small “a”) software development expert.  Many people out there are far more experienced and in a much better position to speak authoritatively about the Agile methodology and its principles for driving efficient and effective software development.  A few blogs where I consistently find great Agile perspectives are listed at the end of this article.  But having participated in Agile and agile-like development efforts over the years, I’m noticing an interesting pattern.  Agile and agile-like approaches have the positive byproduct of reducing the occurrence of over-architected software solutions.  Over-architected solutions put stress on the delivery of a software project and drive up the cost of development and maintenance, generally out of proportion to the business value produced.

As an example, a sample development effort starts out with:

Product: “We need a super widget in the product by next release, can we have that?”

Project: “We are going to need detailed requirements for the super widget in order to start developing it.”

Product: “Oh, it needs to be able to interface with the dry cleaners to know when it is time to pick up the laundry as well as make coffee for the customer before they get out of bed in the morning.  Basically the same features as our competitor’s recent release, but with these additional benefits. <Or some other description that is actionable at a high level, but lacks the detailed requirements needed to feed a development team with actionable development tasks>”

Development: “Ok, we’ll get started since we don’t have much time before the code freeze for the next release.”

… Time goes by …

A project status meeting occurs sometime in the future.

Product: “So, where are we with that super widget?”

Development: “We have the basic framework set up, but it isn’t going to have all the features working by the next release.”

Product: “But I thought …”

Development: “But you said …”

Project: “What happened?  How come we are X days from the code freeze and we don’t have a viable solution?  I thought …”

Now sure, this isn’t a perfect example of how requirements drift between stakeholders leads to an over-architected solution, but hopefully you get the scenario.  Another example arises after a solution has been developed and released into the production environment.  Extending the example above, weeks later, when enhancements, tweaks or feature extensions are requested, a tense conversation occurs along these lines:

Product: “But I thought the super widget would do X?  How come I am hearing it will take 30 hours of development to get X?  The testing cycle is already elongated due to the complexity of the super widget, so I assumed it included X.”

Development: “But we said that the framework would support X, but we never said X would actually function without more development!  We developed the super widget to do W, Y and Z but only stubbed out X.”

Project: “But according to the requirements, it says X should be …”

And yes, an argument can be made that if:

  • more effective requirements gathering had occurred
  • more effective project management had captured in more depth what would be developed and available when
  • more effective product management had defined a more exhaustive product feature road map that more clearly outlined which features would and would not be available when

… These problems wouldn’t have occurred.  But the nature of an agile-like approach puts a tighter focus on all the stakeholders:

Product: They can share the “overall vision” of what they ultimately desire the product to do, but they are forced to consider what they really need within the shorter duration of the agile-like release schedule.  Thus, product walks away with a clearer picture of what they are really getting in the next release.

Development: They get the benefit of the product’s “overall vision”, yet, they get to quickly dive into the critical features and start the dialog of how long different feature components will take to develop.  Thus, development knows exactly what they need to do now for the next release, yet they benefit from knowing where this product feature is going in the future.

Project: As long as they keep the product and development stakeholders talking about granularly defining what needs to really be built by when, the project management function has much greater clarity into what is going on and what details to track.

From what I am observing, all of the above creates a stakeholder forum for information sharing that reduces the likelihood that an over-architected application will get developed.  Most importantly, the feature set is no longer left open and vague enough for a creative and motivated development team to start building and building and building, only to resurface with a highly complex solution to a loosely defined problem or need.  Instead, the approach brings more cohesion around what is really needed first.  Once the “first” has been built, the “seconds” and “thirds” get built in line with the product roadmap.

In researching this theory, I wasn’t able to find any articles that linked agile-like development efforts to a direct reduction in software over-architecting.  That said, the article “Agile Architecture: Strategies for Scaling Agile Development” had some interesting content on baking architecture into an agile-like effort.

Does anyone else have direct experience of agile-like development efforts yielding less over-architecture than waterfall-like efforts?  Can anyone share links to good web articles on this topic?

Agile related blogs I follow:

David’s Software Development Survival Guide

http://softwaresurvival.blogspot.com/

NOOP.NL

http://www.noop.nl/

Software Project Management

http://blog.brodzinski.com/

fragile

http://fragile.org.uk/

Regular Geek

http://regulargeek.com/

Critical Results

http://criticalresults.com/

IT Engineering Spectrum

Someone noticed, when reviewing my work experience on LinkedIn, that over my tenure, I’ve had the opportunity to manage both IT infrastructure teams and IT software development teams.  They posed the question:  What are the differences between managing the two different teams? This question got me thinking.  People that focus on infrastructure compared to software development do represent potentially opposite sides of the IT engineering spectrum.

With that question in mind, I recently had the chance to interview Jim Shelton, Executive Recruiter; Information Technology for MRI North Canton, to get his perspective on the differences he sees when Midwest IT managers look to hire infrastructure versus software development resources.  Given his 11 years of sourcing IT talent for Midwestern companies, I thought his perspective would be valuable and, no surprise, it definitely was.  Below is an excerpt from that interview:

QUESTION: Jim, can you share a bit more about your role and history in recruiting IT talent for Midwestern companies?

ANSWER: I sure can.  I work for Management Recruiters – North Canton, Inc.  I started recruiting in the IT space back in April of 1999.  I have truly found the thing that I really love.  You see, my background is that of an IT professional.  I have a Computer Science degree and started my career as a software developer back in the late ’80s.  I believe this foundation, along with my desire to help people improve their lives and realize their dreams, is what contributes to my success in recruiting.  Because of my experience, I have been recognized several times within and outside of the MRI organization for effectively solving my clients’ staffing problems.  The accomplishment that I am most proud of is that over 70% of the people I have helped to find permanent positions are still with the companies where they were hired.  This compares with the industry average of less than 40%.  This is the major reason I enjoy a better than 50% rate of repeat business from the companies I consider my clients.

QUESTION: So, when a hiring manager approaches you for a talent search, do you approach developing a search plan for infrastructure resources differently than software development resources?

ANSWER: Not at all.  The search process boils down to having relationships (people you know), initiating conversations, sharing information, and asking if they have any suggestions.  It’s simple networking.  This could be described as the basics of executive recruiting.  Where things differ is in having the knowledge and skill to effectively target the type of individual that the hiring manager is seeking.  The best performing recruiters ask the right questions to gather the pieces of information that are going to help them home in on the “right fit” person for the position being described.  A common fallacy is that a job description does an effective job of conveying the type of person that a client is seeking.  A simple example is this.  You say that you want some M&M’s.  I go to the store and I buy plain chocolate M&M’s.  I give them to you and figure my job is done.  You say, this isn’t what I wanted.  OK, what did you want then?  You said M&M’s, and these are M&M’s, aren’t they?  You say, I guess that’s right, but I wanted the Peanut M&M’s.  So, I go to the store and I buy Peanut M&M’s.  I give them to you and figure my job is done.  You say, this isn’t what I wanted.  I think you can begin to see my point.

So, the whole process begins with a thorough initial interview with the hiring manager and then we develop the search plan.  All search plans will include utilizing direct phone calls to targeted individuals, e-mails to both targeted and more general contacts, e-mails and phone calls to contacts gathered from internet resources, posting to internet job boards and social media sites, name gathering activities targeting companies with similarities to the search requirements, etc…

To summarize, the way you develop a search plan doesn’t differ.  The way you approach the resources doesn’t differ either.

QUESTION: Obviously the technical skills needed are quite different between the two, but what commonly exists across both roles?  I am thinking there may be some common soft skills … what are your thoughts?

ANSWER: Another common fallacy is that recruiters are always looking for the “A” players.  I commonly describe these professionals as having the full package: excellent technical skills, excellent soft skills, and superb business acumen.  The reality is that the need of the client determines whether Superman is appropriate or if “Clark Kent” will do!  Once again, it all comes down to the defined need.

The soft skills that are generally important are communication skills, both verbal and written, and general business knowledge (this is what I refer to when I say business acumen).  Another common characteristic that differentiates technical talent is industry experience.  Companies love to hire people that already have a general sense about their business category.

Some more specific “soft skills” would be things like leadership, tact (political awareness), personality, critical thinking skills, and attention to details.

QUESTION: In your experience in interviewing full time or contract individuals looking for work, do you find the two IT disciplines have common non-technical candidate traits or do you find a definite split between infrastructure candidates and software development candidates?

ANSWER: It’s bad practice to generalize things with a broad brush because there will always be exceptions.  So, I don’t think there are common traits in either group that would differentiate them, one from another.  You know they are all people.  I’ve always said people are people.

QUESTION: Have you ever been asked to source a senior candidate that can perform both roles equally well?  Do you find those individuals with deep skills in both areas easy or hard to find in the market?

ANSWER: It’s pretty common to think that small to mid-size companies would ask me to locate an IT manager that can manage the entire IT function, both the software development side of things and the infrastructure.  It is true that I have found it is a unique person that has grown up technically with expertise in both software development and infrastructure.  So, most IT managers in companies of this size manage best the area where they are most experienced.  It is not uncommon for a company to have high turnover within one of the two disciplines simply because the area that doesn’t perceive its value to be equal to the other will have employees that don’t feel as appreciated and valued as those in the other.  Even though this is the case, companies don’t generally turn to me to surface these types of managers, because I charge a fee for my service and companies that are looking for a “specialist” are more inclined to use my service than a company that believes it really only needs a generalist.  This is a mistake that is made because the leaders of the company interpret someone that has a background in both IT disciplines as a generalist and not a specialist in any one area.  For purposes of this discussion, if a company were looking for a leader of its IT function and wanted that person to be equally experienced in leading both software development and infrastructure functions, it would have a difficult time locating that person, because it is most common that an IT leader has grown up technically on one side of the house or the other.

I do think it is an uncommon skill set and these types of individuals are difficult to locate.

QUESTION: Based on your experience, do you have any candidate coaching tips as far as how to position their resume and how to present themselves for an infrastructure position to a Midwest company in the current job market?

ANSWER: My advice in this market is no different than in other markets.  The key to finding positions is having relationships that you can network with to help you locate openings.  Experience tells me that a person referred by someone that has a connection with the decision maker, however remote that might be, is more likely to receive an interview than one that has no connection at all.  I know this won’t sit well with those that are unemployed right now, but the best way to position yourself for your next position is to be involved in the community of like-skilled professionals while you still have your current one.  The relationships that you develop while you are employed are the ones that you will connect with when you are looking to make a move or are unemployed.  If you work in a large company, get to know as many people as possible within the company and stay in touch when they leave; you may need them in the future.  Also, join organizations, offline and online communities, volunteer, etc…  It really doesn’t take that much time; it just takes effort.

Sure, you want to use job boards and other sources of advertised jobs, but don’t limit your search efforts to these.  If you want to get into a company, use your resourcefulness to locate connections that might know someone in the company and ask them to hand deliver your resume or forward it from their e-mail to the proper person hiring.  They don’t have to endorse you; remember, you’re just trying to get the opportunity to interview for the opening.  After securing the interview, you will have to do the rest!

QUESTION: Same as the previous question but for software development candidates?

ANSWER: Answer is the same.

QUESTION: Do you have any advice for candidates that are currently in house and are looking to become a contractor or vice versa in today’s Midwest IT job market?

ANSWER: My experience tells me that most independent contractors became contractors due to their current employer making the decision to outsource the job they were currently doing and the management approached them to become a contract employee instead of an in-house employee.  This is the ideal way to become a contract employee.  Other than that, becoming a contractor requires a special person that not only enjoys doing their IT jobs but, also, enjoys the challenge of locating the next assignment.  I would say that the grass is not always greener on the other side.

At this point in time, rates for most contractors are at an all-time low.  There is less and less contract work and more and more contractors that are available for the work.  If you remember your economics classes, I think you can understand what I am saying.

QUESTION: Lastly, for both infrastructure and software development positions in Midwest IT companies, what are the most in demand skills for each that you are being asked to source?  Is there a noticeable trend for more contract or more in house positions for each?

ANSWER: I’m the type of person that always tries to answer questions directly and this question frustrates me because I have to answer it with a somewhat vague response.  I am definitely seeing more demand for contributor (non-manager) level positions than anything else.  The mix of demand for infrastructure resources vs. software developers is about the same.  But, since I don’t really specialize in any one technology, the technology mix is all over the place: VB.NET, Java/JEE, Citrix, VMware, AIX, C#, SQL Server, T-SQL, DB2, MCSE, MCSD, CISSP, Oracle Forms, PL/SQL, Windows, AS400 (iSeries), etc…

Across the board, the trend is leaning more toward contractor vs. perm.  As a whole, companies continue to be cautious about adding headcount.

This has been a very difficult 24 months.

Jim, thanks much for taking the time to sit down with me and chat about what you see in the Midwest IT market for both infrastructure and software development positions.  As always, I appreciate your experience and wisdom in this area.

Look for Part 2, to be posted soon, which dives into the aspects of managing these two different teams.  Plus, I’ll argue that a third team also exists within the IT engineering team spectrum.

Know when to call in some help during an outage

Anyone that works in the Information Technology field knows that production technology systems, from time to time, will have problems. From a functional defect that has everyone scratching their heads as to how it wasn’t discovered by seemingly endless rounds of QA to full-blown hardware failures that take down entire suites of applications, no matter how much is invested in “highly available” and “redundant” technologies, failures are bound to occur. For IT Managers and IT Engineers, how one handles these failures from inception through service restoration and finally root cause analysis is critical. Sure, the priority is to restore full service availability as soon as possible. But if you neglect some key technical support quality attributes in the process, which I’ll highlight in this series of articles, you may find you both succeeded and failed in restoring service at the same time. Succeeded and failed at the same time, you wonder? Please read on and I will attempt to shed some light on this success-with-failure construct and considerations on how to avoid the failure “pitfalls”.

Pitfall = Challenges in an Extended Outage

So, you’ve bought into the need to be responsive based on a previous article touting the benefits to you (being viewed as a leader, plus raise and bonus positives) and your organization (calmly restoring production IT services to normal working order). You’ve communicated in a personal style with incremental positive facts and indicated at what points you will be updating the stakeholders on your progress, as described in the previous article. If the problem can be easily identified and corrected quickly, with a rather direct way to explain why it happened, pat yourself on the back for a job well done. Now get ready for the aftermath of re-explaining what happened a handful of times over and possibly participating in some post-issue shoring up of the technology (see the root cause analysis considerations posted here previously). But what happens when the status reporting goes on longer and longer, and you can tell that the natives are getting restless, growing concerned at the length of the outage and at the lack of a clear “it will be fixed in 5 minutes” status report? When an outage becomes an extended outage, it is time to ratchet up the communication plan and bring in some help.

Problem isn’t Obviously Fixable in Short Order? Get Help

Most likely, as time goes by, more people are aware of the outage and thus the list of stakeholders is growing larger. Also, the likelihood that those stakeholders are senior technical people offering to give you a hand is slim and none … and slim left town, as the saying goes. I would venture to say that the stakeholders are a growing list of non-technical people that are impacted in some way by the production situation continuing to be a problem. More and more managers on the operations and product side of the service are getting engaged as customer complaints mount or call center call volumes reach levels of concern. There may be more people engaged to discuss what to do if the outage continues and an alternative, possibly more manual, means is needed to meet customer SLAs. By the way, manual usually means more work done by people, hence more people getting engaged to see if they have to bring in even more people to ensure the alternative service delivery option has the right skilled and trained staff. Company marketing resources could be engaged to offer advice on how best to let customers know the service is having a longer than normal outage and what the company plans to do to service their needs. I am not trying to paint a picture of doom and gloom for the primarily technical audience for this article. I know the technical mind wants to have all the people just stop talking so the real work of fixing the technology can take place. But on the business side of the technology in trouble, there are company stakeholders and customers of some form or another that are materially impacted by having the usually highly reliable technology fail to function correctly.

Thus, as time goes by, your incrementally positive but not “it’s fixed” communications aren’t enough to appease the masses. You are either going to have to spend more and more time explaining to new people joining the situation what happened when, what has been ruled out, what is next to investigate, etc., or risk becoming non-communicative in order to get some focused time to fix things, thus putting all your hard work at risk as outlined in this previous article. It is time to ask for some help.

Hopefully you have already engaged your management to keep them apprised of the situation as suggested in this previous series of articles. If so, they may already be asking the “hey, you are doing a good job, but can we help?” type of questions.

Ask for and accept help

I can’t stress it enough: avoid the notion that the fix is “just around the corner and if I only spend 10 more minutes researching …”. Ask for and accept help. To start, get someone engaged to be the status communicator so you have fewer distractions and more time to dig into the problem. The status communicator needs to have a level of competence in the following skill areas:

  1. Enough of a technical background to take technical status bits from you and quickly understand what you are saying without a 5-hour whiteboard deep dive session.

  2. Ability to communicate in “business speak” not “techno-speak”.

  3. Enough understanding of the players involved organizational chart-wise to know how and when to communicate with stakeholders and when to recognize the VP of Product is looking for status and it is time to get your VP peer manager involved.

Your manager is in the best position to act in this capacity if they aren’t already doing so. Managers, you stand to lose huge management credibility and leadership points if you just sit on the sidelines and hope the problem goes away, or if you are somehow hoping for plausible deniability to relieve you of your responsibility in this situation. Roll up your sleeves and get engaged. Start sharing what is going on in a polite but authoritative tone to build confidence and, most importantly, buy more time for your engineers to dig in, figure out what is going wrong and fix it.  This previous series of articles offers additional tips.

In summary, as the outage is dragging on, be mindful that not everyone involved has the priority of discovering the coveted technical root cause. Engineers, as an extended outage is building, don’t keep trying to take on the roles of both technical investigator and communications expert. Get help. Managers, get involved and start shielding your engineers from the constant barrage of status requests, so they can give more focused attention to digging in, finding out what is really going on and getting it fixed.

We’ve extended the need for responsiveness to reports of production support problems to include an initial take on the art of creating an effective status communication approach, as well as when to admit you need help and get your manager and/or team lead involved directly. Look for additional articles to identify more technical support pitfalls and steps to take to avoid them.

Respond and forget, right?

Anyone that works in the Information Technology field knows that production technology systems, from time to time, will have problems.  From a functional defect that has everyone scratching their heads as to how it wasn’t discovered by seemingly endless rounds of QA to full-blown hardware failures that take down entire suites of applications, no matter how much is invested in “highly available” and “redundant” technologies, failures are bound to occur.  For IT Managers and IT Engineers, how one handles these failures from inception through service restoration and finally root cause analysis is critical.  Sure, the priority is to restore full service availability as soon as possible.  But if you neglect some key technical support quality attributes in the process, which I’ll highlight in this series of articles, you may find you both succeeded and failed in restoring service at the same time.  Succeeded and failed at the same time, you wonder?  Please read on and I will attempt to shed some light on this success-with-failure construct and considerations on how to avoid the failure “pitfalls”.

Pitfall = Providing Status

So, you’ve bought into the need to be responsive based on the previous article touting the benefits to you (being viewed as a leader, plus raise and bonus positives, which are always good) and to your organization (calmly restoring production IT services to normal working order).  So, all you have to do is “respond” by sending an email right away, jumping on a conference line quickly or changing a status in a production trouble ticket tracking system promptly and you are done, right?  You can now disappear into the depths of your log files and your performance counters and your packet traces, only to resurface when you have found the real cause of the problem, right?  Never underestimate the extent to which people lacking timely information will panic.

To help illustrate, we can extend the example from the responsiveness article of needing a plumber to call you back quickly to address the hot water heater that is pouring water all over your basement floor and not delivering any hot water to any faucet in your house.  Consider that a plumber does call you back promptly to indicate they are able to start looking into your leaking hot water tank right away.  But after that responsive call back, time keeps ticking by without any indication if your tank can be fixed or needs to be replaced or is about to explode and flood your basement in the process.

Note: Yes, you can walk down into the basement to physically see the plumber’s progress or lack thereof, but pretend you can’t easily do that to allow this extended plumbing example to help frame the context for this article.  Let’s say you left your home for work right after you confirmed the plumber was engaged to fix your problem.

So, without any further status from the plumber besides his or her initial “Yes, I will look into your hot water tank problem right away”, how do you know what is going on?  The plumber could be minutes away from turning off the water main to stop the river forming in your basement, followed quickly by unloading a brand new hot water heater from a delivery truck approaching your house, or could be sitting down on the couch to catch a baseball game on TV, completely ignoring your water dilemma.  You don’t know, unless you are physically watching the plumber’s every move or the plumber is providing frequent status as to what is going on with your hot water crisis.

Frequently provide status

So, how does one keep the panic to a minimum once initially responding to the production issue?  Reduce panic by frequently communicating status of what is going on in the troubleshooting process.  This sounds simple enough, just keep everyone informed:

  • “I just VPNed into the network”
  • “I am pulling up a terminal session with the server now.”
  • “I am typing my user name.”
  • “I am typing my password.”
  • “Ooops, wrong password, trying again.”
  • “I am now at a command shell …”

Obviously, that is going too far into the over-communication side of the status equation.  What you are trying to find is the artful balance in the level of detail and frequency with which to share status.  As in all things technological, there is no silver bullet, no industry-established checklist and no “do this and it will work for every situation” written on a stone tablet somewhere to implement with guaranteed success.  One has to put some energy into looking for clues as to what is going to work best in the given situation and then constantly monitor the results of your communication approach to tweak as necessary.

But this sure seems like a lot of work that doesn’t get directly at fixing the true technical problem?  Correct.  As I mentioned previously, you can dedicate all your efforts to fixing the problem as quickly as possible, but be prepared for the consequences of various negative backlashes surrounding non-technical and peer management’s frustration of being left in the dark for who knows how long starting from problem occurrence and ending at problem resolution.  Plus, you can safely anticipate the root cause analysis aftermath being painful and extended due to this lack of communication frustration you have helped create.  Thus, I am arguing the time invested up front in an effective communication approach will pay large dividends in avoiding post service restoration negativity and an elongated investment in root cause analysis malaise.

Art of an Effective Status Communication Approach

So how does one determine a successful status communication approach?  First, suspend your technical or engineering brain that puts speedy problem resolution as the highest priority in any production outage situation.  Recall that once you put aside the technology, people are involved in the production outage.  Harking back to the plumbing crisis example above, if you are at work wondering how much your water bill is going to be as your basement floods, what would be your reaction to getting a call or a note from your plumber saying:

“Hey, this is Bob the plumber, just wanted to let you know I stopped the geyser erupting in your basement.  A replacement water tank is on a delivery truck and should be arriving at your house within the hour.  I’ll let you know when it gets here and what the next steps are in about an hour or so.”

Imagine the feeling of relief at getting such an update at work.  Now, carry those feelings of relief over to the other people involved in the production outage situation.  They are fretting over lost revenues or having to explain to their management what happened, why, and what is going to be put in place so it never happens again, with absolutely no clue at the moment about the answers to any of those questions.

Can you make everyone relax and go about their day with a smile with a few simple sentences on what is going on?  Not a chance, but you can help keep the people involved more calm and less likely to break out in irrationality by providing indications of where you are in the troubleshooting process.

Consider this revision to the step by step over communication example from above:

“Everyone, this is Bob from systems support.  I was able to get online and successfully access the production server that is hosting the application that is involved in the production outage.  This is a good sign in that we are able to start debugging immediately without any infrastructure barriers at this point.  I will now start investigating the error logs that should give some further technical direction on what is going on.  I will let everyone know what I discover in 15 minutes from now.”

Similar to the status update from your plumber, there are key elements in this status message that address the human side of the outage:

  1. Saying your name

Saying your name seems overly simplistic, but giving your name instead of hiding behind the anonymity of an artificial company group such as “systems support” makes a small but important personal connection to all of the people involved who are likely to panic at a moment’s notice.  This is the same logic behind why people prefer talking to a human rather than interacting with an automated “push or say 1 and then enter your 45 digit account number” system when calling to resolve an incorrect cell phone, gas or electric bill.

  2. Providing legitimate positive news, even if it is somewhat insignificant to correcting the real problem

Again, this seems simplistic, but indicating you were able to get online and get into some level of technology to begin troubleshooting gives additional confidence to the non-technical individuals participating in the outage that some potential barriers to real problem resolution have been crossed.  Look for opportunities to share facts that narrow the problem down, even if they only narrow the problem down ever so slightly.  The feeling of progress that narrowing down the problem creates helps reinforce a sense of increasing control over a seemingly out of control situation for the non-technical people involved.  Again, you are looking for balance.  “I successfully typed my password” does not inspire that much confidence.  Thus look for real progress facts that can be shared that focus on narrowing the problem scope rather than just facts for the sake of facts.  Lastly, I chose the word “facts” specifically.  Make sure you communicate facts and not speculation at this early problem engagement level.  I’ll cover some suggestions on how to share speculation in another article.

  3. Indicating when the next status communication will occur

Giving people an indication of when they can anticipate an update on what is going on or what you are doing provides two significant benefits.  The first is it allows everyone participating in the outage who is not directly involved in restoring service the ability to relax just a bit and prepare for when they need to be engaged next.  They know there is nothing tactically they can really do to solve the immediate problem.  They know they are effectively 100% dependent on technical resources to do the real work of finding the problem and fixing it.  They desperately want to hear: “the problem is X and I’ve fixed it.”  But since neither you nor anyone else is at that point in the troubleshooting process, a time in the not too distant future where such a phrase might be uttered is the next best thing.

The second is it gives you much needed breathing room.  Instead of hearing “Is it fixed yet? How about now?  Now?  Maybe now?” every couple of minutes, you’ve clearly set the expectation that you need some uninterrupted time to do some digging in order to provide anything valuable as far as investigative analysis.  Thus, you now have some time to completely disengage from the noise associated with the problem and roll up your sleeves and immerse yourself in performance and log data to try and figure out what is going wrong with the technology.

Communicating Status – Approach in Summary

  1. Use your name and thus communicate in a more personal tone to increase confidence in non-technical participants … avoiding the opposite completely impersonal tone of “tech resource number 12”
  2. Provide positive news to further increase confidence and reduce the panic building in others with facts (not opinions), even if those facts are small troubleshooting milestones and not grandiose “ah ha!” findings.  Make sure to strike a balance by leaving out facts that are too small, of the “I pressed enter and …” type.
  3. Indicate you need time to dig deeper and set expectations for when others can expect the next status update from you, to buy uninterrupted investigation time and allow others to put off panicking for a period of time.

We’ve extended the need for responsiveness to reports of production support problems to include an initial take on the art of creating an effective status communication approach.  Look for additional articles to identify more technical support pitfalls and steps to take to avoid them.
