Wednesday, March 15, 2017

What is your problem? - Part 3: Descriptions

Months ago, I wrote about problem reporting within teams, making a general distinction between good problem reporting that leads to a solution versus insufficient reporting that causes the involved parties to lose precious time during a critical situation.

Now it is time to look at this from the perspective of software development, which may turn off audiences interested in the general topic, at which point I commit the blunder of assuming anyone besides my immediate colleagues reads these.

Technical notes, your problem is not someone else’s problem…

There are always those moments in software development where the overall quality assurance process fails our customers and our standards, at which point we must publish a technical note about the problem. For the project on which I based this posting, the template of a technical note required six fields:
  1. problem description. A general view of the problem. This is a very difficult field for most developers who have not been exposed to the problem reporting techniques covered in this series, because general is easily confused with imprecise. This field is therefore the focus of this posting.
  2. symptom. List of externally observable behaviors and facts about the system upon occurrence of the problem.
  3. cause. List of internal and external triggers for the problem, with special emphasis on those that can be triggered (and hopefully fixed) by the customer versus those that are internal to the product and require a product fix.
  4. affected environments. Complete list of prerequisite software and hardware where the problem can be observed, including versions and releases.
  5. problem diagnosis. Symptoms and causes give a good indication as to whether the problem matches what a customer is seeing. However, the customer needs certainty before moving on to the next field.
  6. problem resolution. The ultimate reason for a customer ever reading through a technical note: how can the problem be either fixed or worked around? A common problem in our internal reviews was that original drafts committed the mortal sin of limiting themselves to listing the upcoming release where the problem would be fixed. The customer always expects an interim solution to the problem, even if imperfect.
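
For readers who think better in code, the six fields can be sketched as a small data structure. This is only an illustration under my own assumptions, not the template we actually used: the class name, the field names, and the split of the resolution field into `fix_release` and `workaround` are all hypothetical, chosen to make the "release number only" mistake easy to spot.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class TechnicalNote:
    """Hypothetical model of the six-field technical note template."""
    problem_description: str = ""
    symptoms: List[str] = field(default_factory=list)
    causes: List[str] = field(default_factory=list)
    affected_environments: List[str] = field(default_factory=list)
    problem_diagnosis: str = ""
    # Assumption: "problem resolution" is split in two so that a draft
    # naming only the fixing release can be detected.
    fix_release: str = ""
    workaround: str = ""

    def review_findings(self) -> List[str]:
        """Return a list of omissions a reviewer should question."""
        resolution_parts = ("fix_release", "workaround")
        findings = [
            f"'{f.name}' is empty"
            for f in fields(self)
            if f.name not in resolution_parts and not getattr(self, f.name)
        ]
        if not self.workaround:
            findings.append("problem resolution names no interim workaround")
        return findings
```

A draft that fills every field but only lists the upcoming release would come back with a single finding about the missing workaround. The check only catches empty fields; whether a filled-in description is precise enough still takes a human reviewer.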

…so how do I know what is your problem?

To paraphrase one particularly troubled internal draft, we had the symptom, cause and description all rolled into a problem description field like this:
“search for records may be incomplete due to a [private] database being corrupted upon execution of a [series of commands]”.

At that point, we applied the criteria outlined in the previous posting to determine whether the problem reporting to the customer would lead to a decision or to confusion:
  • What is the expected behavior from the product?
The description can be somewhat ‘reversed’ and allow one to infer that search for records should not be incomplete. However, this inference indicates what the product should not do instead of what it should do. For the technical types, this kind of wording tends to make the author look sloppy at best, disingenuous at worst.
  • What is the observed behavior in the product?
The description alludes to incomplete results, but results can be incomplete in so many ways, such as not containing all the records that would match the search criteria, or containing all records while missing some fields in each record.
  • Does the reported problem happen to all units of the product?
  • Does the reported problem affect the entire product or just portions of it? If only portions, describe them.
The ‘product’ here is the operation executed by the user. Is it all searches that are affected or only certain searches?
  • Does the reported problem happen in all locations where the product is used? (This forces the problem owner to have actually sampled the problem in all locations where the product is used.)
Locations can be read as systems. If the product can run on multiple operating systems and depend on various versions of middleware, is there a complete list of systems where the problem occurs? Is it all of them?
  • Does the reported problem happen in combination with other problems?
This particular point would not apply to the original problem description as the problem happened independently of other problems, as a function of search parameters and system operations preceding the searches.
  • When did the problem start? If you don’t know, make it clear you don’t know and state when you first observed it.
When reporting the potential problem to a customer, the starting date would translate to the release number where the problem was first observed.
  • What is the frequency? Continuous, cyclic, or random?
The problem description was reasonably clear about the problem being continuous. At least in my opinion, continuous can be assumed whenever considerations about cyclic or random occurrences are not explicit. In other words, I would consider it poor form for a cyclic or random frequency to be omitted.
  • Is the problem specific to a phase in the product life-cycle?
The problem description was reasonably clear about the sequence of operations that would lead to the problem, indicating that the problem affects the system runtime phase rather than planning, installation, configuration, or any other.
  • Is the symptom stable or worsening?
The problem description did not mention increasing degradation of results, but it is worth asking that question during a review process prior to publication of the technical note.
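
The review questions above can also be kept as a simple checklist that flags what a draft leaves unanswered. The sketch below is a hypothetical helper, not a tool we had: the dictionary keys and the function name are my own, and the sample answers are only a rough reading of the draft discussed above.

```python
# Review questions from this posting, keyed by hypothetical short names.
REVIEW_QUESTIONS = {
    "expected_behavior": "What is the expected behavior from the product?",
    "observed_behavior": "What is the observed behavior in the product?",
    "all_units": "Does the reported problem happen to all units of the product?",
    "affected_portions": "Does the problem affect the entire product or just portions of it?",
    "locations": "Does the problem happen in all locations where the product is used?",
    "other_problems": "Does the problem happen in combination with other problems?",
    "start": "When did the problem start?",
    "frequency": "What is the frequency? Continuous, cyclic, or random?",
    "lifecycle_phase": "Is the problem specific to a phase in the product life-cycle?",
    "trend": "Is the symptom stable or worsening?",
}


def unanswered(answers):
    """Return the questions that have no (or only a blank) answer."""
    return [
        question
        for key, question in REVIEW_QUESTIONS.items()
        if not answers.get(key, "").strip()
    ]


# Rough reading of the troubled draft: only three questions were covered.
draft_answers = {
    "other_problems": "independent of other problems",
    "frequency": "continuous",
    "lifecycle_phase": "system runtime",
}

for question in unanswered(draft_answers):
    print("Still open:", question)
```

The helper only detects omissions. An answer such as "results may be incomplete" would pass the check while still being too imprecise, so the reviewer's judgment remains the real test.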

From problems to satisfied customers

This is an area to be approached with energy and patience while coaching people who are new to any field in the industry. Describing problems as a function of language and critical thinking is not an exact science and requires prolonged periods of practice and feedback to be mastered.

When someone without the proper training in problem description encounters someone on the other side who will go out of their way to understand the problem, it is easy to mistake the positive interaction rooted in an act of kindness for the most efficient way of going about it. And whereas acts of kindness are still a core value in the workplace, on any given day we would rather have that kind person interacting with more people than spending all that effort on a single person working without proper training.

Once you have put the right effort behind training people in this topic (or training yourself), you will have started a transformation in people that transcends the topic and the workplace: people used to asking all the right questions and solving the right problems under any circumstance.