Production bugs and on-call support



  • Are there any metrics for deciding whether a "software defect" that occurs in a Production environment is really a bug/defect, and therefore worth engaging 2nd-level or "on-call" support?

    In particular, I'm searching for literature about this.

    Thanks in advance.



  • It depends.

    I know this sounds flippant, but there really is no standard for when a customer-reported issue should be escalated as a bug.

    Some of the things it depends on are:

    • How well trained the first-line support people are. When first-line support staff know the application, can guide a customer/user to provide reproduction steps, and can track down most of the misconfiguration or misunderstanding issues themselves, almost everything they escalate is an actual software issue.
    • Whether the problem needs a programming fix or not. Sometimes, in the case of corrupted data, it's necessary to create a small utility application to clean things up. Here it generally doesn't matter whether the problem was caused by an actual bug or by a customer incorrectly terminating a long-running process: they just want their data corrected, and a wise organization will do this for them (usually with a strong warning about terminating processes in spite of the big flashing warning signs telling them not to).
    • Whether the customer support engineers can fix the problem without involving programming. Depending on your organization, support engineers can do things like create database scripts to correct a problem.
    • Whether the problem description matches something the test team has already reported but that was not fixed for some reason. It's quite common for issues reported by the test team to be set aside for later if the perceived impact or risk is sufficiently low. Sometimes this assessment is incorrect because nobody knew there were customers using that peculiar combination of configuration flags. (My personal record in the "nobody would ever do that" stakes is two days between releasing something with a problem "nobody would ever do" and a customer doing exactly that.)
    • The impact of the problem on the customer. If the problem impacts financial data or data required for regulatory purposes, the tendency will be to escalate immediately due to the high risk/impact involved.
    • Which customer is reporting the problem. Support staff know who the biggest customers are and will be more likely to escalate problems reported by large customers, customers with a reputation for phoning the CEO to complain about the software, and so forth.
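
    Since there is no standard, any rule of thumb has to be local to your organization. Purely as an illustration, the factors above could be written down as a small triage function. Everything in this sketch is an assumption: the field names, the three outcome labels, and especially the order in which the checks are applied are hypothetical and would need to be adapted to your own support process.

```python
from dataclasses import dataclass


@dataclass
class Report:
    # Hypothetical fields mirroring the factors listed above.
    first_line_ruled_out_user_error: bool       # trained first-line support found no misconfiguration
    support_can_fix_without_code: bool          # e.g. a database script is enough
    matches_known_deferred_issue: bool          # test team already reported it, fix was postponed
    affects_financial_or_regulatory_data: bool  # high risk/impact
    high_profile_customer: bool                 # large or very vocal customer


def triage(report: Report) -> str:
    """Return 'escalate_now', 'escalate', or 'handle_in_support'.

    The ordering of the checks is an assumption, not a standard.
    """
    # High-impact data problems are escalated immediately.
    if report.affects_financial_or_regulatory_data:
        return "escalate_now"
    # If support engineers can correct it themselves, no escalation is needed.
    if report.support_can_fix_without_code:
        return "handle_in_support"
    # A known, deferred issue or a ruled-out user error points to a real defect.
    if report.matches_known_deferred_issue or report.first_line_ruled_out_user_error:
        return "escalate"
    # Important customers tend to be escalated even when the cause is unclear.
    if report.high_profile_customer:
        return "escalate"
    return "handle_in_support"


if __name__ == "__main__":
    example = Report(
        first_line_ruled_out_user_error=True,
        support_can_fix_without_code=False,
        matches_known_deferred_issue=False,
        affects_financial_or_regulatory_data=False,
        high_profile_customer=False,
    )
    print(triage(example))
```

    A function like this is mainly useful as a way of making the team's unwritten escalation habits explicit so they can be discussed; it does not replace the judgment of the support staff.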
