The Numbers Don’t Lie… Except When They Do

There are few things more reassuring for a data professional than having clean, consistent data to back up critical business decisions.  The numbers don’t lie, or so they say.  But can the right data lead to wrong conclusions?  Sadly, yes, and I suspect that it happens more often than we’d like to admit.

Recently, as part of a large hospital project I’ve been working on, I’ve been addressing questions around the all-important census, the magic number of patients bedded in a given facility.  This facility converted from an outdated software package to a more modern, SQL Server-based product about a year ago, and one of the key goals with the new system was to “get the census right”.  See, the old system had at least a few dozen census reports, most of which disagreed with one another about the true count of in-house patients.  Because the software package was dated, the most common complaint was that “the system” was reporting incorrect information.  However, during my review of the archived reports, it quickly became clear that the reports from the old system were all correct.  The malfunction was not in the answers, but in the questions.

Speaking Different Languages
The root of these types of problems is a failure to consistently define the performance metrics of a business.  In our census example, one report might include only admitted patients, while another could include those still in triage in the ER.  One report shows the count of inpatients as of the previous midnight, while another provides the same information in real time.  In isolation, each of these metrics is correct, but when held side-by-side with the others, it appears that the output is wrong.  This is, in my experience, a trend on the rise as the movement toward self-service reporting continues to grow: more end users than ever before are querying their information systems directly, and many of them are basing critical decisions on loosely defined standards and definitions.
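
To make the point concrete, here is a minimal Python sketch of how one set of records can produce four different, equally correct census numbers.  The Stay record, its fields, and the sample data are all invented for illustration; they are not taken from the hospital's system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Stay:
    """Hypothetical patient stay record (illustrative only)."""
    patient_id: str
    status: str                    # "admitted" or "er_triage"
    arrived: datetime
    departed: Optional[datetime]   # None means the patient is still in house


def census(stays: List[Stay], as_of: datetime, include_triage: bool) -> int:
    """Count patients in house at `as_of` under one explicit definition."""
    count = 0
    for s in stays:
        in_house = s.arrived <= as_of and (s.departed is None or s.departed > as_of)
        counted = s.status == "admitted" or (include_triage and s.status == "er_triage")
        if in_house and counted:
            count += 1
    return count


MIDNIGHT = datetime(2024, 1, 2, 0, 0)   # "as of the previous midnight"
NOW = datetime(2024, 1, 2, 14, 30)      # "real time"

STAYS = [
    Stay("P1", "admitted", datetime(2024, 1, 1, 9, 0), None),
    Stay("P2", "admitted", datetime(2024, 1, 1, 20, 0), datetime(2024, 1, 2, 8, 0)),
    Stay("P3", "admitted", datetime(2024, 1, 2, 10, 0), None),
    Stay("P4", "admitted", datetime(2024, 1, 2, 11, 0), None),
    Stay("P5", "admitted", datetime(2024, 1, 2, 12, 0), None),
    Stay("P6", "er_triage", datetime(2024, 1, 2, 13, 0), None),
    Stay("P7", "er_triage", datetime(2024, 1, 1, 23, 0), datetime(2024, 1, 2, 2, 0)),
]

# Four "census reports" over identical data, differing only in the question asked.
REPORTS = {
    "midnight, admitted only": census(STAYS, MIDNIGHT, include_triage=False),
    "midnight, incl. ER triage": census(STAYS, MIDNIGHT, include_triage=True),
    "real time, admitted only": census(STAYS, NOW, include_triage=False),
    "real time, incl. ER triage": census(STAYS, NOW, include_triage=True),
}

for label, value in REPORTS.items():
    print(f"{label}: {value}")
```

With this sample data the four reports return 2, 3, 4, and 5.  None of them is wrong; each simply answers a different question.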

To avoid these assumptions and half-truths with data, I offer the following preferred practices:

  • Clearly define the metrics, entities, and standards that are critical to your business intelligence, and share them with all principals.  Often this involves answering what appear to be silly questions: “What is a day?”, “What is a patient?”, “What is a sale?”, “What is a billable hour?”, etc.  By clarifying these elemental questions, your downstream metrics will be improved because everyone understands what is and is not included in these definitions.  Once those terms are defined, be dogmatic in their use.
  • Involve all business units in those standards-setting conversations.  Regardless of your industry, you should include principals from each major facet of your business – sales, marketing, decision support, executives, customer service – not only to ensure a comprehensive understanding of the reporting needs but also to create a feeling of ownership in the process.  If they believe in it, they’ll support it.
  • Ask leading questions.  Don’t simply give everyone ad-hoc access to your raw data; use the technical tools available to enable managed self-service reporting (security controls, data warehousing, denormalizing views, or more encompassing tools such as PowerPivot) to limit the open-endedness of most user queries.  The data should be versatile enough to answer the important business questions without creating a free-for-all where the results could be made to show almost anything.
  • Validate your output.  There should be a formal, mandatory validation process for each new output created, whether it’s a simple report or an entire volume of data.  Having a validation process that crosses business units is highly desirable, as this lends itself to more accurate and versatile (reusable) reports.  Part of this review process should be to confirm that the data retrieved is not already provided by existing outputs.
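
One way the first and last practices might be supported in code is a small shared registry in which every metric is defined once, by name, and a new definition is rejected if it restates an existing one.  The sketch below is only an illustration under assumptions of my own: MetricDefinition, MetricRegistry, and the two attributes used to compare definitions are hypothetical, and a real validation process would involve people as well as code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical, explicit definition of one census metric."""
    name: str
    description: str
    includes_er_triage: bool
    as_of: str  # "previous_midnight" or "real_time"

    def rule(self):
        # The parts of the definition that decide what gets counted.
        return (self.includes_er_triage, self.as_of)


class MetricRegistry:
    """Single shared place where agreed metric definitions live."""

    def __init__(self) -> None:
        self._by_name: Dict[str, MetricDefinition] = {}

    def register(self, metric: MetricDefinition) -> None:
        # Validation step 1: a name may be defined only once.
        if metric.name in self._by_name:
            raise ValueError(f"metric name already defined: {metric.name}")
        # Validation step 2: reject output already provided under another name.
        for existing in self._by_name.values():
            if existing.rule() == metric.rule():
                raise ValueError(
                    f"{metric.name} duplicates existing metric {existing.name}"
                )
        self._by_name[metric.name] = metric

    def get(self, name: str) -> MetricDefinition:
        return self._by_name[name]


registry = MetricRegistry()
registry.register(MetricDefinition(
    name="midnight_inpatient_census",
    description="Admitted patients in house as of the previous midnight",
    includes_er_triage=False,
    as_of="previous_midnight",
))

try:
    # Same rule under a new name: the registry refuses it.
    registry.register(MetricDefinition(
        name="nightly_bed_count",
        description="Admitted patients at midnight",
        includes_er_triage=False,
        as_of="previous_midnight",
    ))
except ValueError as err:
    print(f"rejected: {err}")
```

Reports would then look up a definition by name rather than restating the rule, which is one way of being dogmatic about the agreed terms.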

With the growing trend toward self-service reporting, it’s more important than ever for everyone in a business to stay on the same page.  It’s still going to be a challenge, but following these few practices should make the process a little easier.

About the Author

Tim Mitchell
Tim Mitchell is a data architect and consultant who specializes in getting rid of data pain points. Need help with data warehousing, ETL, reporting, or training? If so, contact Tim for a no-obligation 30-minute chat.
