Tim Mitchell

Do You Really Need Real-Time?

It wasn’t so long ago that the first day of the month was the most common trigger event for updating key metrics. Indicators such as profit, efficiency, bonuses owed, and other markers would be published monthly after that month’s data was tabulated (which may be days or even weeks into the new month). In some organizations, the work required to calculate these metrics took the entire following month to complete, making the process of preparing month-end data a never-ending cycle. In such cases, the freshness of the reporting data was limited to a monthly basis because of the amount of work involved to crunch the data.

Fortunately, automation and better data handling tools have eased much of that burden. These days, the process of making data available on a more frequent basis leans much more on technology and less on manual work. As a result, it is more common to find reporting data current as of yesterday than to find business decisions resting on days- or weeks-old data. Aided by automated business rule processing and data validation, the value of having the prior day’s reporting data available usually justifies the cost of producing it.

With the new de facto standard of having reporting data current as of a day ago, it makes sense to ask the question, “Can we get it faster?” After all, the pipeline to process data multiple times per day would have a lot of the same plumbing already built for the daily load process. However, loading data in less-than-daily intervals is rarely as simple as just increasing the frequency of the load process. When the loading of reporting or analytical data becomes an ongoing process rather than a nightly batch, issues such as OLTP data contention, accuracy in change detection logic, overlapping data loads, transactional consistency, and constant load monitoring require more attention than they do in a single daily batch load during off-peak times. As such, the ETL logic and supporting processes used for real-time or near-real-time reporting and analytics will usually look different from those used by a daily load.
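To make two of those concerns concrete, here is a minimal sketch of change detection and overlap protection in a frequent load. It is an illustration under stated assumptions, not a production pattern: the `IncrementalLoader` class, the `modified_at` column, and the in-memory `target` dictionary are all hypothetical stand-ins for a real source table and reporting store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
import threading


@dataclass
class IncrementalLoader:
    """Hypothetical loader: watermark change detection plus an overlap guard."""
    watermark: datetime                          # newest modified_at already loaded
    target: dict = field(default_factory=dict)   # stands in for the reporting table
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def run(self, source_rows):
        # Overlap guard: skip this cycle if the previous load is still running.
        if not self._lock.acquire(blocking=False):
            return None
        try:
            # Change detection: pick up only rows modified after the watermark.
            changed = [r for r in source_rows if r["modified_at"] > self.watermark]
            for row in changed:
                self.target[row["id"]] = row     # upsert by key
            if changed:
                # Advance the watermark only after the batch has been applied.
                self.watermark = max(r["modified_at"] for r in changed)
            return len(changed)
        finally:
            self._lock.release()


# Assumed sample data: one row older than the watermark, one newer.
t0 = datetime(2024, 1, 1, 9, 0)
source = [
    {"id": 1, "modified_at": t0 - timedelta(minutes=5), "qty": 10},
    {"id": 2, "modified_at": t0 + timedelta(minutes=1), "qty": 7},
]
loader = IncrementalLoader(watermark=t0)
first = loader.run(source)    # loads only the row newer than the watermark
second = loader.run(source)   # nothing has changed since the first run
```

Even this toy version hints at why frequent loads need more care than a nightly batch. A strict timestamp comparison can miss rows that share the watermark's timestamp or that commit late, and a skipped cycle has to be noticed by monitoring rather than silently ignored.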

Moving to real-time or near-real-time reporting and analytics can be worth the investment. If part of your analytics workflow uses data from earlier today to make decisions today, then it may be worth exploring the total cost (up-front development as well as ongoing maintenance and monitoring) versus the business value. Be sure to do the cost-benefit analysis to make sure that business value is there. Just because you can load reporting data more frequently doesn’t mean you should.

About the Author

Tim Mitchell
Tim Mitchell is a business intelligence and SSIS consultant who specializes in getting rid of data pain points. Need help with data warehousing, ETL, reporting, or SSIS training? Contact Tim here: TimMitchell.net/contact

2 Comments on "Do You Really Need Real-Time?"

  1. Moving closer to real time has always interested me. The only client I worked with that wanted a near-real-time stock position did not actually need it, as they were still on weekly stock ordering. Sometimes organisations push for something they perceive they need.
    If they had analytics hooked up to near real time and had a low lead time on stock then it would have been awesome.
    I guess people have become used to expecting everything now, so intra-hour updates rather than daily will just become the norm.

    • In some cases, real-time reporting and analytics are required for the business to run, and for those, it’s certainly worth the extra investment. But those are the exception rather than the rule, in my experience.
