At a recent Data Disruptors Meetup, Looker CEO Frank Bien delivered an opening presentation about how the data landscape has changed over the past decade.
Three eras of data
Bien outlined three eras of data. In the first, querying data was very slow and expensive, so it was done infrequently and mainly to provide headline financial metrics. While these numbers were highly accurate, they were slow to produce and offered little insight into what was driving performance.
In the second era, querying large data sets is relatively easy and cheap from a hardware perspective but, for lack of good business intelligence tools, can only be done by people who know SQL. Employees without these skills cannot dig into the data to answer their own questions, so they rely on data analysts or engineers to provide the custom metrics they need. They put in a request and, a few days later, get the number they asked for. But that number doesn’t tell the whole story and raises further questions, so they put in another request, which they have to wait even longer for, and they resort to other tactics to get the answers they need. Typically, this involves taking a set of numbers they do have and performing calculations or considered estimates to arrive at what they believe is the correct answer. Often, however, this combines numbers based on different definitions or business logic, and therefore leads to incorrect answers.
The third era, which Bien believes we haven’t fully reached yet, is one in which the platforms that sit on top of the data allow everyone to query the data and find the answers and insight that they are looking for.
At TotallyMoney, it feels like we have gone through exactly these three eras. Before I joined, there was The Daily Report, which took about four hours to run. It was effectively an Excel spreadsheet hooked up to a Microsoft Access database, fed by CSV files from the production database. It mostly provided headline numbers to the business, such as daily click and revenue figures.
The business then moved to Amazon’s data warehouse service, Redshift, which allowed data to be read much more quickly. On top of that, we built our own reporting tool, Viper, which lets our users select a report that runs a SQL query on Redshift and produces a CSV file. As time has gone on and the business has developed, many of the queries have come to report on the same set of data, differing only in the level of granularity or aggregation each user needs.
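To make that overlap concrete, here is a minimal Python sketch of several "reports" that are really the same data summed at different levels of granularity. It is not Viper's actual code: the rows, the field names, and the `aggregate` helper are all invented for illustration, and a real report would run as SQL against Redshift rather than in memory.

```python
from collections import defaultdict

# Hypothetical rows: dates, products, and values are invented for illustration.
ROWS = [
    {"date": "2017-01-01", "product": "credit_cards", "clicks": 120, "revenue": 300.0},
    {"date": "2017-01-01", "product": "loans", "clicks": 80, "revenue": 400.0},
    {"date": "2017-01-02", "product": "credit_cards", "clicks": 100, "revenue": 250.0},
    {"date": "2017-01-02", "product": "loans", "clicks": 60, "revenue": 330.0},
]


def aggregate(rows, group_by):
    """Sum clicks and revenue over the given grouping columns."""
    totals = defaultdict(lambda: {"clicks": 0, "revenue": 0.0})
    for row in rows:
        # The grouping key is a tuple of the selected column values;
        # an empty group_by yields the single key () and a grand total.
        key = tuple(row[col] for col in group_by)
        totals[key]["clicks"] += row["clicks"]
        totals[key]["revenue"] += row["revenue"]
    return dict(totals)


# Three "reports" over the same data, differing only in granularity.
daily = aggregate(ROWS, ["date"])
by_product = aggregate(ROWS, ["product"])
overall = aggregate(ROWS, [])
```

Each extra report here is just another `group_by` list, which is why a tool that lets users choose the grouping themselves can stand in for a growing catalogue of near-identical fixed queries.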
While this was a significant improvement on the previous version (queries take seconds to run instead of hours, and deliver more detail), it still suffers from the second-era problems described by Bien. Increasingly, the questions asked by the business can’t be answered by these reports and are instead being answered by data analysts or engineers.
Dawning of a new era
Over the past six months, we have taken several steps to move into Bien’s third era of data, the most significant of which is replacing our current reporting suite with Looker (covered in detail in this post). Having heard at the same Meetup how Looker has significantly improved business intelligence at King, a company with roughly 50 times more staff than TotallyMoney and a database 1,000 times bigger than ours, I am extremely confident and excited about what our roll-out of Looker over the coming weeks will bring.