You don’t need us to tell you that the data world – and everything it touches, which is, like, everything – is changing rapidly. These trends are driving the opportunities that will fuel your career adventure over these next few years. At the heart of these trends is a massive wave of data being generated and collected by organizations worldwide.
With this data, we can shift our focus as analysts from explaining the past to predicting the future. To do this, we need to spend less time doing the same things over and over and more time doing brand new things. And accomplishing all these changes will require us to work together differently than we do now.
1. Bigger, Larger and Faster Data
You’ve probably already heard the claim that every two years we, as humans, double the amount of data in the world. This literally exponential growth of data is impacting analysis in some big ways:
- Big data means new infrastructure: distributed computing like Hadoop.
- Large datasets mean new tools. Excel can no longer do the work it once did. We’ve seen analysts using Access to cut datasets down into Excel-digestible pieces.
- We are on the cusp of the real-time data revolution. Streaming platforms like Kafka will enable organizations to apply their data products in real time, which will revolutionize everything from operations to customer service. Top-notch analytics will be more urgent than ever!
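To put the "doubling every two years" claim from the top of this section into numbers, here is a minimal sketch of the arithmetic. The function name `growth_factor` and the two-year default are illustrative assumptions; the doubling period itself is the popular rule of thumb quoted above, not a measured figure.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Return how many times larger a quantity becomes after `years`,
    assuming it doubles once every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# How much more data would there be after 2, 4 and 10 years?
for years in (2, 4, 10):
    print(f"after {years} years: {growth_factor(years):.0f}x the data")
# after 2 years: 2x the data
# after 4 years: 4x the data
# after 10 years: 32x the data
```

If the rule of thumb holds, a dataset that fits comfortably in a spreadsheet today would be 32 times larger in a decade, which is why the tooling has to change and not just the hardware.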
2. Predictive Analytics
The vast majority of time spent by the vast majority of today’s analysts is on understanding data collected in the past, often in the form of reports and dashboards. Those days are coming to an end. The data and tools now available allow analysts to go beyond convincing someone else to act and, often, to take the action themselves. For example:
- Using customer data to identify which customers are most likely to churn (stop being subscribers/customers), and automatically offering them special deals in order to keep them.
- Using Internet of Things (IoT) data to identify which machines in a factory are most likely to break down, and fix them before they cause a disruption to production. This is called “predictive maintenance”, and not only does it reduce downtime, it can also substantially lower insurance rates.
- Using customer behavior data to narrow down potential fraud cases for insurance companies. As the predictive model gathers more data, it becomes even better at figuring out which cases the company should focus its investigative resources on.
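The churn example in the list above can be sketched in a few lines of Python. Everything here is hypothetical: the `Customer` fields, the weights, and the 0.5 threshold are made up for illustration, and in a real project the weights would be learned from historical data instead of typed in by hand. The point is only the shape of the workflow: score each customer, then act automatically on the riskiest ones.

```python
import math
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    days_since_last_login: int
    support_tickets_last_90d: int
    months_subscribed: int


# Hypothetical weights, for illustration only. A real model would
# learn these from past churners and non-churners.
INTERCEPT = -2.0
W_DAYS_INACTIVE = 0.08
W_TICKETS = 0.5
W_TENURE = -0.05


def churn_risk(c: Customer) -> float:
    """Logistic score between 0 and 1; higher means more likely to churn."""
    z = (INTERCEPT
         + W_DAYS_INACTIVE * c.days_since_last_login
         + W_TICKETS * c.support_tickets_last_90d
         + W_TENURE * c.months_subscribed)
    return 1.0 / (1.0 + math.exp(-z))


def customers_to_offer(customers, threshold=0.5):
    """Return IDs of customers at or above the risk threshold, riskiest first."""
    scored = [(churn_risk(c), c.customer_id) for c in customers]
    return [cid for risk, cid in sorted(scored, reverse=True) if risk >= threshold]


customers = [
    Customer("a-001", days_since_last_login=2, support_tickets_last_90d=0, months_subscribed=36),
    Customer("b-002", days_since_last_login=45, support_tickets_last_90d=3, months_subscribed=4),
    Customer("c-003", days_since_last_login=20, support_tickets_last_90d=1, months_subscribed=12),
]

print(customers_to_offer(customers))  # ['b-002']
```

The list returned by `customers_to_offer` is what would feed the automated step, for example a job that sends the special deal. Lowering the threshold widens the net at the cost of offering discounts to customers who might have stayed anyway.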
3. Automation of Tasks
Once upon a time, analysts built a model in Excel, and once a month or so, they exported the model to PowerPoint and sent it to (or even printed it out for) the managers who relied on regular reports. Soon, there were too many reports, so maybe they used macros in Excel to automate the creation of reports. Or maybe they were lucky enough to have a dashboard program that had some automation features built in. The future promises even more than this:
- Replicable data preparation flows/recipes that can be applied and customized easily and quickly to brand new sources of data and for brand new applications.
- Models scheduled to re-run regularly and produce a set of metrics that will determine whether or not they are performing as needed.
- Meta-reports: regular reports on the state of the many models deployed in production, so that analysts can feel comfortable and in control.
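The last two bullets above, metrics that decide whether a model is performing as needed and a meta-report across models, might look something like the following sketch. The names `ModelRun`, `accuracy`, and `meta_report`, the choice of accuracy as the metric, and the thresholds are all assumptions made for illustration. The scheduling itself (cron, an orchestration tool, and so on) is left out.

```python
from dataclasses import dataclass


@dataclass
class ModelRun:
    model_name: str
    accuracy: float      # metric measured on the latest scheduled run
    min_accuracy: float  # below this, the model needs a human to look at it


def accuracy(predictions, actuals):
    """Share of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals) or not actuals:
        raise ValueError("predictions and actuals must be non-empty and the same length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def meta_report(runs):
    """Build one status line per deployed model."""
    lines = []
    for run in runs:
        status = "OK" if run.accuracy >= run.min_accuracy else "NEEDS ATTENTION"
        lines.append(
            f"{run.model_name}: accuracy={run.accuracy:.2f} "
            f"(min {run.min_accuracy:.2f}) -> {status}"
        )
    return lines


# Toy data standing in for the latest scheduled run of two models.
runs = [
    ModelRun("churn", accuracy([1, 0, 1, 1], [1, 0, 0, 1]), min_accuracy=0.70),
    ModelRun("fraud", accuracy([0, 0, 1, 0], [1, 0, 1, 1]), min_accuracy=0.80),
]

for line in meta_report(runs):
    print(line)
# churn: accuracy=0.75 (min 0.70) -> OK
# fraud: accuracy=0.50 (min 0.80) -> NEEDS ATTENTION
```

One short report like this, delivered on a schedule, lets an analyst keep an eye on many models in production without opening each one.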