Big Data Analytics Vs. The Gut Check

Editor’s note: Steven Hillion is co-founder of Alpine Data Labs, where he leads development of an enterprise platform for advanced analytics. Before joining Alpine, he managed teams of data scientists and engineers at companies such as Siebel and Greenplum.

Data is more varied and fast-moving than ever, and analyzing it effectively now requires highly sophisticated software and machinery. But where does big data analytics leave the good old-fashioned hunch? What if the data tells a business manager to “jump” but her intuition says “stay”?

It might sound surprising coming from me – I’m a math and technology guy after all – but I strongly believe that intuition steeped in both data and business savvy must steer analytics in order to generate real value.

There’s an attitude that says you just have to apply enough math and machine power to a dataset to achieve the best models. But it’s foolish to assume that number-crunching alone can provide the answers a business needs to get ahead. In data science, intuition and analytics work in tandem, each informing the other.

First, intuition guides analytics. Analytics insights rarely appear out of thin air. They’re the result of the application of numerical methods to test hypotheses and ideas that arise from intuition and observation. And intuition also guides the methods that the researcher uses to test these hypotheses. Which data is relevant? Which variables and transformations make sense? What are the likely relationships between cause and effect? Which models are appropriate?

Second, analytics informs intuition. Unsupervised modeling techniques can discern relationships and patterns in the data that wouldn’t be obvious from a superficial view or a human-sized sample of the data. In short, analytics can suggest avenues for exploration that wouldn’t be picked up by observation and might even be counter-intuitive.
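To make that concrete, here is a minimal sketch of one common unsupervised technique, k-means clustering, written in plain Python. It is an illustration under stated assumptions, not the method used in any project described here: the customer records, the two features, and the starting centers are all made up, and real work would use far more data and a tested library.

```python
# Illustrative only: toy customer records, not real bank data.
# Each record is (average loan size, years as a customer), scaled to 0..1.

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, centers, max_iterations=100):
    """Plain k-means (Lloyd's algorithm) from caller-supplied starting centers."""
    labels = []
    for _ in range(max_iterations):
        # Assign every point to its closest center.
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(mean_point(members) if members else old)
        if new_centers == centers:
            break  # centers stopped moving, so the grouping is stable
        centers = new_centers
    return labels, centers

customers = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),  # small loans, short tenure
    (0.85, 0.90), (0.90, 0.80), (0.88, 0.95),  # large loans, long tenure
]
labels, centers = k_means(customers, centers=[customers[0], customers[3]])
```

Nobody tells the algorithm what the groups mean. It only reports that the records fall into two clumps, and it takes a person who knows the business to decide whether a clump is interesting.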

Without wise pilots on both the data and business sides directing the analytics process, and weighing the numbers against gut instinct grounded in professional experience and knowledge, problems are bound to occur.

Let me give a couple of examples that come to mind.

A consumer banking team wanted us to create a churn model to help the bank predict which customers were likely to cancel their accounts. The data generated a dim picture. It turned out that when it came to savings, loans and credit cards, there were no clear triggers to reveal when a customer was about to jump ship. Spending and paying patterns remained largely the same until after the decision had been made and new accounts created.

However, as the bankers examined the data more closely, reviewing a set of customer segments that the team had created, an analyst’s intuition led her to a valuable insight. She realized that a certain customer cohort, one showing unusually high-value loans, long-term customer value, and several other unusual factors, might consist predominantly of small business owners. A review of the individual accounts confirmed her suspicion.

She guessed that those business owners disguised as regular customers hadn’t realized there might be a better way of funding their business than by using a credit card or regular loan accounts. The project goal shifted to identifying these high-value customers and offering them more appropriate products. The banking team then went a step further and used the data to identify product recommendations for other customer cohorts based on historical user behavior. The data enabled them to start offering customers tailored products that would increase lifetime value.

It’s simply very unlikely that the data alone could have provided that key insight. This sort of business insight coupled with analytics is priceless. (Well, actually our analytics did compute the price and profit of the product recommendations, but you get my gist.)

Given the blood-and-bones importance of intuition in data analytics, it’s a wonder that the business side is so often left out of the process until the very end. Instead, business analysts should be invited into the process early on to collaborate. I’ve changed processes to bring the whole team into initial model reviews and, even earlier, into reviews of the raw data.

In another example, a client of ours, a large beverage company, wanted to predict future sales in Japan. We built a model that looked at how sales would react to different market and pricing pressures in the coming year. The client told us they thought sales were impacted directly by the economy. If Japan’s economy was slowly coming back, they thought, spending on soft drinks would increase.

They asked that we use the Nikkei as a kind of trend variable within our model. The index improved the accuracy of the model at first, or so it seemed. But over the year the model started making wild predictions. The economy had started to bounce back, but now the Nikkei was outside the range of the training data, and the original model was probably “overfit.”

A more experienced modeler would probably have resisted introducing the variable at all. There are times when intuition makes sense, but here, data science expertise suggested caution and an awareness of the limitations and pitfalls of the modeling process. In this case, we introduced a transformation to damp the effect of the stock market index, and the models went on to perform very well in guiding the development of a new marketing plan and predicting its effects.
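For readers who want to see the pitfall in miniature, here is a rough sketch in plain Python. Everything in it is an assumption made for illustration: the numbers are invented, the model is a one-variable straight line, and the damping shown (clamping the index to the range seen in training) is just one simple option, not necessarily the transformation we used.

```python
# Illustrative only: made-up numbers, not the client's data.

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def make_damper(train_xs):
    """Return a function that clamps a value to the training range."""
    lo, hi = min(train_xs), max(train_xs)
    def damp(x):
        return max(lo, min(hi, x))
    return damp

# Hypothetical training data: index level vs. sales (arbitrary units).
index_train = [8000, 8500, 9000, 9500, 10000]
sales_train = [100, 104, 109, 113, 118]

intercept, slope = fit_line(index_train, sales_train)
damp = make_damper(index_train)

def predict_raw(index_value):
    # Trusts the fitted line no matter how far the index has moved.
    return intercept + slope * index_value

def predict_damped(index_value):
    # Same line, but the index is held within the range seen in training.
    return intercept + slope * damp(index_value)

new_index = 16000  # far outside anything seen in training
raw = predict_raw(new_index)
damped = predict_damped(new_index)
```

Inside the training range the two predictions are identical. Once the index climbs past it, the raw model keeps extrapolating along the fitted line, while the damped version stays at the highest level it has evidence for.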

There’s often tension in the air between data scientists and the business – particularly when the data seems to contradict the gut, and the effect of the splashy new campaign seems to be negligible. Often we’re left sitting across the table with the marketer asking “Where did that number come from?” and the data scientist on the defensive.

But I believe this battle of brains is positive. Math and science should be able to stand up to questioning. Sometimes this results in data disproving intuition. Other times, those gut feelings based on deep experience can find flaws in the process. Ideally, everyone benefits.