In a recent meeting with Big Data managers in the broadcasting industry, one complaint came up over and over again: data scientists have no feel for the business, and business people have even less feel for statistics and Big Data. The central question was how to bridge that gap.
Correlation doesn’t mean causation
The lack of business understanding is especially visible when it comes to building predictive models. Too often, data scientists look only for correlations without checking that those correlations make sense (at this point you may want to have a look at this website, which lists meaningless correlations).
Why bother, you might ask? As a colleague told me, the principle behind Big Data is precisely to search for correlations that the business may not have thought of. From a purely technical viewpoint this may indeed be true: if the correlations don’t hold, you’ll find out when you apply the predictive model to newer data.
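To make the point concrete, here is a minimal sketch of how a meaningless correlation can look convincing until it meets newer data. Everything in it is an illustrative assumption of mine, not something from the meeting: the data is pure random noise, and the function names (`pearson`, `best_spurious_feature`) and the sample sizes are made up for the example. The idea is simply to screen many unrelated variables against a target, keep the one that correlates best, and then check it again on fresh observations.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_feature(n_features=500, n_train=30, n_new=30, seed=0):
    """Screen random features against a random target.

    Returns (index, correlation on the screening data, correlation on new data)
    for the feature that looked best during screening. No feature is truly
    related to the target: all values are independent Gaussian noise.
    """
    rng = random.Random(seed)
    target_train = [rng.gauss(0, 1) for _ in range(n_train)]
    target_new = [rng.gauss(0, 1) for _ in range(n_new)]

    best = None
    for index in range(n_features):
        feature_train = [rng.gauss(0, 1) for _ in range(n_train)]
        feature_new = [rng.gauss(0, 1) for _ in range(n_new)]
        r_train = pearson(feature_train, target_train)
        # Keep the feature with the strongest correlation on the screening data.
        if best is None or abs(r_train) > abs(best[1]):
            r_new = pearson(feature_new, target_new)
            best = (index, r_train, r_new)
    return best


if __name__ == "__main__":
    index, r_train, r_new = best_spurious_feature()
    print(f"feature {index}: r on screening data = {r_train:.2f}, "
          f"r on newer data = {r_new:.2f}")
```

With enough candidate variables and a small sample, the winning correlation on the screening data tends to look substantial even though nothing real is behind it, and it usually shrinks on the newer data. That is the technical safety net my colleague was referring to; it tells you that a correlation failed, not why.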
Big Data doesn’t explain causation
Unlike the statistical methods of 20 years ago, modern Big Data techniques don’t need a pre-existing model to fit the data. Models actually become the output of the computation, and there is no problem having one model per subject studied. This is the miracle made possible by advances in computing power: distributed computing makes it possible to calculate large numbers of correlations for each subject in the sample.
As a consequence, modern data scientists look for correlations instead of explanations. Twenty years ago, statistics were computed by scientists who built their models on sociological knowledge. We have lost this sociological heritage, and the consequences are, in my opinion, huge.
What are the consequences of not caring about causation?
Not caring about causation has fundamental consequences for the business. Data scientists are pushed to produce several predictive models a day for business people in search of segments to which they can push their latest commercial offers. This process is anchored in a very short-term perspective.
What leaders need to build is a strategy based on lasting trends and sound behavioral explanations, one that makes it possible to innovate and meet market needs. We need to bridge the growing gap between “data science” and business, and this can only be done by adding a touch of sociology or qualitative marketing.