Data science is far from a new field. Yet for a substantial share of businesses, mostly smaller ones, the term remains obscure and difficult to grasp. Many of them keep hearing about it but are not really sure whether or how they should use it. In my practice, I have come to the conclusion that it is precisely these small and medium-sized businesses (SMBs) for whom data science represents a great opportunity.
This is the first in a series of blog posts in which I would like to share my thoughts on the value data science adds and on how businesses can make use of it.
Question #1 – What the heck actually is Data Science
Many people consider the term “data science” just another buzzword, all the more so because understanding what it means ideally requires at least some technical awareness. It is not necessary to understand data science in order to make use of it, but unfortunately people tend not to like things they don’t understand.
When I need to explain what data science is, I use the very general statement that ‘data science is about extracting knowledge and insights from large amounts of data’. Sure, it can seem that this is nothing new. After all, analyzing data in Excel has been possible since 1985. But let’s not make the same mistake as many famous consultancy companies and believe that applying business intuition to Excel and PowerPoint visualizations is a data-driven approach that can give anyone an edge over the competition in the 21st century.
Data science is much more than that. It enables us not just to describe data and form an impression, but to reach verified conclusions based on data in various forms and locations, and to present business-relevant results seamlessly via lucid, easy-to-distribute, scalable data visualizations and automated products.
In my opinion, the real power of data science solutions lies in their ability to predict future outcomes (the famous shift “from descriptive to predictive”) and to deliver the results to the end user automatically.
…and how can it even be important for businesses?
To many people this might still sound like a technical definition that is too detached from day-to-day reality. And they are in fact right: what makes data science relevant for business is that it is performance-oriented. It has to have a clear relationship to profit or another KPI, otherwise it becomes only a scientific exercise with no value added whatsoever (and let’s be honest here, that is often the case).
A model is only useful when it looks for an answer to a specific underlying business question. ‘How do we target the right customers to increase our sales?’, ‘How do we stop our customers from leaving us?’ and ‘How do we become more relevant for our end customers?’ are the most typical examples of simple yet necessary questions to ask. A data scientist is not a person who explores data and occasionally finds some possibly interesting insights. Data scientists are more like doctors who treat a patient’s (company’s) illnesses given a clearly stated diagnosis.
When to seek a Data Scientist?
I will elaborate on the metaphor a little further: data scientists are like doctors, but they are not the ones who set broken bones or perform heart surgery. When the company’s heart is not beating properly or its limbs are not moving, data science is not the best tool to use.
My (simplified) view of business performance improvement is that there are basically three stages, depending on how far along the business is, and each stage calls for different means of solving business problems:
- Securing vital functions: to use a simple and somewhat extreme example, when a production company is not able to produce its product, no data science will help. There are specific, rather physical changes that need to be made first.
- Collecting and visualizing data: when the company is already producing, but with rather poor efficiency, simple data collection and BI visualization can offer enough information to achieve tangible improvements.
- Data science: even though the company now operates well, this may still not be enough to beat the competition or achieve the growth required by investors. This is when data science comes to the rescue.
A business benefits most from data science when it needs to be better than its competition and better than it was yesterday. It is no surprise that data science is applied most successfully in industries and markets with a high level of competition and pressure on margins.
It is in fact the level of competition that determines how accurate models need to be. This is important to bear in mind, as the accuracy of predictions is typically one of the factors that determine the cost of model development.
Sometimes it is necessary to point out that the business world is not a Kaggle competition. In many situations 80% accuracy for $10,000 is good enough and there is no need to go for 90% at a cost of $100,000.
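To make that trade-off a bit more concrete, here is a minimal Python sketch of the arithmetic. The two options use the figures above; everything else is an assumption for illustration only. In particular, `value_per_point` (what one extra percentage point of accuracy is worth to the business, in USD) is a hypothetical input that each company would have to estimate for itself, and the names `ModelOption`, `marginal_cost_per_point` and `worth_upgrading` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    accuracy_pct: int  # accuracy in whole percentage points, e.g. 80
    cost_usd: int      # development cost in USD


def marginal_cost_per_point(cheap: ModelOption, expensive: ModelOption) -> float:
    """Extra USD paid for each additional percentage point of accuracy."""
    extra_points = expensive.accuracy_pct - cheap.accuracy_pct
    if extra_points <= 0:
        raise ValueError("the expensive option must be more accurate")
    return (expensive.cost_usd - cheap.cost_usd) / extra_points


def worth_upgrading(cheap: ModelOption, expensive: ModelOption,
                    value_per_point: float) -> bool:
    """True if one extra point is worth more than it costs.

    value_per_point is a hypothetical business estimate, not a measured figure.
    """
    return value_per_point > marginal_cost_per_point(cheap, expensive)


# The two options mentioned in the text.
basic = ModelOption(accuracy_pct=80, cost_usd=10_000)
advanced = ModelOption(accuracy_pct=90, cost_usd=100_000)

# The extra 10 points cost $90,000 in total.
print(marginal_cost_per_point(basic, advanced))
```

Under these assumptions, the more expensive model only pays off if each additional point of accuracy is worth more to the business than its marginal cost, which is far more likely in a highly competitive market than in a comfortable one.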
In the next posts: Why is data science difficult to grasp? Why is it easier to implement for SMBs? What does an implementation of a data science solution typically look like?