Data Science as a profit booster for SMBs (3/4): Data Science as a service

In this part of the series I would like to explain what “data science as a service” means. What are specific examples of data science solutions? What should you know when you decide to give it a try? What will typically be required of you and your internal team?

This post is aimed especially at applications of data science in an e-shop environment.

The benefits of tailor-made solutions

The great thing about purchasing “data science as a service” is that the service can and should be tailor-made, designed to suit the exact needs of your business. This is important for two main reasons. Firstly, it gives you an edge over competitors who do not have it, and secondly, you only pay for the features that you will use.

I bet that if you search your memory, you will recall a piece of software that you or your company bought for quite a lot of money and then used only a small part of its functionality (the most prominent example might be good old MS Office). With tailor-made solutions, you do not pay extra for functionality you will not use, and the functionality you do pay for is developed especially for you.

Most retail companies would tell you that they want to offer services that correspond to the individual needs of their clients. Yet the same companies are willing to buy “one-size-fits-all” digital solutions. In my eyes, this does not make sense. Why would you deny your own company something that you provide for your customers?

When you want to give it a try…

To finally get to the point, some examples of data science solutions are:

  • Recommendation engines – models that propose suitable shopping baskets, typically shown to e-shop customers in order to promote further sales (e.g. as on Amazon).
  • Propensity models – models that select the customers most likely to buy a specific product, ideal for designing targeted campaigns.
  • Dynamic pricing models – models that set product prices to maximize profit, product turnover, etc.
  • Customer lifetime value models – models that predict how valuable specific customers will be for the company, useful for setting individualized service models or product types.

To illustrate the process behind building a business-relevant data science solution, let’s take a look at a specific case – a so-called client churn model.

1. Defining the underlying business challenge

Let’s imagine a business for which it is profitable to establish long-term relationships with its clients. This could typically be an e-shop or a retail bank. In our case, we will assume it is an e-shop. It might happen that the client base of such a business starts to shrink, or does not grow as fast as the management wishes. One way to approach this challenge is to try to prevent current customers from leaving. Retention activities are typically more successful when they are proactive (i.e. carried out before the client leaves) rather than reactive. The corresponding data science solution is therefore to build a model that identifies clients who are likely to leave – a churn model – so that such clients can be targeted, e.g. with an attractive marketing offer.

2. Translating the task into the data (and back)

We already know that we want to identify clients who are likely to leave. But what does “to leave” actually mean? How can “the act of leaving” be represented in the data? It is important to bear in mind that models can only be built if we are able to represent reality with the available data.

In the case of e-shops, we often represent leaving by a period of inactivity of a certain length. But how long should this period be? There are of course statistical methods to answer this question, but quite typically there is an internal, business-related rule. Let’s assume that a client is considered inactive when he or she has not made a purchase for 4 months.

The task (as translated into data) therefore looks as follows:

“Based on the available data, predict which customers are likely not to make a purchase in the next 4 months.”
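To make the translation concrete, here is a minimal Python sketch of how the “no purchase in the next 4 months” definition could be turned into a training label. The function names, the input shape (pairs of customer ID and purchase date), the sample customers and the idea of a fixed cutoff date are illustrative assumptions, not something prescribed by a particular tool.

```python
import calendar
from datetime import date

INACTIVITY_MONTHS = 4  # the business rule assumed in the text


def add_months(d, months):
    """Return the date `months` calendar months after `d`, clamping the day to the month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def label_churn(purchases, cutoff):
    """Label customers known at `cutoff` as churned (True) or retained (False).

    purchases: iterable of (customer_id, purchase_date) pairs (hypothetical format).
    A customer is churned if they bought on or before the cutoff but made
    no purchase in the INACTIVITY_MONTHS that follow it.
    """
    window_end = add_months(cutoff, INACTIVITY_MONTHS)
    known, active_after = set(), set()
    for customer_id, purchase_date in purchases:
        if purchase_date <= cutoff:
            known.add(customer_id)
        elif purchase_date <= window_end:
            active_after.add(customer_id)
    return {customer_id: customer_id not in active_after for customer_id in known}


# Made-up sample data for illustration only.
purchases = [
    ("anna", date(2023, 1, 10)), ("anna", date(2023, 3, 2)),
    ("boris", date(2022, 12, 20)), ("boris", date(2023, 2, 15)),
    ("clara", date(2023, 2, 1)), ("clara", date(2023, 8, 1)),
]
labels = label_churn(purchases, cutoff=date(2023, 2, 28))
for customer_id in sorted(labels):
    print(customer_id, "churned" if labels[customer_id] else "retained")
```

In this sketch a customer who returns only after the 4-month window still counts as churned, which mirrors the inactivity rule above; a different business rule would simply change `INACTIVITY_MONTHS`.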

3. Before we can build the model…

There are some key prerequisites for building a model that can be implemented into the company’s processes, meaning that it actually brings tangible business value.

These prerequisites fall into four spheres:

Internal experts

You will need an internal domain expert and a data administrator to cooperate with the data scientist.

People – As in many other areas of business, joining the forces of various subject experts is crucial for the successful construction and deployment of a model. Firstly, a domain expert such as a product manager – or, in the case of the churn model, a marketing expert – is needed to provide the business logic, rules and best practices that make sure the solution will be relevant. Secondly, there is the data expert, who knows the internal database and is able to assist quickly with data preparation. And thirdly, there is the data scientist, who translates the business need into the data and back.


Data – What data are available? Which of them are relevant? In-house data experts in cooperation with data scientists are able to answer these questions. The discussion usually takes place over a data sample from the company’s internal database. In the case of our churn model, transactional data (who bought what and when) and possibly also socio-demographic data are useful.
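As an illustration of how raw transactional data might be turned into model inputs, the sketch below computes three simple per-customer features as of a cutoff date: days since the last purchase, number of purchases and total amount spent. This recency/frequency/monetary summary is one common choice rather than the only option, and the record layout (customer ID, date, amount), the function name and the sample rows are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def build_features(transactions, cutoff):
    """Summarize each customer's purchase history up to and including `cutoff`.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (a hypothetical layout; a real database will look different).
    """
    per_customer = defaultdict(list)
    for customer_id, purchase_date, amount in transactions:
        # Ignore anything after the cutoff so the features never "see the future".
        if purchase_date <= cutoff:
            per_customer[customer_id].append((purchase_date, amount))

    features = {}
    for customer_id, rows in per_customer.items():
        last_purchase = max(purchase_date for purchase_date, _ in rows)
        features[customer_id] = {
            "recency_days": (cutoff - last_purchase).days,
            "frequency": len(rows),
            "monetary": sum(amount for _, amount in rows),
        }
    return features


# Made-up sample data for illustration only.
transactions = [
    ("anna", date(2023, 1, 10), 40.0),
    ("anna", date(2023, 2, 18), 25.0),
    ("boris", date(2022, 12, 20), 120.0),
    ("anna", date(2023, 3, 2), 10.0),  # after the cutoff, so it is ignored
]
features = build_features(transactions, cutoff=date(2023, 2, 28))
for customer_id, row in features.items():
    print(customer_id, row)
```

Socio-demographic attributes, where available, could be joined to the same per-customer rows before modelling.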

Modelling tools – Which software to use for modelling? Various software programs are available for general modelling tasks. If the company already owns some licensed software, using it is the obvious choice. If purchasing licences is out of the question, open-source tools can be used (e.g. R or Python).

Infrastructure – How to transfer the data from the database to the modelling software? How to make the model results actionable? There is a broad range of technical solutions for both: various data warehouse and data integration solutions (my favorite one being Keboola) as well as visualization and BI tools. The most appropriate one is chosen based on the infrastructure already implemented in the company. For models that identify clients to be approached by a marketing campaign, such as the churn model, it is ideal when the model can be implemented directly in the marketing tools or their source database, i.e. the marketing representative has access to the list of target clients and can launch the campaign “at the click of a button.”
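What “actionable” can look like in its simplest form is sketched below: model scores are filtered by a threshold and written out as a target list that a marketing tool could import. The CSV format, the 0.5 threshold, the column names and the sample scores are all assumptions for illustration; in practice the hand-over depends on the tools the company already uses.

```python
import csv
import io


def select_targets(churn_scores, threshold=0.5):
    """Return (customer_id, score) pairs at or above `threshold`, highest risk first."""
    targets = [(c, s) for c, s in churn_scores.items() if s >= threshold]
    return sorted(targets, key=lambda pair: pair[1], reverse=True)


def write_target_list(targets, file_obj):
    """Write the campaign target list as CSV to an open text file object."""
    writer = csv.writer(file_obj, lineterminator="\n")
    writer.writerow(["customer_id", "churn_score"])
    for customer_id, score in targets:
        writer.writerow([customer_id, f"{score:.2f}"])


# Made-up churn scores (probability of leaving) for illustration only.
churn_scores = {"anna": 0.12, "boris": 0.81, "clara": 0.67}
targets = select_targets(churn_scores)

buffer = io.StringIO()  # stands in for a real file or database export
write_target_list(targets, buffer)
print(buffer.getvalue())
```

Writing to a file object keeps the sketch independent of where the list ends up, whether that is a shared file, a database table or the marketing tool’s import folder.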


Phases of a Data Science project + How to deal with the developed solution once you have it in-house?


