In the last few years, the obvious fact that successful marketing requires you to “contact the right customers with the right offer through the right channel at the right time” has become something of a mantra. While there is nothing to disagree with here, it is a pity that for the most part the saying stays in words and only rarely gets realized. The issue is that while many can repeat the mantra, only a few actually know what is needed to put it into practice. In this post, I am going to talk about the first part: how do you target the right customers for your marketing actions?
There are many approaches to solving this great puzzle. One extreme is a team of marketing experts who rely solely on their gut feeling, projecting their opinions onto customers without any proof and without even evaluating or testing the campaigns, because that’s what they did in their previous job. It might sound ridiculous in today’s digital era, but surprisingly it is often the case.
The other extreme is building complex AI engines and letting them make all the decisions. This is typically proposed by some geeky start-up run by fresh PhD holders. This approach is, in my opinion, also wrong. You have absolutely no assurance that the available data truly reflect reality, that the algorithm works flawlessly, or simply that the randomness in the world is not too strong to allow prediction. After all, even companies running algorithmic trading have human dealers overseeing their algorithms, who focus on addressing the algorithms’ weaknesses and generally on preventing internal disasters.
As always, I think that the solution lies somewhere in between. An experienced marketer whose opinion is backed by information extracted from the available data can truly hit the mark. Imagine that you have to run a campaign to increase sales of a savings account (or a road bike, a new robot, a holiday in the Caribbean…). The long-proven technique to consider is called propensity to buy (or to purchase, or to use).
What this propensity model does is simply study the features of customers and look for those that clearly differentiate buyers from non-buyers. In other words, it generalizes the traits of historical buyers. These findings are then applied to predict the future behavior of your customers. Each customer can be assigned a propensity-to-buy score, which represents the probability that the customer belongs to the group of buyers. In plain English: how similar is the customer to the historical buyers? You should, however, bear in mind that propensity to buy does not equal probability to buy. A customer might share all the traits of buyers, but the product might not be good enough for him, he might get a better offer somewhere else, the timing might just not be right, or he might have already bought the product.
For demonstration, let’s consider the following (extreme and simplified) example. Imagine we have three customers with high, medium and low propensity to buy, say, a high-end road bike. What can we say about them? The high-propensity customer has all the traits of a typical buyer. He’s an avid cyclist who has already bought some nice road bikes in the past. It’s inevitable that he’ll buy a new one; the only question is when and which one. Will it be the one you’re offering? The low-propensity customer is not even remotely similar to past buyers. She might be, for example, an 85-year-old lady. There is almost zero chance that she would respond to even a very intriguing offer for your bike. The medium-propensity customer is a person who shares traits with both past buyers and non-buyers, for example a 40-year-old man who uses his mountain bike for family trips. There is no reason why he should not buy the bike, but also no clear indication that he would.
This example should also make clear that you should resist the temptation to contact low-propensity customers just because you want to win them over and assume that the high-propensity customers will buy the product anyway. They will, sooner or later. But probably not from you. And addressing low-propensity customers is just a waste of money.
Once I have the propensity, how does it help me with the marketing, you ask? The main use is to increase the ROI of marketing campaigns. Firstly, the propensity model enables you to contact the people for whom your campaign has a higher chance of success, which means increased revenue from the campaign. To get a better idea of this relationship, look at the plot below.
We expect a higher response rate for customers with a higher propensity to buy, so the response rate line falls as the target group grows. The response rate is, however, lower than the propensity (not shown directly in the plot); in our simulation we assume it is 10 times lower. Because the response depends on the propensity to buy, each additional customer adds less revenue than the previous one, which gives the cumulative revenue line its concave shape. We assume that a positive response to the campaign brings 3 USD. If the campaign costs are flat for all customers (in our example 10 cents per contacted customer), we can easily find the sweet spot for the size of the target group and choose the target customers that bring the highest possible profit from the campaign.
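To make the sweet spot concrete, here is a minimal Python sketch of the same arithmetic. It uses the three numbers from the example (3 USD per response, 10 cents per contact, response rate ten times lower than propensity); the function name and the ten propensity scores are hypothetical and only there for illustration.

```python
from itertools import accumulate

REVENUE_PER_RESPONSE = 3.00  # USD per positive response (from the example)
COST_PER_CONTACT = 0.10      # USD per contacted customer (from the example)
RESPONSE_FACTOR = 0.1        # response rate assumed 10x lower than propensity


def best_target_size(propensities):
    """Return (size, expected profit) of the most profitable top-N group."""
    ranked = sorted(propensities, reverse=True)
    # Expected profit added by each customer, best prospects first.
    marginal = [REVENUE_PER_RESPONSE * RESPONSE_FACTOR * p - COST_PER_CONTACT
                for p in ranked]
    cumulative = list(accumulate(marginal))
    best_profit = max(cumulative, default=0.0)
    # Contacting nobody yields zero profit, so only accept a positive maximum.
    if best_profit <= 0:
        return 0, 0.0
    return cumulative.index(best_profit) + 1, best_profit


# Hypothetical scores for ten customers (illustrative values only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
size, profit = best_target_size(scores)
print(f"Contact the top {size} customers, expected profit {profit:.2f} USD")
```

With these numbers a contact pays off only while 3 × propensity / 10 exceeds 0.10, that is, for propensities above roughly one third, so the sketch stops after the customer scored 0.4.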
Secondly, propensity can be used to differentiate the offer, with consequences for the costs. In the example above, the costs were the same for each customer. But imagine that you want to give each customer only the minimal offer needed to win them. Such an offer will then logically be more generous for customers with medium propensity than for those with high propensity, leading to significant savings on the company’s side.
For a complete understanding, let’s now spend a while describing what the model looks like. Technically, it’s a binary classification task. There are many tools in a basic machine learning toolkit for solving such a task. As always, there is a trade-off between the accuracy of the model and its interpretability. A simple logistic regression might give reasonable results, and at the same time it offers a clear explanation of why the propensity is high or low for a given client. More sophisticated models, such as neural networks or tree-based models, do not offer this option, but their accuracy is typically better. The ultimate group of extremely complicated models, full of stacking and blending and typically used for really advanced stuff like image recognition or Kaggle competitions, can also be used for propensity modelling, but tracing the results of these models back to the actual characteristics of customers is almost impossible.
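As a rough illustration of the simplest option, the sketch below fits a logistic regression by plain gradient descent using only the Python standard library. Everything specific in it is an assumption made for the demo: the two features (road bikes bought in the past, age in decades), the eight historical customers, and the learning rate. In practice you would more likely reach for an established machine learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def propensity(weights, x):
    """Score one customer; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = propensity(weights, x) - y  # gradient of the log-loss
            grad[0] += error
            for j, v in enumerate(x):
                grad[j + 1] += error * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical history: [road bikes bought before, age in decades] -> bought?
rows = [[3, 3.5], [2, 4.0], [1, 3.0], [2, 2.8],
        [0, 8.5], [0, 6.0], [1, 7.0], [0, 4.0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_logistic(rows, labels)
high = propensity(weights, [3, 3.5])    # avid cyclist
medium = propensity(weights, [1, 4.0])  # occasional rider
low = propensity(weights, [0, 8.5])     # 85-year-old non-cyclist
print(f"high={high:.2f} medium={medium:.2f} low={low:.2f}")
```

The fitted weights are where the interpretability comes from: a positive weight on past purchases and a negative weight on age tell you directly which traits push a score up or down.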
In order to build the propensity model, you obviously need some data about historical purchases and some features describing the customers at some point in time before the purchase. After all, you want to know who the people who will buy your product are, and that can’t just be pulled out of thin air. These features can be anything from basic sociodemographics, product ownership and transactional history to data from social networks, free text, or geographical and online behavior.
Building such a propensity model does not take long. It’s typically ready in a few days. The best approach then is to ask the internal marketing expert to design a pilot campaign that uses the propensity scores and to run it next to the standard one. Then you can simply carry out an A/B test to see whether the new approach is better and by how much. There shouldn’t be any complicated discussions about which campaign, what data, how the current campaign works and so on. As I said, we are talking about a few days of work, not a substantial investment.
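One common way to read the result of such an A/B test is a two-proportion z-test on the response rates of the two campaigns. The sketch below is only one possible evaluation, not necessarily the one you would pick; the function name and the response counts are hypothetical.

```python
import math


def two_proportion_z_test(resp_a, n_a, resp_b, n_b):
    """Return (z, two-sided p-value) for the difference in response rates."""
    rate_a, rate_b = resp_a / n_a, resp_b / n_b
    # Pooled rate under the null hypothesis that both campaigns perform alike.
    pooled = (resp_a + resp_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical pilot: standard campaign (A) vs. propensity-based campaign (B).
z, p_value = two_proportion_z_test(resp_a=200, n_a=10_000,
                                   resp_b=260, n_b=10_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in response rates is unlikely to be pure chance; how much better the new approach is can be read directly from the two rates.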
Once you have the propensity models in place, you can start thinking about leveraging other analytics in your marketing. You could create a micro-segmentation to properly understand your customers and be able to design the right products and craft the right offers. You can consider so-called uplift models, which enable you to target only the customers who would not buy the product without your campaign. You can also consider models for finding the right time for your offer, be it time-to-event models or automated event detection.
In any case, the information provided by the propensity model should always be an integral part of the decision making process.