Real-life testing of a dynamic pricing model in e-commerce

It’s no secret that a good pricing strategy is one of the most important aspects of any business. There are various ways to determine the prices that maximize your overall profit. However, to maximize profit and keep price-sensitive customers engaged at the same time, you have to make sure you are heading in the right direction.

Dynamic pricing, which adjusts prices in response to real-time market changes, is the latest pricing trend dominating the e-commerce industry. Before deploying our dynamic pricing solution, we faced the problem of how to test the model’s performance and compare it with the client’s current solution.

As you may agree, adopting any pricing model without thorough testing would be extremely risky (especially when it controls the prices of thousands of products!).

The question was how to design an appropriate test that would help us validate the new pricing strategy and gain confidence in the change we were making.

Standard A/B Testing…

The basic idea behind A/B testing is to compare two variants: A (the currently used control version) and B (the modified test version). Customers are typically split in half at random, so that the two groups are as similar as possible. Each customer is assigned to either the control group or the test group without being told which. The goal is to determine which variant performs better.

There can be confusion about how to approach an A/B test. For instance: what is the right metric, how should customers be split, and how long does it take to reach statistical significance (two weeks or two months?)

Despite all the things that might go wrong, this approach works great in a lot of cases. Unfortunately, it does more harm than good when it comes to pricing strategy.

…and why not to use it

If you want reliable test results, you can’t test one variation today and the other one tomorrow. Both variants of an A/B test need to run simultaneously in order to avoid differences caused by timing.

When testing a pricing strategy, this means offering the same product at different prices to different customers at the same time. This kind of price discrimination is an ethical grey area that upsets customers. Within a few months, you’re almost guaranteed to face a much bigger problem: customer churn.

As we learned while planning the test for our e-commerce client, instead of splitting customers into two groups it is more reasonable to split the products into two distinct groups. Just like in a standard A/B test, the products in the two samples need to be as similar as possible, especially with respect to the key evaluation metric.

In our case, we took several dimensions into account, such as total net profit, total units sold and product type (e.g. mobile phones), in order to create sufficiently similar groups.
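One way to build such groups can be sketched as follows. This is a minimal illustration, not necessarily the procedure we used: it assumes each product is a dict with the hypothetical keys 'id', 'type' and 'net_profit', ranks products by net profit within each product type, and randomly assigns the two members of each adjacent pair to different groups.

```python
import random
from collections import defaultdict


def split_products(products, seed=0):
    """Split products into two groups with similar profit profiles.

    `products` is a list of dicts with the hypothetical keys
    'id', 'type' and 'net_profit' (illustrative names only).
    """
    rng = random.Random(seed)

    # Group products by type so each type is balanced across both groups.
    by_type = defaultdict(list)
    for product in products:
        by_type[product["type"]].append(product)

    control, test = [], []
    for items in by_type.values():
        # Rank by net profit so neighbours in the list are comparable.
        ranked = sorted(items, key=lambda p: p["net_profit"], reverse=True)
        for i in range(0, len(ranked), 2):
            pair = ranked[i:i + 2]
            if len(pair) == 2:
                rng.shuffle(pair)  # random assignment within the pair
                control.append(pair[0])
                test.append(pair[1])
            else:
                # Odd product left over: send it to a random group.
                rng.choice((control, test)).append(pair[0])
    return control, test


# Usage with made-up products.
catalog = [
    {"id": i, "type": "phone" if i % 2 else "laptop", "net_profit": 100.0 * i}
    for i in range(1, 21)
]
control_group, test_group = split_products(catalog, seed=1)
```

Pairing neighbours in the profit ranking keeps the totals of the two groups close, while the random choice within each pair avoids systematically favouring one group. Further dimensions such as units sold could be added to the sort key.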

Clarify what you are looking for

Distribution of the evaluation metric for the two groups.

The first thing to do when planning any test is to have a clear idea of what you are looking for; in other words, what the key metric is.

We decided to monitor the daily aggregate net profit across all the products in each sample. In our data this variable follows an approximately normal distribution, which allows us to apply a two-sample t-test for the difference in means and to calculate the corresponding confidence intervals.

Before the test, we checked (using the scipy.stats.ttest_ind function, which tests the means of two independent samples) that the means of the key metric in the two samples were not significantly different, which indicated that we had chosen similar products for our testing samples. After the test (i.e. after applying our dynamic pricing model to one of the samples), we wanted to show that the means of the samples were significantly different, and thus that the baseline model and the new model had been performing differently.

The means and the corresponding confidence intervals BEFORE the test.

The means and the corresponding confidence intervals AFTER the test.
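The before/after comparison described above can be sketched with scipy.stats.ttest_ind. The helper name compare_groups, the 5% significance level and the synthetic daily profits below are assumptions made for illustration; they are not our client's data or our exact code.

```python
import numpy as np
from scipy import stats


def compare_groups(control, test, alpha=0.05):
    """Two-sample t-test on daily net profit, plus a CI for each group mean."""
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)

    # Default settings assume equal variances; pass equal_var=False
    # to run Welch's t-test instead.
    t_stat, p_value = stats.ttest_ind(control, test)

    def mean_ci(x):
        # Student-t confidence interval for the mean of one sample.
        half = stats.t.ppf(1 - alpha / 2, df=len(x) - 1) * stats.sem(x)
        return x.mean() - half, x.mean() + half

    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "significant": p_value < alpha,
        "control_ci": mean_ci(control),
        "test_ci": mean_ci(test),
    }


# Synthetic daily net profits (made-up numbers, 60 days per group).
rng = np.random.default_rng(42)
before = compare_groups(rng.normal(1000, 100, 60), rng.normal(1000, 100, 60))
after = compare_groups(rng.normal(1000, 100, 60), rng.normal(1100, 100, 60))
print(before["p_value"], after["p_value"])
```

A large p-value before the test only means that no significant difference was detected between the groups; it does not prove that the means are identical, which is why the confidence intervals are worth inspecting as well.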

Plan the length of the experiment

Pricing strategy testing is not an overnight thing. Depending on the number of products you have, you might want to run a test for anywhere from a few days to a couple of months. Be aware that running the test for too short a time can skew the results.

In our case, the length of the experiment was set in advance based on the stability of the confidence intervals. For the given samples, daily net profit appeared to stabilize after approximately two months.

Last but not least, remember to keep testing regularly, since the performance of any predictive model can change over time.
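One possible reading of "confidence interval stability" is sketched below: recompute the confidence interval half-width of the mean daily net profit as days accumulate, and call the interval stable once its relative day-to-day change stays under a tolerance for several consecutive days. The function names, the 2% tolerance and the 7-day window are assumptions chosen for illustration, not the rule we actually applied.

```python
import numpy as np
from scipy import stats


def ci_half_widths(daily_profit, alpha=0.05, min_days=5):
    """Half-width of the CI of the mean after n days, for each n >= min_days."""
    x = np.asarray(daily_profit, dtype=float)
    widths = {}
    for n in range(min_days, len(x) + 1):
        sample = x[:n]
        widths[n] = stats.t.ppf(1 - alpha / 2, df=n - 1) * stats.sem(sample)
    return widths


def first_stable_day(widths, tol=0.02, window=7):
    """First day on which the half-width has changed by less than `tol`
    (relative) for `window` consecutive days; None if that never happens."""
    days = sorted(widths)
    run = 0
    for prev, cur in zip(days, days[1:]):
        if widths[prev] == 0:
            run = 0  # relative change undefined for a zero-width interval
            continue
        change = abs(widths[cur] - widths[prev]) / widths[prev]
        run = run + 1 if change < tol else 0
        if run >= window:
            return cur
    return None


# Synthetic daily net profit for 120 days (made-up numbers).
rng = np.random.default_rng(0)
profit = rng.normal(1000, 100, 120)
print(first_stable_day(ci_half_widths(profit)))
```

Running such a check on historical data for the chosen product samples gives a rough estimate of how many days the experiment needs before it starts.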
