There’s more to A/B testing then A and B: I

One of the most powerful features of mobile games is the ability to run simultaneous randomized experiments at no cost. Academics swoon at such a possibility and it’s very real and very spectacular in F2P games.  Decades of running experiments in academic research can lend insight to developer scientists. An example of this is an insight from experimental economics called ‘bending the payoff curve’.

One of the favorite topics of experimental economists concern risk aversion and auction theory, risk aversion due its ability to challenge the neoclassical paradigm (i.e. mainstream economics) and auction theory because it uses fancy mathematics. The first groundbreaking economic experiments employed auctions in lab settings to see if participants diverged from rational behavior. A series of experiments run by Cox, Smith, and Robinson1 appeared to show participants were not doing what we’d expect them to do if they were rational agents (i.e getting the lowest price). The suggestion being participants were acting as risk – averse agents, rather than risk – neutral agents. The key insight, however, came from a challenge in the way these experiments were run from a 1992 AER article called Theory and Misbehavior in First Price Auctions.2

The author, Glenn Harrison, argued that the costs of engaging in non-optimal behavior were incredibly small. In other words, being dumb didn’t cost participants that much, and being smart didn’t earn participants a great deal either. Glenn argued this casts doubt on the suggestion that participants were engaging in non-optimal behavior, but instead participants weighed the expected mental effort of being smart and concluded it wasn’t worth the foregone increase in income.

Each deviation from zero (the optimal bid) costs the participant little.

What researchers need to do, Glenn argued, was bend the payoff curve i.e. increase the reward for being smart. This way researchers can see if the type of behavior they’re testing is a real thing.

What does this mean for A/B testing in my game?

Developers often turn to A/B testing to test even the most minute items; frustration emerges when results are inconclusive. For example, Supercell might test whether or not players prefer reward scheme X or Y in Boom Beach by way of sessions played. An A/B test that presents each scheme after a battle could possibly turn up inconclusive. This is because each reward has a small outcome on player progression. That is itself an insight, but if we’re really interested in whether A or B is better it’d make sense to ‘bend the payoff curve’. That means we’d offer A + 5 or B + 5 to exaggerate the effects of the different reward schemes.

Think of it as amplifying two lights on each side of a room to see where flies gravitate toward. If the lights were dim the effect on the flies would be smaller than it otherwise is.

See? Just like an A/B test.
See? Just like an A/B test.

While not always appropriate, bending the payoff curve is another tool developer scientists should consider when designing experiments.