### Why do contestants break the rules in Netflix’s Too Hot To Handle?

Economists like Tyler Cowen or Brad DeLong are too self-respecting to study reality shows. Fret not, this economist has no such self-respect.

Previously, we examined the economics of the reality show genre, but just as interesting are the economics of a particular reality show’s design.

Too Hot to Handle introduces one of the most interesting examinations of communal property dynamics: a group prize is reduced when individuals act in their short-term private interest.

At a more practical level, the show gathers ten attractive 20-somethings into a villa in Mexico for three weeks. Cameras are littered around the villa with the exception of the bathrooms (which get mics instead). The contestants are only informed of the rules once the cameras start rolling: if they masturbate, kiss, or engage in any sort of sexual activity, the prize pool of \$100,000 is reduced. It’s unclear to contestants how “expensive” each activity is or how the prize pool will be divided or won. Shockingly, interviews with the show’s producers reveal they didn’t have the rules or the costs figured out until the violations happened. While there’s no traditional contestant elimination process, producers will ask contestants to leave if they’re not invested in the “process”. Supposedly, the show wants to teach these singles how to form emotional rather than physical connections.

The spectacle for viewers is how hard it is for these contestants to keep it in their pants – of the original \$100,000, over \$40,000 is lost. Seems like a lot, right? How could they give up so much money?

Well… It’s really not that much. On the face of it, \$100,000/10 = \$10,000 per contestant. The tax situation matters greatly – U.S. contestants or those with residency in the U.S. will probably pay about 50% of that \$10,000 in taxes. Interestingly, if the show took place in the U.S. rather than Mexico, all contestants would be subject to U.S. taxes. It appears to be the case that the Brits and Canadians don’t face game show taxes.

On an expected payout basis, the costs are far less than they might appear:

• \$3,000 for a kiss is only \$300 on a per contestant basis. Only \$150 after taxes.
• \$6,000 for oral sex = \$600 gross, \$300 after taxes.
• \$20,000 for sex = \$2,000 gross, \$1,000 after taxes.
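A quick back-of-the-envelope sketch of the arithmetic above. It assumes an even ten-way split and a flat 50% tax rate, both of which are my simplifications rather than anything the show confirmed; the function and constant names are made up for illustration.

```python
CONTESTANTS = 10   # assumed even split of the prize pool
TAX_RATE = 0.50    # rough assumption for U.S. contestants

# Fines as quoted in the list above.
FINES = {"kiss": 3_000, "oral sex": 6_000, "sex": 20_000}


def per_contestant_cost(fine, contestants=CONTESTANTS, tax_rate=TAX_RATE):
    """Return (gross, after-tax) cost of a fine for a single contestant."""
    gross = fine / contestants
    net = gross * (1 - tax_rate)
    return gross, net


for activity, fine in FINES.items():
    gross, net = per_contestant_cost(fine)
    print(f"{activity}: ${gross:,.0f} gross, ${net:,.0f} after taxes")
```

Change `TAX_RATE` to 0 to see what the same fines look like for a contestant who doesn’t face game show taxes.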

The show filmed for 3 weeks; at a max payout of \$10,000 this annualizes to a yearly salary of \$173k. Not bad, but many of the contestants already out-gross that. Francesca Farago alone is estimated to have a net worth of over \$500k. Almost all of the contestants make money off their likeness or brand. Like Francesca, they model, sell clothing or act. Thus, building an Instagram following is directly connected to their revenue stream. Breaking the rules can help the contestants build that brand – losing out on \$300 now could be worth much more in brand awareness later. Those without brands seemed to leave early or not attempt anything “interesting” – see Madison, a late arriver who never coupled up.

But the rules weren’t clear on splitting the prize, and contestants could have been under the impression that only 1 or 2 would win. Under an expected value model the payout is the same: \$10,000 (\$100,000 * 10% chance of winning). However, if you feel as though you’re a weak contestant, you might estimate your probability of winning at less than 10%. I think this was the case for sorority girl Hailey, who broke the rules a mere two episodes in and had no interest in continuing.
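Here is a minimal sketch of that expected value logic, assuming a winner-take-all prize and a contestant’s own (subjective) estimate of their chance of winning. The probabilities are hypothetical; the point is only that a fine shrinks in expected terms as your perceived odds fall.

```python
PRIZE_POOL = 100_000


def expected_payout(win_probability, prize_pool=PRIZE_POOL):
    """Expected winnings if a single winner takes the whole pool."""
    return prize_pool * win_probability


def expected_cost_of_fine(fine, win_probability):
    """A fine only costs you in the outcomes where you would have won."""
    return fine * win_probability


# Hypothetical self-assessments: average, weak, and very weak contestant.
for p in (0.10, 0.05, 0.02):
    payout = expected_payout(p)
    kiss = expected_cost_of_fine(3_000, p)
    print(f"p(win)={p:.0%}: expected payout ${payout:,.0f}, a kiss costs ${kiss:,.0f}")
```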

I think there’s room for improvement in the show’s design. It was rather strange to reveal to contestants who the rule-breakers were so early in the show. This introduced social shaming as retaliation against rule breakers, when speculation and investigation make for far more drama. If the show is about temptation, why not focus more on the money or relationships? Maybe contestants could choose to eliminate their on-show squeeze – money AND sex as tests of genuine connection. Discounting seems like a great lever for drama injection – this week sex is 50% off! Adding new contestants didn’t seem to work; everyone had coupled up by the time they got there. Subtraction, or an elimination, is a lot more fun.

Well, here’s to a solid season two. Hopefully, the show remains tongue in cheek. But not literally – that would be a rule violation.

### The Economics of Reality TV

In case you haven’t been paying attention, Netflix just re-discovered the reality/reality-doc genre. In 2 weeks:

• Tiger King
• Too Hot to Handle (international cast)
• The Circle (4 international versions already released)
• Love is Blind

I chuckled a bit when Netflix remarked that they saw Fortnite as their biggest threat, but after 6 weeks in quarantine, they may have been on to something. Reality television is an awkward, but fun shot back at digital Travis Scott concerts.

The economics of reality shows have always favored cost. Casts cost \$0, there are only 1 or 2 filming locations, shoot times are short (sub 6 weeks), and there’s strong virality potential that front loads views (no one is watching the back catalogue of Survivor). But why now, 7 years since House of Cards launched?

I think this is a simple growth parable: they’ve hit diminishing returns on drama, and this is the next highest expected marginal value item on the menu. In other words, they’ve already picked the low-hanging, high-value fruit. Widespread international distribution expands the water cooler effect and achieves economies of scale. Shows with live elements like Idol and Got Talent are forced into country-specific renditions, which limits audience reach. Subtitle support for international versions of shows like The Circle helps them reach English-speaking audiences and vice versa. Living in Europe, you’d be surprised by the appetite for English content across non-native speaking countries.

They’re innovating reality game design as well (or at least the 3rd party studios are). Technology plays a huge role in all shows, almost always as a form of communication (or lack thereof). Too Hot to Handle introduces one of the most interesting examinations of communal property dynamics: a group prize is reduced when individuals act in their short-term private interest. The Circle asks how people make decisions on limited information – contestants can only communicate over text.

If the last 10 years have been the golden age of TV dramas, maybe the next 10 will be the golden age of reality TV.

### 1950’s Peruvian Coke and Gacha

In the 1950’s, Peruvian inflation forced Coca-Cola to charge more per bottle of Coke. Unfortunately, their vending machines required physical updating to accept a new and larger denomination. Instead, Coke devised a probabilistic system: the machine would charge the same amount as before, but randomly refuse to give a bottle. This raises the expected price of a bottle of Coke while forgoing any physical updating. But a miscellaneous software engineer has a better idea: raise the price of Coke, but instead randomly give the money back.

The increase in price for a given ‘bottle draw’ would equal the expected payoff of a lower priced ‘bottle draw’ that randomly refuses to give a bottle. This is an interesting solution to player frustrations in gacha (“I didn’t get anything of value when I opened a pack!”).
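To make the equivalence concrete, here is a small sketch under my own assumptions: in the refusal model the machine keeps the coin and withholds the bottle with some probability, and in the refund model every purchase yields a bottle but the full price is sometimes returned. The prices and probabilities are invented for illustration.

```python
def expected_price_refusal(price, refuse_prob):
    """Expected money spent per bottle received when the machine keeps the
    coin but withholds the bottle with probability refuse_prob."""
    # On average it takes 1 / (1 - refuse_prob) paid attempts per bottle.
    return price / (1 - refuse_prob)


def expected_price_refund(price, refund_prob):
    """Expected money spent per bottle when a bottle is always given but the
    full price is refunded with probability refund_prob."""
    return price * (1 - refund_prob)


def matching_refund_prob(old_price, refuse_prob, new_price):
    """Refund probability that makes the refund model cost the same, in
    expectation, as the refusal model."""
    target = expected_price_refusal(old_price, refuse_prob)
    if new_price < target:
        raise ValueError("new price is below the target expected price")
    return 1 - target / new_price


# Hypothetical numbers: a 1.00 bottle refused 20% of the time versus a
# 1.50 bottle with an occasional refund.
target = expected_price_refusal(1.00, 0.20)
refund_prob = matching_refund_prob(1.00, 0.20, 1.50)
print(f"expected price per bottle: {target:.2f}")
print(f"equivalent refund probability at 1.50: {refund_prob:.1%}")
```

Both machines extract the same expected revenue per bottle; what differs is whether the bad surprise is “no bottle” or the good surprise is “free bottle”.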

Anyone care to reckon which model would perform better: a higher draw price that sometimes gives the money back, or a lower draw price that sometimes doesn’t give anything?

### Players go to their highest valued LTV: Ads are Beautiful Pareto Exchanges

Previously, I wrote about ads as a way to monetize non-payers, but there’s more to the ad exchange and what I’ll coin as ‘portfolio pumping’. It’s like portfolio theory, but not really.

These terms reference two growing phenomena in F2P games. King is at the forefront of portfolio pumping, in which a given firm pushes a player from game to game within the firm’s portfolio.

Unlike portfolio pumping, ad exchanges push players to another firm’s games. Companies like Scopely are more fond of ad exchanges.

Frequently, the ads being served are for competitor games. Why would a company show ads for its competitors? In addition, why would firms want players to move from one game in their portfolio to another? I argue the underlying explanation is Pareto Efficiency which is just a fancy term for trade.

$\text{churned player LTV} < \text{ad revenue}$
$\text{acquired player LTV} > \text{ad cost}$

It tends to be the case that a given company will engage in both ad buying and selling. The outcome of these ad exchanges is a migration of players to the games in which they have the highest LTV; the initial allocation doesn’t matter. This process takes place in high-speed auctions where firms are constantly searching for trades that satisfy the inequalities outlined above. The decision rule for portfolio pumping is similar, but we add some special conditions, mainly the probability of simultaneous play.
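A toy version of the two ad exchange conditions, to show why the trade is a Pareto improvement. I’m assuming a single price per acquired player, so the seller’s ad revenue equals the buyer’s ad cost (in practice an ad network takes a cut). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdTrade:
    seller_ltv: float  # remaining LTV of the player in the game showing the ad
    buyer_ltv: float   # expected LTV of the player in the advertised game
    price: float       # amount paid per acquired player


def seller_gains(trade):
    """Seller condition: churned player LTV < ad revenue."""
    return trade.seller_ltv < trade.price


def buyer_gains(trade):
    """Buyer condition: acquired player LTV > ad cost."""
    return trade.buyer_ltv > trade.price


def is_pareto_improving(trade):
    """Both firms are better off, so the player migrates."""
    return seller_gains(trade) and buyer_gains(trade)


def feasible_price_range(seller_ltv, buyer_ltv):
    """Prices at which both sides gain, or None if no such price exists."""
    if buyer_ltv > seller_ltv:
        return (seller_ltv, buyer_ltv)
    return None


# A player worth 0.50 where they are and 2.00 elsewhere moves at any price in between.
print(is_pareto_improving(AdTrade(seller_ltv=0.50, buyer_ltv=2.00, price=1.00)))
print(feasible_price_range(0.50, 2.00))
```

A trade only clears when the player is worth more in the buyer’s game than in the seller’s, which is exactly why players drift towards the game where their LTV is highest.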

$P_{both} \cdot (rLTV_{i} + nLTV_{i}) + P_{new} \cdot nLTV_{i} > rLTV_{i}$

Where,
$P_{both}$ is the probability of the ith player playing both games simultaneously. We add up both of the LTVs in this case.
$P_{new}$ is the probability of the ith player playing only the new game.
$rLTV_{i}$ is the remaining LTV in the old game for the ith player, while $nLTV_{i}$ is the LTV in the new game for the ith player.

The left-hand side, the expected LTV after the push, must be bigger than $rLTV_{i}$ for profitability.
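A small sketch of the portfolio pumping rule, reading it as an expected value: the probability of each outcome times the LTV earned in that outcome. As in the rule itself, a player who is pushed is assumed to end up either in both games or in the new game only; the numbers are hypothetical.

```python
def expected_ltv_after_push(r_ltv, n_ltv, p_both, p_new_only):
    """Expected LTV of pushing a player towards a new game in the portfolio.

    r_ltv: remaining LTV in the old game
    n_ltv: LTV in the new game
    p_both: probability the player plays both games simultaneously
    p_new_only: probability the player plays only the new game
    """
    return p_both * (r_ltv + n_ltv) + p_new_only * n_ltv


def should_pump(r_ltv, n_ltv, p_both, p_new_only):
    """Push the player only if the expected LTV beats leaving them alone."""
    return expected_ltv_after_push(r_ltv, n_ltv, p_both, p_new_only) > r_ltv


# Hypothetical late-lifecycle player: 5.00 left in the old game.
print(should_pump(r_ltv=5.0, n_ltv=4.0, p_both=0.3, p_new_only=0.7))
print(should_pump(r_ltv=5.0, n_ltv=2.0, p_both=0.3, p_new_only=0.7))
```

The higher the chance of simultaneous play, the lower the new game’s LTV needs to be for the push to pay off.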

Of course, there are ways to play with this. Wooga tried altering portfolio game prompts during a player’s lifespan but found no effect.1 King continues to portfolio pump but dropped ads in Candy Crush Saga.

It’s a goddamn gorgeous process that should litter econ textbooks like lighthouses and lemons.

### Re-rewriting Economic History

Will Luton writes about the dangers of, and solutions to, F2P inflation over at gamesindustry.biz.

While there are some missteps in the opening of the article, Will makes a powerful and elegant point:

…a sale can only be considered profitable if the net revenue from the start of the sale until resource equilibrium, and so demand, is restored is more than if the sale hadn’t been run. For well run sales in games with well balanced economies this should always be true.

Sales flood the economy with resources via shifts along the demand curve. Holding all else equal, this is modeled as a move from P1 to P2.

The tricky part, not found in the textbook model, is time. Unlike, say, refrigerators, a durable good, virtual currency is a consumable good. This means we expect repeat purchases, similar to, say, gasoline. Sales in this sense pull revenue forward by changing purchase ‘schedules’ more so than for a durable good. The sales are only profitable if they sink resources players would never have sunk otherwise (a net positive sink). In games, this is achieved via live ops. This model explains how Supercell runs their games; it’s no coincidence that Clash Royale is the first Supercell game to have sales and real live ops while their other titles have little of either. Introducing one without the other keeps net sink flat in the long run by shifting intertemporal time preferences rather than increasing the size of the ‘sink pie’, so to speak.

Progression is another confounding variable. Holding all else constant, a given item is worth less for each additional level a user is at. This is simply an artifact of rising difficulty (in the form of stronger enemies, more experience to level up etc). As a result, sales leave late game players indifferent while making early game players better off.

The insight Will offers is that sometimes this is an advantage: it changes the progression path of newer players to a higher equilibrium than current late game players previously had. This allows new players to ‘catch-up’. This sounds a lot like the Solow model. Yes, that Solow model 1. I don’t think Will models this correctly, however, as each player is not on a discrete curve as his graph on the left depicts. Even without inflation, the graph on the right is an accurate picture of a given game economy.

Consider two possible goods that could be put on sale (and thus inflated) from Clash of Clans: a builder or gold. The builder is a dramatically better purchase because it allows for more output per unit of gold or elixir (an increase in technology). This shifts the growth rate of a given player up. On the other hand, the gold is a small one-time increase in capital stock that won’t scale with the game. For designers, this offers the chance to use sales as strategic instruments to alter the metagame. By offering Clash of Clans players discounts on a builder, new players converge to and then exceed the GDP of elder players. A sale of gold, however, merely ‘jumps’ the GDP of players without changing the long-run growth rate. This means designers can either jump the point new players are at along the progression curve or alter the new player curve entirely.
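A deliberately crude sketch of the builder versus gold distinction. It assumes resources accumulate linearly per day, which is far simpler than a real Solow model or a real Clash of Clans economy, and every number is invented. Gold is modeled as a one-time bump to the stock; the builder is modeled as a permanent increase in the daily rate.

```python
def resources(day, rate, head_start_days=0, one_time_bonus=0):
    """Resources accumulated by `day`, growing linearly at `rate` per day."""
    return rate * (day + head_start_days) + one_time_bonus


def catch_up_day(new_rate, new_bonus, elder_rate, elder_head_start, horizon=365):
    """First day the new player matches the elder player, or None if that
    never happens within the horizon."""
    for day in range(horizon + 1):
        new_player = resources(day, new_rate, one_time_bonus=new_bonus)
        elder = resources(day, elder_rate, head_start_days=elder_head_start)
        if new_player >= elder:
            return day
    return None


BASE_RATE = 100          # hypothetical resources per day
ELDER_HEAD_START = 30    # elder player started 30 days earlier
GOLD_BONUS = 1_000       # one-time jump from a gold sale
BUILDER_RATE_BOOST = 25  # permanent extra output from a discounted builder

gold_day = catch_up_day(BASE_RATE, GOLD_BONUS, BASE_RATE, ELDER_HEAD_START)
builder_day = catch_up_day(BASE_RATE + BUILDER_RATE_BOOST, 0, BASE_RATE, ELDER_HEAD_START)
print("gold buyer catches up on day:", gold_day)
print("builder buyer catches up on day:", builder_day)
```

With these numbers the gold buyer closes part of the gap but never catches up, while the builder buyer overtakes the elder player and keeps pulling ahead.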

Back again

Unfortunately, this can deter some investment by changing inflation expectations. If players know a given dollar will have increased purchasing power later on, why make the investment now? Indeed, a 30+ page paper written by Game of War players and subsequent boycotts attest to the negative side effects of perpetually trying to catch players up.

Careful consideration and analysis can make sales a valuable gameplay tool as much as they are a business one.

### Eric Seufert’s best F2P blog post isn’t about F2P

Everyone’s favorite former Rovio employee is a prolific writer on F2P games; the closest we have to a Fukuyama. Seufert has covered a range of topics, but none more important than internal organization.

Seufert argues for a number of institutional policies with which to surround analysts within an organization. Frequently, analytics and data are as much about the appearance of sophistication as they are actual value adds. This need not be the case. The confusion arises over where the value of data lies. Perhaps ironically, data’s value doesn’t lie in the data, but rather in the data analyst.

In most organizations, analytics reports to product teams; a mistake, Eric argues. Often product managers face the principal–agent problem: their incentives and the company’s do not align. Product managers want to successfully manage products and will present the narrative that they are doing so. This is inefficient for companies, which often wish to assess the true performance and trajectory of a portfolio. When an analyst’s career path depends on a product manager, their narratives will often match. With organizational independence from product teams, analysts’ incentives align more closely with the company’s, providing more objective analysis.

Beyond acting as an accountability watchdog, an analyst’s real value revolves around the ability to drive product roadmaps. At its highest order, analytics is a forward-looking discipline, not a backward-looking one. By experimenting and studying human behavior, analysts find levers that pull certain responses. This creates opportunities to exploit these levers. Do currency pinches increase monetization? Are new gacha characters or new levels driving revenue? Should we invest more in reducing load times or UI changes? Using theory-driven empirical investigation, analysts can move companies towards better outcomes than competitors. If organizations don’t allow analysts to pursue these questions, they’ll become cheerleaders for product teams. On the other hand, if first order information (RR, ARPU) is not accessible or automated, analysts will forever be running the hamster wheel of reporting. This is one of the more overlooked points Eric argues for.

I think this suggests a dual mandate for analysts: (1) holding features accountable and (2) deciding what features are worth developing. This creates a natural tension: analysts play not only watchdog to product managers but partner as well. It is the duty of good analysts to navigate this relationship successfully.

### F2P Demand Curves Are Weird, Just Ask Levitt

Steve Levitt, the last price theory samurai, and John List, future Nobel Prize winner, have published a paper on free to play economics.

In a textbook neoclassical experiment, Levitt alters the quantity of Candy Crush hard currency at a given price point. While economists generally think of price variation as the way of deriving demand curves, quantity variations are just as legitimate a tool.

Despite a sample size of over 15 million and a wide range of quantity convexity (80% variation across variants), all quantity discounting schemes produced similar revenue. Levitt concludes by commenting,

“…varying quantity discounts across an extremely wide range had almost no profit impact in the short term.”

The interesting and little explored result indicates that,

…almost all of the impact of the price changes was among those already making a purchase; radical price reductions induced almost no new customers to buy…

This suggests free to play games are made up of two groups of users: purchasers and non-purchasers. This means the decision to become a customer is exogenous: there is no ability to convert non-customers to customers, i.e. this is decided outside of the game. Put another way, non-customers are perfectly price inelastic and customers are perfectly price elastic. Indeed, industry research corroborates this.2

Interesting, but is it actionable?

Were this to hold, it suggests a number of results. The first is that product managers’ ability to monetize non-customers (~99% of users) will not come from IAP, but rather from other forms. This may help explain why F2P ad revenue and incentivized video continue to show YoY growth.3 4
Furthermore, product managers should consider experiments that search for the optimal ad frequency. Given that there’s a trade-off between retention and ad frequency, there exists a frequency at which ad revenue is maximized.

With little chance of non-customers converting to customers, product managers should worry less about increased ad frequency turning off potential customers.

The final result suggests the ROI of trying to raise the LTV of customers exceeds that of trying to raise the new customer creation rate. Product managers should develop roadmaps accordingly.

### How to Measure Whales

You’ve soft launched your game, done a UA push, and a string of hope appears. Against all odds, a dominant cohorted ARPU curve emerges! Is this an anomaly or have you caught a whale?

The first way to examine this is to perform cointegration tests between the cohorted ARPU curves, testing for statistical significance. It may be true that the difference in the curves is real, but that doesn’t answer whether you’ve caught a whale.

In 1905, Max Lorenz developed a method for measuring relative inequality between nations known as the Lorenz curve.

The F2P application is to define wealth as revenue (either on a daily or game level) and the population as players. By measuring how bent inwards a cohorted Lorenz curve is relative to other cohorted Lorenz curves, we can measure the ‘whali-ness’™ of different cohorts. Even better is how this reduces to a single metric – the Gini coefficient. A Gini coefficient of zero indicates a perfectly equal distribution of income: 10% of the population owns 10% of the wealth, 20% of the population owns 20% of the wealth, and so on and so forth. A Gini coefficient of 1 is the exact opposite – a single person owns 100% of the wealth.

This translates to what % of players are responsible for what % of the revenue. Measuring Gini coefficients across games rather than cohorts gives more insight into how a particular game monetizes – whether it be whale, dolphin, or minnow driven.
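As a rough illustration, here is one standard way to compute a Gini coefficient from per-player revenue using only the standard library. The cohort data is made up, and `revenue_share_of_top` is a helper name I’m introducing to answer the “what % of players drive what % of revenue” question directly.

```python
def gini(revenues):
    """Gini coefficient of per-player revenue: 0 means perfectly equal.

    With n players the maximum is (n - 1) / n, which approaches 1 as the
    cohort grows.
    """
    values = sorted(revenues)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def revenue_share_of_top(revenues, top_fraction):
    """Share of total revenue generated by the top `top_fraction` of players."""
    values = sorted(revenues, reverse=True)
    total = sum(values)
    if total == 0:
        return 0.0
    top_count = max(1, round(len(values) * top_fraction))
    return sum(values[:top_count]) / total


# Hypothetical cohorts of 10 players with the same total revenue.
whale_cohort = [0, 0, 0, 0, 0, 0, 0, 1, 2, 97]
flat_cohort = [10] * 10

for name, cohort in (("whale", whale_cohort), ("flat", flat_cohort)):
    print(f"{name}: gini={gini(cohort):.2f}, "
          f"top 10% share={revenue_share_of_top(cohort, 0.10):.0%}")
```

Both cohorts earn the same revenue, but the coefficient separates the one carried by a single whale from the one where everyone chips in equally.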

Actionable insights might include how effective introducing ads could be. A high Gini coefficient (very few players are responsible for revenue) might mean there’s a more fertile base to monetize on.

The main insight, however, is further understanding. It’s clear that success can come about in drastically different ways in free to play games; the Gini coefficient is a simple way to measure that.

### The Content Problem and the Death of Level Designers

F2P is as much a design choice as it is a business choice. Given this, F2P has its own set of design challenges, among which is the content problem.

Developers will only continue making additional content as long as the benefits are greater than the costs. This is specified when

`expected marginal revenue from content > development cost_t + opportunity cost_t`

where

`development cost_t is the cumulative cost by time of release (t)`

but if

`User Acquisition Rate (UAR) < Churn Rate (CR)`

there’s a shrinking pool of buyers, a gap which only widens at t+1. This is the essence of the content problem: how do we create content fast enough to curtail churn while minimizing development costs?
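The two conditions above can be sketched in a few lines. I’m assuming, purely for illustration, that UAR and CR are expressed as fractions of the current player pool per period; the dollar figures and rates are hypothetical.

```python
def worth_building(expected_marginal_revenue, development_cost, opportunity_cost):
    """Content is worth making only if revenue beats both costs combined."""
    return expected_marginal_revenue > development_cost + opportunity_cost


def project_player_pool(players, acquisition_rate, churn_rate, periods):
    """Pool size per period, with rates as fractions of the current pool."""
    pool = [players]
    for _ in range(periods):
        players = players * (1 + acquisition_rate - churn_rate)
        pool.append(players)
    return pool


# With UAR < CR the pool of potential buyers shrinks every period, which
# drags down the expected marginal revenue of content released later.
pool = project_player_pool(players=1_000, acquisition_rate=0.05, churn_rate=0.10, periods=3)
print([round(p) for p in pool])
print(worth_building(120_000, development_cost=80_000, opportunity_cost=30_000))
```

The longer a content drop takes to build, the more periods of shrinkage it has to overcome, which is the squeeze the content problem describes.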

The genius of PvP (Player v Player) environments is how they necessitate the emergence of a meta-game. In mathematics, Player vs Environment (PvE) resembles the field of optimization where strategies are static – one and done. PvP environments, however, resemble game theory models where it has been shown strategies evolve in an evolutionary process. This means equilibrium in PvP environments is constantly being reshuffled with each balance change; the search for dominant strategies in an ever shifting equilibrium is the game itself.

It’s been 4 years since the launch of Clash of Clans and there continue to be oodles of strategy videos. Supercell is constantly debuffing and buffing different units which makes some strategies more successful than others and by trial and error players expose this.

The push for PvP environments has seen the emergence of ‘Systems Design’ and the demise of Level Designers. With few exceptions, linear and deliberate gameplay has gone the way of Spaghetti Westerns.

On the other hand, a different type of PvE has found ways to combat the content problem. For example, Trials Frontier adopted meaningful level mastery with a touch of PvP. This is achieved via quests that revisit locations, stars, leaderboards, mission rewards, and gameplay that rewards depth (back/front flips can improve my times!). That said, PvE shares a smaller piece of the pie than it once did. This trend will only continue as F2P marches into the console and PC arena.

### Get more life out of your Lifetime Value Model! A discussion of methods.

Predicting the average cumulative spending behavior or Lifetime Value (LTV) for players is incredibly valuable. Being able to do so helps figure out what to spend on User Acquisition (UA). If a cohort of players has an LTV of \$1.90 and took \$1 to acquire then we’ve made money! This helps evaluate how effective particular channels of advertising are, as we’d expect different cohorts of players to have different values. Someone acquired via Facebook may be worth more than someone acquired via Adcolony.

But wait there’s more!

My argument in this post is that LTV has a great deal of value outside of marketing. In fact, LTV might have parts more valuable than the whole. Predicting LTV can adopt numerous approaches and each approach has associated benefits. Remember, there doesn’t have to be just one LTV model!

Consider four requirements we’d want out of an LTV model:

1. Accuracy

LTV predicted should be the LTV realized. Figuring out upward and downward bias in your coefficients is important here. This gives insight into the maximum or the minimum to spend on UA depending on the direction you suspect your coefficients are biased towards.1

2. Portability

Creating models is labor intensive and even more so when doing so for multiple games. There are particular LTV models that sweep this aside called Pareto/Negative Binomial Distribution Models (NBD). Since they’re based only on the # of transactions as well as transaction recency they don’t require game specific information. This means you can apply them anywhere!

3. Interpretability

This one’s big and perhaps the most overlooked. Consider the Linear * Survival Analysis model approach to LTV. The first part is to predict when a particular player will churn. By including variables like rank, frustration rate (attempts on a particular level), or social engagement we gain insight into what’s retaining players. This type of information is incredibly valuable.

4. Scalability

If it’s F2P then there are going to be hundreds of thousands to millions of players (you hope). I’ve seen some LTV approaches that would take eons to apply to a player pool of this size; our LTV should scale easily.

So how do the different approaches stack against one another?

Columns: Accuracy, Portability, Interpretability, Scalability

• Pareto/NBD2: / x x
• ARPDAU * Retention3: x x
• Linear * Survival Analysis4: x x x
• Wooga + Excel5: x
• Hazard Model6: x x x

Pareto/NBD is great, but it’s hard to incorporate a spend feature (it just predicts # of transactions).7 A small standard deviation in transaction value gives this model a great deal of value and something to benchmark against. This model also makes sense if data science labor is few and far between.

ARPDAU * Retention is probably the approach you’re using; it’s a great starter LTV. If marketing/player behavior becomes more important, the gains to scale from an approach beyond this start to make more sense.
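For reference, one common way to write the ARPDAU * Retention approach is to multiply average revenue per daily active user by the expected number of days played, taken as the sum of the retention curve. This is a minimal sketch with an invented retention curve; real implementations usually fit and extrapolate the curve rather than truncating it.

```python
def ltv_arpdau_retention(arpdau, retention_curve):
    """LTV as ARPDAU times expected days played.

    retention_curve[d] is the fraction of the cohort still active on day d,
    so the sum over days is the expected number of active days per player.
    """
    expected_days_played = sum(retention_curve)
    return arpdau * expected_days_played


def ua_margin(ltv, cost_per_install):
    """Profit per acquired player; positive means the channel pays off."""
    return ltv - cost_per_install


# Hypothetical three-day retention curve and ARPDAU.
print(ltv_arpdau_retention(arpdau=0.10, retention_curve=[1.0, 0.5, 0.25]))

# The example from earlier in the post: LTV of $1.90 against a $1 acquisition cost.
print(ua_margin(ltv=1.90, cost_per_install=1.00))
```

Because it needs only two aggregate inputs, this approach ports and scales well, but it says nothing about why players stay or spend.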

Wooga + Excel just doesn’t scale which kills its viability, but it’s conceptually useful to understand.

Linear * Survival Analysis gives a great deal of interpretability and also predicts customer churn time as a sub-step. This means testing whether the purchase of a particular item or mode increases churn time is done within the model. The interpretability of linear models also means it’s easy to see different LTV values for variables like country or device.

There are many, many different approaches beyond what’s been laid out here. Don’t settle on using just one model; each has costs and benefits that shouldn’t be ignored.