How many times must I send an email before it’s opened?

How many times do I have to send an email to my customers before I’m confident a certain percentage have opened it?

Like most businesses today, the Retail Shop maintains a customer email list. Each week, Michelle, the store’s manager, sends a promotional email message to these customers. Historical data show that, on average, 15% of all Retail Shop customers open these promotional emails. This is known as the store’s “Open Rate”.

Michelle, being a smart marketer, is concerned that her promotional message is not being opened by enough customers. She asks herself, “how many times must a promotional email be sent before it has been opened by at least 50% of recipients?”

On the face of it, this seems like a difficult problem to solve. However, using basic probability theory makes finding a solution surprisingly simple.

Balls and Cups

We can’t know all the factors that affect whether or not a customer will open an email, so we consider it to be a random event. A solution to Michelle’s question can then be found by using methods analogous to tossing a coin, rolling dice, or pulling balls from an urn.

By transforming Michelle’s conundrum into a ball tossing experiment, we can, with certain limitations, arrive at a best case answer to her question.

Our experiment is designed as follows: Set out 100 cups. Take 15 ping-pong balls, toss them in the air, and record into which cups the balls land. The rules of our experiment require that for each toss, a ball may land in at most one cup and no cup may contain more than one ball.

In this case, Michelle’s question becomes: How many times would the balls need to be tossed before 50% of the cups have caught a ball? Or, put another way, how many tosses does it take for any given cup to have a 50% chance of having caught a ball at least once? These are similar to asking how many times you must toss a coin before a head appears, or how many times you must roll a die before a six appears. Each can be answered by assuming that the number of attempts needed to get the first success follows what is known as a geometric distribution.

To understand this, let’s define the following:

p is the probability a ball lands in any given cup on a single toss (here, 15 balls ÷ 100 cups = .15).
q = (1 – p) is the probability a given cup does NOT catch a ball on a single toss.
N is the number of ball tosses.
X is the probability that any given cup has caught a ball at least once after N tosses.

A cup fails to catch a ball on every one of N independent tosses with probability q^N. Using the geometric distribution, the probability that any given cup has caught a ball at least once after N tosses is therefore:

1 – q^N = X

If the email Open Rate is p = .15, so that q = .85, and we’re interested in X = .50, we get:

1 – (.85)^N = .50

(.85)^N = .50

Then solve for N,

log((.85)^N) = log(.50)

N log(.85) = log(.50)

N = log(.50) ÷ log(.85)

N = 4.265

So, given 15 balls and 100 cups (p = .15 and q = .85), it takes 4.265 ball tosses before the chance that any given cup has caught a ball reaches 50%. Since we can’t perform fractional tosses, we round up: 5 ball tosses are required, at which point the probability rises to 1 – (.85)^5, or approximately 56%.
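The ball tossing experiment is easy to try out on a computer. The following is a minimal sketch in Python, not part of the original analysis: the function name, the number of trials, and the random seed are all illustrative choices. Each toss drops 15 balls into 15 different cups chosen at random, and the script compares the average share of cups that have caught a ball with the value given by the formula.

```python
import random


def simulate_fraction_reached(num_cups=100, num_balls=15, num_tosses=5,
                              trials=2000, seed=0):
    """Average fraction of cups that caught at least one ball after num_tosses."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so runs are repeatable
    total = 0.0
    for _ in range(trials):
        caught = set()
        for _ in range(num_tosses):
            # One toss: each ball lands in a different cup,
            # so no cup receives more than one ball per toss.
            caught.update(rng.sample(range(num_cups), num_balls))
        total += len(caught) / num_cups
    return total / trials


for n in range(1, 7):
    simulated = simulate_fraction_reached(num_tosses=n)
    predicted = 1 - 0.85 ** n
    print(f"{n} tosses: simulated {simulated:.3f}, formula {predicted:.3f}")
```

If the model holds, the simulated and formula columns should agree closely, with the 50% mark first passed at the fifth toss.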

Michelle’s Answer

Michelle’s question was, “how many times must a promotional email be sent before it has been opened by at least 50% of recipients?”

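Through the use of probability theory and a simple ball tossing experiment, we’ve arrived at a “best case” answer: the email must be sent 5 times before any given customer has at least a 50% chance (about 56%) of having opened it. The specific reasons for describing the answer in this way are beyond the scope of this article.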

However, whether it be flipping coins, rolling dice, or tossing balls, probability theory makes certain assumptions about the nature of the generating devices (coins, dice, or balls). First, that as instruments of chance they are fair and unbiased. A coin that is assumed to be fair (heads and tails each have a 50% chance of appearing) but always comes up heads is neither fair nor unbiased. Second, that each event (a coin toss, a roll of the dice, or a ball toss) is independent of any other event. That is, no past event (flip of a coin, roll of a die) affects the outcome of any future event.

In the case of email, it’s safe to say that, as a general matter, recipients adhere to neither of the two assumptions described above. Not everyone has the same probability of opening an email, nor do all emails have the same probability of being opened.

These realities do not preclude us from using a ball tossing experiment to solve Michelle’s problem, but they do require us to characterize the solution as a “best case” scenario — a very reasonable place to start.
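One way to get a feel for the “best case” label is to compare a list where every customer opens at the same 15% rate with a list where the rates differ but still average 15%. The sketch below is illustrative only: the 5% and 25% split is a made-up example, not Retail Shop data, and the function name is hypothetical. It also still assumes that each send is independent of the others.

```python
def fraction_opened(open_rates, num_sends):
    """Expected share of customers who open at least one of num_sends emails.

    open_rates holds one per-email open probability for each customer.
    """
    return sum(1 - (1 - p) ** num_sends for p in open_rates) / len(open_rates)


uniform = [0.15] * 100             # every customer opens at 15%
mixed = [0.05] * 50 + [0.25] * 50  # hypothetical split, still 15% on average

for n in (1, 5, 10):
    print(f"{n} sends: uniform {fraction_opened(uniform, n):.3f}, "
          f"mixed {fraction_opened(mixed, n):.3f}")
```

In this made-up example the mixed list trails the uniform list after the first send, because the customers who rarely open emails are slow to be reached no matter how often the others open theirs.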

It is interesting to note that our model does not depend on the content of the email, only whether or not it was opened. So, if Michelle wants to be sure that her customers have a 50% chance of hearing about a particular offer, she must include that offer in each email sent. But, if she’s only interested in ensuring that 50% of her customers hear from her, she can vary the content as she likes.

The generalized form of the Email Confidence Equation is:

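N = log(1 – X) ÷ log(q)

Where X and q are as defined above, with both between 0 and 1. Inputting X and q allows this equation to be used to find N under a variety of circumstances; since emails can only be sent a whole number of times, round N up to the next whole number.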
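For readers who would rather not reach for a calculator, the equation can be wrapped in a few lines of Python. This is a sketch under the same assumptions as the ball tossing experiment; the function name and its return format are illustrative, not part of any library.

```python
import math


def sends_needed(target, open_rate):
    """Return (fractional N, whole number of sends, probability reached).

    target is X, the desired chance that a customer has opened at least
    one email; open_rate is p, the chance a single email is opened.
    """
    if not (0 < target < 1 and 0 < open_rate < 1):
        raise ValueError("target and open_rate must be strictly between 0 and 1")
    q = 1 - open_rate
    n_exact = math.log(1 - target) / math.log(q)
    n_whole = math.ceil(n_exact)   # emails can't be sent fractionally
    reached = 1 - q ** n_whole     # chance actually achieved after rounding up
    return n_exact, n_whole, reached


# Michelle's case: 15% Open Rate, 50% target.
n_exact, n_whole, reached = sends_needed(0.50, 0.15)
print(f"N = {n_exact:.3f}, so send {n_whole} times ({reached:.0%} chance)")
```

Changing the two arguments answers the same question for other Open Rates and other targets.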

Sources: Wikipedia, Geometric Distribution