"Okay, it's almost time to wrap up the first Insights puzzle. Let me summarize the important insights, and point out the mistaken notions that may still be preventing some people from seeing the solution.

The Pure Math Question

The main puzzle is an existence problem in pure math – does there exist a strategy that guarantees that you - the player - can guess the larger number as your final choice after seeing one of the two numbers, no matter what two numbers the host selects?

The answer is yes, and this fact has been rigorously proved here for the unbounded case. The barriers to seeing the solution are:

1) The pair of numbers A and B have already already chosen for you by the host using an unknown 'black box' process. There is no requirement that they come from a uniform distribution. In fact, a uniform distribution over the entire real line is impossible.

2) There are an infinite number of winning strategies, all of which involve sometimes switching your guess to the other hand based on the number seen. One way this can be done involves generating an arbitrary guide number G having a non-zero probability of being present in any interval on the real line. There are an infinite number of distributions that have this property, and they can be based on trigonometric ratios, exponentials, Gaussian curves or even by simple coin-tossing. The guide number G must also be generated independent of A and B.

3) The solution only proves that the probability of winning is more that 50 percent. The actual probability does not need to be calculated, and cannot be calculated without knowing the host's number generating algorithm. The exact (non-zero) probability of G lying between any given interval can however be calculated, based on the actual distribution the player uses.

4) For any qualifying distribution selected by the player, the probability of winning may be extremely close to 50% for a given pair of numbers, but it is always higher than 50% by a finite amount.

5) The infinity of the real line has the same cardinality as the infinity of any interval within it (aleph-one). So for any pair of numbers A and B on the real number line, we can find a pair of numbers A1 and B1 within any bounded interval such that the probability of a guide number lying between A and B under a given distribution on the real line is exactly equal to the probability of a guide number lying between A1 and B1 under a uniform distribution.

6) As a consequence of the above, there is not much of a difference in the practicality of achieving success between the unbounded and bounded cases. This surprising fact is explained in detail below.

What about winning in real life?

I think one of the reasons that some people have difficulty with the unbounded case is that the infinity of the real line seems so vast, that it seems impossible to win in a real-world game. This is certainly true in an adversarial game, where the host is actually trying to prevent the player from winning. But as a consequence of point 5 above, this is also true in the bounded case, as we saw in bonus problem 3. In the adversarial case, whether bounded or unbounded, the host can choose two numbers that are so close together, say 10^-30 apart in the bounded case, that the player's chances of winning are reduced to very close to 50%. Only cosmic beings playing at the rate of a googolplex games per second and having all eternity can realize the win in such a case. However, this does not undermine the fact that the "pure math" winning strategy does exist even in the adversarial case. "What's the use of this?" a hard-nosed realist may ask. The pure mathematician will say "It's the principle of the thing" or as G. H. Hardy put it, "what is useful above all is technique, and mathematical technique is taught mainly through pure mathematics."

The utility of this technique bears full flower in the non-adversarial case, where the host is indifferent, as for example in a game show setting. When the host's pair of numbers are selected at random from any reasonable distribution over the entire real number line without explicit intent to cause the player to fail, the win in the unbounded case is as easy as in the bounded case. I carried out a simulation of 100,000 cases where the host's numbers A and B were drawn from the entire real number line using the function y=ln(x/(1-x)) with x being a uniformly distributed random variable between 0 and 1, where the player used G=tan(π(x-0.5)) to generate G. The player still won 62.7% of the time. In fact, because the median of both these distributions are centered around zero, the "median" strategy suggested by the very first commenter, Sarah L. of using 0 as the guide number succeeds, as expected, 75% of the time.

Besides having a common median, another problem with the above distributions is that they have a sharp peak very close to zero (neither of the two generated numbers above 10 in my simulation!). To correct for these problems, I changed the generating distributions to y=(10^15)^ ln(x/(1-x)) with a 70% bias towards negative numbers for the generating distribution and G=10^(tan(π(x-0.5))) with a 70% bias towards positive numbers for G. Now I obtained some hugely negative and positive numbers (absolute values of 10^150 or more in both cases). The player still won 65% of the time. The "dynamic median" strategy which I had suggested as the best for a repeated game won 75% of the time even though the median fluctuated wildly initially.

**So it is just as easy to win in the unbounded case with a non-adversarial opponent as it in the bounded case. Why should this be? **Well, any distribution that has appreciable probability density over a wide range on the real line will often generate a pair of numbers that are far apart, and there is a good probability that a guide number generated by almost any distribution will be able to lie between them.

I offer the following corrections to hitherto uncorrected comments made by people, not as criticism but to allow them and others to overcome false impressions. As they say in reminder letters, please ignore these messages if you have already corrected these impressions ☺.

Walt Donovan said (and this was echoed by Max Rockbin, Mark B, Matthew Kosak and others):

"Once you bound the range of A and B, though, then the probability of finding A<g<b becomes="" becomes="" greater="" than="" zero="" and="" non-infinitesimal,="" and="" you="" can="" then,="" and="" only="" then,="" achieve="">50% chance of success."

This is not true as we saw above. What is **more important than bounded vs. unbounded is adversarial vs. non-adversarial.** The results are the same in the bounded and unbounded cases.

Walt also said:

As the problem was originally stated, A and B are unbounded real numbers. Thus, the probability that one can find a G between A and B is an infinitesimal epsilon, greater than zero but smaller than any positive number. And the chance of success is 50% + epsilon, which converts to 50% as a real number.

This is not correct. There is no such thing as an "infinitesimal epsilon, greater than zero but smaller than any positive number." For any A and B, the epsilon is an actual positive number, with an infinity of smaller numbers still present between it and 0. The chance of success always remains measurably greater than 50%.

@Pauli

Benford's law is interesting and would have been relevant if the numbers were restricted to numbers with a fixed number of digits. This is not the case in this problem.

@sagef

I noticed in your code that you had used RANDBETWEEN(1,10). Didn't you mean RANDBETWEEN(0,10)? The former will give you success only 74.69% instead of 75% ☺

@Giovanni

Thanks! I agree, it is definitely a most fascinating problem. I think we are easily deceived not so much by statistical concepts, but by infinities. Most of the mental difficulties here relate to this.

The three mantras that need to be chanted ad infinitum ☺ are:

1. Om – There are no uniform distributions on the real number line.

2. Om – The probability of a number falling between A and B is not (B-A)/infinity. That quantity is undefined.

3. Om – The actual probability of a number falling between A and B depends on the distribution and can certainly be finite."

P.M. Author