It seems that most readers swallowed my arguments in the previous post, so I think it is time to turn the screw a few more times, so to speak.
Again, we are confronted with a deck of 2 cards (we know that there are only black and red cards, no jokers). We take the top card and it is black. What is the probability that the remaining card is also black?
In the previous post we only considered two hypotheses (*):
H1 ... black cards only
H2 ... mixed deck
and we assumed p(b|H2) = 1/2.
This assumption bothers me now.
We don't know how the 2-card deck was put together, and it could be that somebody made sure we would always find a black card on top, even for the mixed deck. So I think we need to split H2 into two different hypotheses:
H2a ... manipulated mixed deck: p(b|H2a) = 1
H2b ... random mixed deck: p(b|H2b) = 1/2
Again, relying on the principle of indifference we use an uninformed prior p(H1) = p(H2a) = p(H2b) = 1/3 and
find p(H1|b) = 2/5 (I recommend that you actually plug in the numbers and do the calculation).
In other words, the probability that the remaining card is also black (2/5) is now significantly less than
the probability that it is red (3/5).
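For readers who would rather not plug in the numbers by hand, here is a minimal Python sketch of the three-hypothesis update. The function name `posterior` and the dictionary keys are illustrative choices, not part of the argument; exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' rule over a finite set of hypotheses: returns p(Hi|b) for each Hi."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # p(b), the sum over all hypotheses
    return {h: joint[h] / evidence for h in joint}

third = Fraction(1, 3)
priors = {"H1": third, "H2a": third, "H2b": third}
likelihoods = {"H1": Fraction(1), "H2a": Fraction(1), "H2b": Fraction(1, 2)}

post = posterior(priors, likelihoods)
print(post["H1"])                 # 2/5: remaining card is black
print(post["H2a"] + post["H2b"])  # 3/5: remaining card is red
```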
But it is clear that we have not yet achieved the goal of "absolutely no information" in selecting our prior; the problem is H2b.
We have to split it further to take into consideration cases in between outright
manipulation and random selection. So we have to consider N additional hypotheses
H2k ... k=1 ... N with p(b|H2k) = k/N
and let N go to infinity (o).
As you can easily check, the large number of hypotheses for mixed decks results in p(H1|b) -> 0 and therefore
we have to conclude that using an absolutely uninformed prior the probability that the
remaining card is black is (close to) zero.
This remarkable result is due to the fact that there is only one way the deck can consist of only black cards,
but there are many ways a mixed deck could have been put together. Absolutely no information means certainty about the 2nd card in this case, thanks to the power of the Bayesian method (x).
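The large-N claim can be checked with a short Python sketch. It assumes that the N hypotheses H2k replace H2a and H2b (k = N reproduces the manipulated deck), so the flat prior runs over H1 plus the N mixed-deck hypotheses; the function name `p_all_black` is made up for illustration. Under that assumption the sum works out to p(H1|b) = 2/(N+3), which the loop prints next to the computed value.

```python
from fractions import Fraction

def p_all_black(n):
    """p(H1|b) for a flat prior over H1 and n mixed-deck hypotheses with p(b|H2k) = k/n."""
    # Likelihoods p(b|H): first entry is H1, the rest are H2k for k = 1..n.
    likelihoods = [Fraction(1)] + [Fraction(k, n) for k in range(1, n + 1)]
    # The flat prior 1/(n+1) is the same for every hypothesis and cancels out.
    return likelihoods[0] / sum(likelihoods)

for n in (2, 10, 1000):
    print(n, p_all_black(n), Fraction(2, n + 3))
```

With N = 2 this reproduces the 2/5 found above, and the value shrinks towards zero as N grows.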
(*) Nobody complained, but it was a bit sloppy to exclude H0 = 'red only' from the prior (which should be chosen before the black card was seen). But p(b|H0) = 0 and, as you can check, including H0 would have made no difference.
(o) The limit N -> infinity would require the use of an improper prior, and I think it is sufficient for our purpose to consider the case of large but finite N.
(x) I think this toy model will be very helpful e.g. in cosmology and multiverse research.
The main purpose of this blog post is to illustrate my ignorance of Bayesian statistics (*) by discussing a very simple game with a 2-card deck. By the way, we only consider black and red cards; no jokers etc.
So we begin with the 2-card deck and draw one card - a black card. The question is 'what is the probability that the other card is also black?'.
Fortunately we only need to consider two different hypotheses:
H1 ... both cards are black.
H2 ... a mixed deck (one black, one red).
We update our probabilities using the famous formula:
p(Hi|b) = p(b|Hi) * p(Hi) / p(b)
where b indicates the 'black card event' and p(b) is shorthand for the sum p(b|H1)*p(H1) + p(b|H2)*p(H2).
Since we have no further information we use an uninformed prior which does not prefer one hypothesis over the other,
in other words:
p(H1) = p(H2)
and using p(b|H1) = 1 and p(b|H2) = 1/2 we get
p(H1|b) = 2/3 and p(H2|b) = 1/3. (I recommend that you actually plug in the numbers and do the calculation.)
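As a hedged illustration of that calculation, the following Python sketch applies the formula to the two hypotheses with the indifference prior p(H1) = p(H2) = 1/2. The variable names are illustrative only.

```python
from fractions import Fraction

# Likelihood of seeing a black card on top under each hypothesis (from the text).
p_b_given = {"H1": Fraction(1), "H2": Fraction(1, 2)}
# Indifference prior: p(H1) = p(H2) = 1/2.
prior = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}

# p(b) = p(b|H1)*p(H1) + p(b|H2)*p(H2)
p_b = sum(p_b_given[h] * prior[h] for h in prior)

# p(Hi|b) = p(b|Hi) * p(Hi) / p(b)
posterior = {h: p_b_given[h] * prior[h] / p_b for h in prior}
print(posterior["H1"], posterior["H2"])  # 2/3 1/3
```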
I admit that there is something weird about this result: we have two cards, we pick one, and after updating our probabilities we have to conclude that the other is more likely to be black than red, actually twice as likely.
The issue seems to be our prior, i.e. the choice of p(Hi).
Indeed, if we draw the 2-card deck randomly then the probabilities that it contains bb, br, rb and rr should be the same. We can throw out rr, which leaves us with a 2:1 majority of mixed decks, so we should use p(H1) = 1/3 and p(H2) = 2/3. As you can check, this resolves the weirdness: we get p(H1|b) = p(H2|b) = 1/2, and thus the probability that the 2nd card is also black would be the same as the probability for red.
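The card counting itself can be spelled out as a brute-force enumeration. This is only a sketch under the assumption stated above, namely that the four ordered decks (top card, bottom card) are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: each ordered deck (top, bottom) is equally likely.
decks = list(product("br", repeat=2))                  # bb, br, rb, rr
black_on_top = [d for d in decks if d[0] == "b"]       # bb, br
also_black = [d for d in black_on_top if d[1] == "b"]  # bb

# Fraction of the decks with a black top card whose second card is black too.
p_second_black = Fraction(len(also_black), len(black_on_top))
print(p_second_black)  # 1/2
```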
But can a Bayesian accept such card counting?
And it gets worse: if the 2-card deck was drawn from an initial deck of 2N cards (N black and N red), then the probabilities of bb and br are not exactly the same. The probability of the first b is indeed 1/2, but the probability of a 2nd b is lower, (N-1)/(2N-1), and therefore a mixed deck actually seems preferred, depending on the number N, which we don't know. Should we really conclude that red is slightly more likely after seeing black?
Do we have to sum over all possible N with equal weight to get a truly uninformed prior?
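To see how strongly the answer depends on N, here is a small Python sketch. It assumes the parent deck holds exactly N black and N red cards (which is what makes the probability of the first b equal to 1/2) and that the two cards are dealt without replacement; `posterior_all_black` is a made-up name for illustration.

```python
from fractions import Fraction

def posterior_all_black(n):
    """p(H1|b) when the 2-card deck is dealt from n black and n red cards."""
    half = Fraction(1, 2)
    p_bb = half * Fraction(n - 1, 2 * n - 1)       # prior weight of 'both black'
    p_mixed = 2 * half * Fraction(n, 2 * n - 1)    # prior weight of br plus rb
    # Likelihood of a black top card: 1 for bb, 1/2 for a mixed deck.
    # The rr weight is multiplied by p(b|rr) = 0 and drops out.
    return p_bb / (p_bb + half * p_mixed)

for n in (2, 26, 1000):
    print(n, posterior_all_black(n))  # equals (n-1)/(2n-1)
```

For a standard 52-card parent deck (N = 26) this gives 25/51, slightly below 1/2, and the gap closes as N grows.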
But I am sure a true Bayesian would reject all those card counting arguments, which smell quite a bit of frequentist reasoning. A truly uninformed prior means that we have no information to prefer one hypothesis over the other. There is a difference between not knowing anything about the 2-card deck and knowing that it was randomly selected. Therefore the symmetric choice p(H1) = p(H2) is the true uninformed prior which properly reflects our indifference, and we have to live with the asymmetry of probabilities for the 2nd card.
(*) A true Bayesian would calculate the probability that this post contains sarcasm taking into account the existence of this footnote.