in search of absolutely no information



It seems that most readers swallowed my arguments in the previous post, so I think it is time to turn the screw a few more times, so to speak.



Again, we are confronted with a deck of 2 cards (we know that there are only black and red cards, no jokers). We take the top card and it is black. What is the probability that the remaining card is also black?



In the previous post we only considered two hypotheses (*):

H1 ... black cards only

H2 ... mixed deck

and we assumed p(b|H2) = 1/2.

This assumption bothers me now.



We don't know how the 2-card deck was put together and it could be that somebody made sure we would always find a black card on top, even for the mixed deck. So I think we need to split H2 into two different hypotheses.

H2a ... manipulated mixed deck: p(b|H2a) = 1

H2b ... random mixed deck: p(b|H2b) = 1/2



Again, relying on the principle of indifference, we use an uninformed prior p(H1) = p(H2a) = p(H2b) = 1/3 and
find p(H1|b) = 2/5 (I recommend that you actually plug in the numbers and do the calculation).

In other words, the probability that the remaining card is also black (2/5) is now significantly less than
the probability that it is red (3/5).
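For readers who would rather let the machine plug in the numbers, here is a minimal sketch of the calculation. The function name `posterior` and the dictionary layout are my own choices for illustration; the priors and likelihoods are the ones stated above.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Apply Bayes' rule over a finite set of hypotheses.

    priors[h]      = p(h)
    likelihoods[h] = p(b | h), the probability that the top card is black
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # p(b)
    return {h: joint[h] / evidence for h in joint}


third = Fraction(1, 3)
priors = {"H1": third, "H2a": third, "H2b": third}
likelihoods = {"H1": Fraction(1), "H2a": Fraction(1), "H2b": Fraction(1, 2)}

post = posterior(priors, likelihoods)
print(post["H1"])                # 2/5, remaining card is black
print(post["H2a"] + post["H2b"])  # 3/5, remaining card is red
```

Exact fractions are used instead of floats so the result can be compared directly with the 2/5 and 3/5 quoted above.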



But it is clear that we have not yet achieved the goal of "absolutely no information" in selecting our prior; the problem is H2b.

We have to split it further to take into consideration cases in between outright
manipulation and random selection. So we have to consider N additional hypotheses

H2k ... k=1 ... N with p(b|H2k) = k/N

and let N go to infinity (o).

As you can easily check, the large number of hypotheses for mixed decks results in p(H1|b) -> 0, and therefore
we have to conclude that, using an absolutely uninformed prior, the probability that the
remaining card is black is (close to) zero.
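The check can be sketched in a few lines. Two assumptions here are mine and not spelled out above: the H2k replace both H2a and H2b (k = N is the manipulated deck, and N = 2 recovers the earlier three hypotheses), and every one of the N+1 hypotheses gets the same prior 1/(N+1). Under those assumptions the sum over k works out to p(H1|b) = 2/(N+3). The function name `p_black_only` is hypothetical.

```python
from fractions import Fraction


def p_black_only(n):
    """Posterior p(H1|b) with a uniform prior over H1 and H2k, k = 1..n.

    Assumes p(b|H1) = 1 and p(b|H2k) = k/n, as in the text.
    """
    prior = Fraction(1, n + 1)  # same prior for each of the n+1 hypotheses
    mixed = sum(prior * Fraction(k, n) for k in range(1, n + 1))
    evidence = prior * 1 + mixed  # p(b)
    return prior / evidence


for n in (2, 10, 1000):
    print(n, p_black_only(n), float(p_black_only(n)))
```

For N = 2 this gives 2/5 again, and for N = 1000 it is already down to 2/1003, roughly 0.002.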



This remarkable result is due to the fact that there is only one way the deck can consist of only black cards,
but there are many ways a mixed deck could have been put together. Absolutely no information means certainty about the 2nd card in this case, thanks to the power of the Bayesian method (x).





(*) Nobody complained, but it was a bit sloppy to exclude H0 = 'red only' from the prior (which should be chosen before the black card was seen). But p(b|H0) = 0 and, as you can check, including H0 would have made no difference.
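A short worked version of that check, written for a uniform prior over H0 and any set of hypotheses H_i (the symbol \pi for the common prior value is my notation, not from the post):

```latex
p(H_1 \mid b)
  = \frac{p(b \mid H_1)\,\pi}{p(b \mid H_0)\,\pi + \sum_i p(b \mid H_i)\,\pi}
  = \frac{p(b \mid H_1)}{0 + \sum_i p(b \mid H_i)}
```

The common prior \pi cancels and the H0 term contributes nothing, so the posterior is the same with or without H0. With only H1 and H2 and p(b|H2) = 1/2, both versions give 1/(1 + 1/2) = 2/3.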



(o) The limit N -> infinity would require the use of an improper prior, and I think it is sufficient for our purpose to consider the case of large but finite N.



(x) I think this toy model will be very helpful e.g. in cosmology and multiverse research.


2 comments:

Jonathan Livengood said...

Just a quick re-iteration of an earlier comment. You say that the assumption that P(b | H2) = 1/2 bothers you now. It looks to me like you are bothered because you are freely interchanging:

(1) At least one of the two cards in the deck is black.

And

(2) The top card in the deck is black.

If "b" represents the first case, then we have no information about where in the deck the card comes from. So, the manipulated deck hypothesis (where a manipulated deck always has a black card on top) isn't relevant.

If "b" represents the second case, then the probability of a red-black deck is zero. In that case, we only have two live hypotheses (over possible states of the deck): black-black and black-red. The "right" answer then is that the probability is 1/2 that the other card is black.

Again, it looks like you are making an interesting point about how the decks are generated ... trying to sneak in frequentist considerations (which is commendable). But in that respect, you still don't have a no information prior. Here is an example:

You need to know how the decks were constructed. Here is a possible way: each deck is constructed by taking two cards at random from a big vat of red and black cards. But the vat of red and black cards contains a million black cards for every red card.

So, the thing that we have no information about is the constitution of the vat. That is, the relative frequency of black to red cards in the vat. The number of hypotheses here is infinite, and it isn't at all clear what the prior for black-black should be. (Maybe zero, but then the posterior will also be zero.)

Anyway, the thing I've seen Bayesians do is ignore the generating process -- the constitution of the vat -- and put down a no-information prior over the possible states instead. (I think that leads to a lot of silliness.)

wolfgang said...

>>each deck is constructed by taking
>>two cards at random from a big vat
>>of red and black cards.

I mention that in the previous post: If you assume that the 2-card deck was randomly drawn from a 2N-card deck, then the probability for rb is slightly higher than for bb.

But I think this leads away from an uninformed prior to even more frequentist card counting.