tossing biased cyber coins



I decided to test this comment on the previous blog post with a numerical simulation, and the result was quite surprising (at least to me).

I wrote a simple C program which generates n-state HMMs randomly (*) and then runs each of them for N steps (generating a sequence HTHH...), followed by another N cyber coin tosses.

The question was whether a 'simple strategy' can predict the bias of the second sequence from the first. In the following, I performed the experiment for n = 10, 20, ...
with N = 100.

The 'simple strategy' predicts a bias towards Head if the number of Heads in the first sequence exceeds 60 = 0.6*N, and I registered the prediction as a success if the
number of Heads in the following sequence was indeed larger than 50 = 0.5*N.
The experiment was repeated 10000 times for each n and the graph below shows the success rate as a function of n.

[graph: success rate as a function of n]

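For concreteness, here is a minimal sketch in C of the kind of program described above. It is my own reconstruction, not the original code; in particular the seed, the initial state, the range of n, and the choice to compute the success rate only over trials in which a prediction was actually made are my guesses.

    /* Sketch of the experiment: build a random n-state HMM, run it for 2N
       steps, use the first N tosses to predict the bias of the second N.  */
    #include <stdio.h>
    #include <stdlib.h>

    #define N      100      /* length of each toss sequence          */
    #define TRIALS 10000    /* repetitions of the experiment per n   */

    /* uniform random number in [0,1) */
    static double uniform(void)
    {
        return rand() / (RAND_MAX + 1.0);
    }

    int main(void)
    {
        srand(12345);   /* arbitrary seed */

        for (int n = 10; n <= 200; n += 10) {
            double p_head[n];     /* P(Head) in each of the n states   */
            double trans[n][n];   /* transition probabilities p(i,j)   */
            int predictions = 0, successes = 0;

            for (int t = 0; t < TRIALS; t++) {
                /* build a fresh random HMM (see the footnote below) */
                for (int i = 0; i < n; i++) {
                    double row_sum = 0.0;
                    p_head[i] = uniform();
                    for (int j = 0; j < n; j++) {
                        trans[i][j] = uniform();
                        row_sum += trans[i][j];
                    }
                    for (int j = 0; j < n; j++)
                        trans[i][j] /= row_sum;      /* normalize the row */
                }

                /* run the HMM for 2N steps; count Heads in each half */
                int state = rand() % n;
                int heads1 = 0, heads2 = 0;
                for (int k = 0; k < 2 * N; k++) {
                    if (uniform() < p_head[state]) {
                        if (k < N) heads1++; else heads2++;
                    }
                    /* sample the next state from row 'state' */
                    double r = uniform(), acc = 0.0;
                    int next = n - 1;
                    for (int j = 0; j < n; j++) {
                        acc += trans[state][j];
                        if (r < acc) { next = j; break; }
                    }
                    state = next;
                }

                /* the 'simple strategy': predict a Head bias if heads1 > 0.6*N,
                   register a success if heads2 > 0.5*N                        */
                if (heads1 > 0.6 * N) {
                    predictions++;
                    if (heads2 > 0.5 * N) successes++;
                }
            }

            if (predictions > 0)
                printf("n = %3d   success rate = %.3f   (%d predictions)\n",
                       n, (double)successes / predictions, predictions);
        }
        return 0;
    }
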
Notice that the success rate is well above 50% for n < N, but even for n > N it seems that the first sequence can predict the bias of the second to some extent.
This is quite amazing, considering that for n > N the first sequence has not even encountered all the possible states of the HMM.

Obviously, as n increases (and N stays fixed) the success rate approaches 50%,
and if one does not know n this leads us back to the questions raised in the previous post. But the success rate for n < N, and even for n > N, is much higher than
I would have expected.

The next task for me is to double check the result (e.g. against the literature) and to do some more experiments.



------



(*) An HMM is really described by two tables. One stores the probabilities for H vs. T in each state s = 1, ..., n. The program initializes these probabilities with uniform random numbers between 0 and 1.

The other table stores the n x n transition probabilities. My program assigns these by first drawing uniformly distributed random numbers and then normalizing: the probabilities p(i,j) to get from state i to state j are rescaled (each row multiplied by a common factor) so that sum_j ( p(i,j) ) = 1. There is some ambiguity in this procedure
and I guess one could choose a different measure.
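
To make this ambiguity concrete, here is a small sketch (my own, with made-up function names) contrasting the procedure described above with one alternative measure: normalizing i.i.d. exponentials instead of i.i.d. uniforms, which makes each row of the transition table uniformly distributed on the probability simplex.

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    /* uniform random number in (0,1), strictly positive so log() is safe */
    static double uniform(void)
    {
        return (rand() + 1.0) / (RAND_MAX + 2.0);
    }

    /* as in the footnote: draw uniforms, then rescale the row so it sums to 1 */
    static void fill_row_uniform(double *row, int n)
    {
        double sum = 0.0;
        for (int j = 0; j < n; j++) { row[j] = uniform(); sum += row[j]; }
        for (int j = 0; j < n; j++)   row[j] /= sum;
    }

    /* one alternative measure: normalized exponentials, i.e. the row is
       uniformly distributed on the probability simplex                  */
    static void fill_row_simplex(double *row, int n)
    {
        double sum = 0.0;
        for (int j = 0; j < n; j++) { row[j] = -log(uniform()); sum += row[j]; }
        for (int j = 0; j < n; j++)   row[j] /= sum;
    }

    int main(void)
    {
        double row[5];
        srand(1);

        fill_row_uniform(row, 5);
        printf("normalized uniforms:     ");
        for (int j = 0; j < 5; j++) printf("%.3f ", row[j]);
        printf("\n");

        fill_row_simplex(row, 5);
        printf("normalized exponentials: ");
        for (int j = 0; j < 5; j++) printf("%.3f ", row[j]);
        printf("\n");
        return 0;
    }
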


homework



In order to illustrate some comments made on my previous post, I suggest
the following homework problem:



We are dealing with a Hidden Markov model with n internal states, which produces as output a sequence of Hs and Ts. We know that it is ergodic (*), we are told that it is biased such that either p(H) > 2p(T) or p(T) > 2p(H) (if it just runs long enough), and we observe a sequence HTHHT... of N outputs.



How big does N have to be, as a function of n, to determine with some confidence whether we are
dealing with one case or the other?

If we do not know n (**), how large would N have to be?

And what does an uninformed prior look like in this case? Will it have to be an improper prior as long as we do not know n, or should we somehow weight with (the inverse of) the complexity of the HMMs?



(*) Every internal state can eventually be reached and every internal state
will eventually be left, but the transition probabilities might be small.



(**) This makes it somewhat similar to the case of the previous blog post, but of course, in the previous post we do not know anything about the underlying model, other than that there is a strong bias.


biased



A simple coin toss, and all we know is that the coin is strongly biased.
What is the probability of getting Head on the first throw?

Since we have no further information to favor Head or Tail, we have to assume
p(Head) = p(Tail) and since p(Head) + p(Tail) = 1 we
conclude that p(Head) = 1/2. Easy as pie.

But this is kind of wrong, because the only information we
actually do have about the coin is that p(Head) is certainly not 1/2:
the coin is biased.
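
One way to state the tension explicitly (my notation, not from the original post): if the unknown bias p = p(Head) is assigned any prior density pi(p) that is symmetric under p -> 1-p, then the probability of Head at the first throw is int_0^1 p pi(p) dp = 1/2, even though pi may assign no weight at all to p = 1/2 itself.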


esse est percipi, part 4



As I tried to show in the previous blog posts, the interpretation of quantum physics is to a large extent a debate
on how to understand "psychophysical parallelism" and how to assign (our) conscious experience to a wave function and/or its components.
This is one reason why most 'real physicists' usually stay away from this topic.

But if you want to read even more about it, I recommend the following as starting points:



H. D. Zeh: "Epistemological consequences of quantum nonlocality (entanglement) are discussed under the assumption
of a universally valid Schroedinger equation in the absence of hidden variables.
This leads inevitably to a many-minds interpretation."



The Stanford Encyclopedia of Philosophy (plato) about many minds: "... one might well conclude that a single-mind theory, where each observer has one mind that evolves randomly given the evolution of the standard quantum mechanical state, would be preferable."



Dowker & Kent (also here): "We examine critically Gell-Mann and Hartle's interpretation of the formalism,
and in particular their discussions of communication, prediction and retrodiction,
and conclude that their explanation of the apparent persistence of quasiclassicality
relies on assumptions about an as yet unknown theory of experience."



Bernard d'Espagnat: "The central claim, in this paper, is that the Schroedinger cat – or Wigner’s friend –
paradox cannot be really solved without going deeply into a most basic question,
namely: are we able to describe things as they really are or should we rest content
with describing our experience?"



Last but not least, the webpage of Peter Hankins is not a bad reference for the
traditional discussion of conscious entities and the mind-body problem.