Recently I stumbled upon this math problem:

The positive integer N has a finite value but is unknown to us.

We are looking for a function f(X) which minimizes the sum

E = |f(0) - N| + |f(1) - N| + |f(2) - N| + ... + |f(N-1) - N|

for almost all N.

Notice that we do not know the value of N, so f(X) cannot depend on it, which eliminates the trivial solution f(X) = N among others.

In order to illustrate what I am looking for, consider as first example

1] f(X) = 0, which results in E = N

^{2}.

However, there is a better solution

2] f(X) = X, which results in E = (N+1)*N/2, which is less for almost all N (except N=1).

Unfortunately, I do not know the best solution f(X) and this is where you are invited to leave a comment to help me out.

But I do have strong evidence (i.e. a numerical test up to large values of N) that

3] f(X) = X + sqrt(X) is an even better solution.

So what is the motivation for this problem and why the title for this blog post?

Consider a process or phenomenon which exists already for X years and we try to estimate the total lifetime N without any further information. So our estimate can only depend on X and we try to find a function which minimizes the total estimation error E as described above; every year we make an estimate f(X) which is wrong by the amount |f(X) - N| and we try to minimize the sum of those errors. In some sense this is a variation of the infamous 'doomsday argument'.

It is now obvious why 1] is a bad solution and 2] is much better. Btw the function f(X) = 2*X would give the same total error E minus a small constant, so whether we assume that the process ends immediately or estimate that it will last twice as long (as it already did) makes no significant difference.

Btw the solution 3] creates a paradox: The best estimate for the life expectancy seems to depend on what units one uses: We get a different result if we calculate with days rather than years.

added later: If I assume f(X) = c*X, perhaps motivated by that paradox, then I can show that c=sqrt(2) minimizes E for large enough N. However, the function

4] f(X) = sqrt(2)*X + sqrt(X) seems to be an even better candidate, but I do not know how to determine a,b to minimize E for f(X) = a*X + b*sqrt(X) or determine the general solution for f(X).

added later: There are better ways to illustrate the problem; e.g. a box contains an unknown number N of candies. You take out one after another and at each step X you have to guess N. At the end there is a penalty proportional to the sum of your wrong guesses.

Perhaps a "deeper" example considers a Turing machine, which performed already X steps and we try to guess after how many steps N it halts.

The reason these examples are "better" is that there is no change of physical units (e.g. from years to days) that would affect N.

## 6 comments:

I'm sure I'm going to look ridiculously foolish by asking this question, but I don't understand how when you let f(X)=X, then E= the sum of the integers from 1 through N?

Well, when f(X) = X then

f(0) = 0 , so the first term is

|0 - N| = N,

f(1) = 1 so the next term is |1-N| = N-1

and so on and so forth.

So the sum E is

N + N-1 + N-2 + ... + 1

and if I have this right this sum is

(N+1)*N/2

How do you assure yourself that the time step f(1)-f(0) is less than N? If you know how to choose a time step that is less than N, can't you then make N arbitrarily large by choosing a small enough time step? If you can make N arbitrarily large and can prove that E is minimized by f(X)=sqrt(2)*X for sufficiently large N, then why not use a time step that makes N large and use f(X)=sqrt(2)*X to minimize E?

Lee,

I do not really understand your questions, but let me explain a simple example:

Some guy starts a small business. He does not really know anything about it but he tries to estimate how many years N it lasts from how many years X it already exists.

Lets say he starts in 1990 and his company lasts until 2000 (but he does not know that initially).

He is a pessimist and uses f(X) = X.

So initially, in 1990, he estimates that his company will go bankrupt quickly, f(0) = 0.

The error is |0 - 10| = 10.

The next year, 1991, he assumes again immediate bankruptcy f(1) = 1, so the error is |1 - 10| = 9

and so on and so forth.

The total error is 10 + 9 + .. + 1 = 11*5 = 55

The question is, could he have estimated N in a better way?

It turns out, yes, for almost all N using the function f(X) = sqrt(2)*X gives a lower total error.

I suspect that sqrt(2)*X + sqrt(X) would give even better estimates ...

Did this help?

I admit that the whole problem is a bit weird, because he tries to estimate N without knowing anything except X.

You could easily turn this into an even "deeper" problem by asking "what is the expected running time of a Turing machine (which I do not understand) if it made already X steps?"

But this is the beauty of this problem: Different functions do give different total errors and so there are better ways to estimate N than others.

Also, this problem depends on the behavior for N -> infty, but the "main contribution" comes from small values of X, so the usual methods for series do not work here ...

Lee,

I think I understand now what your question(s) is about. Your f(1) - f(0) threw me off 8-) initially.

Yes, you can make N large by using a different time scale (e.g. seconds instead of years).

This would not be a problem if f(X) is a linear function, then your estimate does not depend on the time scale.

It would be nice if linearity of f(X) would be a result of the

minimization of the error E, but I am pretty sure that it is not.

But one can impose it as a constraint, i.e. as additional information.

Otherwise, if I do not know that 1 hour = 60 minutes my best estimate may differ.

Post a Comment