Re: Hempel's paradox redux

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Thu Sep 15 2005 - 14:02:10 MDT


Ben Goertzel wrote:
> Hey Eli,
>>> http://www.goertzel.org/new_essays/hempel.htm
>> I find your lack of faith disturbing.
>
> Well, my wife and I found the flaw in a brief conversation before we went to
> sleep last night but I was really tired and I didn't feel like dealing with
> the computer anymore ;)
>
> As she pointed out, the situation with states 2 and 4 in my little story
> about the midget is a lot like the situation where you have two coins, one
> with two heads and one regular coin with a head and a tail. If you are told
> that one has been tossed and the result was a head, then the odds are 2:1
> that the coin in question was the double-headed coin. [The heads are
> W's, the tail is a B]
>
> Duuuuhhh...

That was one mistake, yes.
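
The 2:1 odds in the coin analogy can be checked by direct calculation. This is a minimal sketch, assuming each coin is equally likely to be the one tossed; the dictionary names are illustrative and not from the thread.

```python
from fractions import Fraction

# Assumption: each coin is equally likely to be the one that was tossed.
prior = {"double_headed": Fraction(1, 2), "regular": Fraction(1, 2)}
# Probability that a toss of each coin comes up heads.
p_head = {"double_headed": Fraction(1), "regular": Fraction(1, 2)}

# Bayes' rule: weight each coin by prior * likelihood, then renormalize.
joint = {coin: prior[coin] * p_head[coin] for coin in prior}
total = sum(joint.values())
posterior = {coin: joint[coin] / total for coin in joint}

odds = posterior["double_headed"] / posterior["regular"]
print(posterior["double_headed"], posterior["regular"], odds)
```

The posterior comes out as 2/3 for the double-headed coin against 1/3 for the regular one, which is the 2:1 ratio quoted above.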

Marcello caught the first one, but his algebra may have been hard for
some to follow, so I now explain it intuitively.

1)

A mathematician tells you that he has two children. You ask, "Is one of
your children a boy?" and the mathematician answers "Yes." What is the
probability that the other child is a girl? Two-thirds. Why? Because
the possible birth orders are BB, BG, GB, and GG, each equally probable.
For the first three the mathematician certainly answers "Yes" to your
question, and for the fourth the mathematician certainly answers "No".
Since the likelihoods are equal for the first three possibilities,
the mathematician's answer doesn't change your prior belief that the
mathematician is twice as likely to have a girl and a boy as to have two
boys.

But suppose that you randomly sample a child, and the child is a boy.
It is now equally likely that the other child is a boy, or that the
other child is a girl, because BB is twice as likely to produce a boy
in a random sample as BG or GB.
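
Both versions of the two-children problem can be checked by enumerating the four birth orders. This is a minimal sketch; the helper name `other_is_girl` is illustrative, and the only thing that differs between the two cases is the likelihood of the evidence under each birth order.

```python
from fractions import Fraction
from itertools import product

# The four birth orders BB, BG, GB, GG, each equally probable.
families = list(product("BG", repeat=2))
prior = Fraction(1, len(families))

def other_is_girl(likelihood):
    """P(family contains a girl | evidence), given P(evidence | family)."""
    weights = {f: prior * likelihood(f) for f in families}
    total = sum(weights.values())
    with_girl = sum(w for f, w in weights.items() if "G" in f)
    return with_girl / total

# Evidence 1: the mathematician answers "Yes" to "Is one of your children a boy?"
asked = other_is_girl(lambda f: Fraction(1) if "B" in f else Fraction(0))

# Evidence 2: a child sampled uniformly at random turns out to be a boy.
sampled = other_is_girl(lambda f: Fraction(f.count("B"), 2))

print(asked, sampled)
```

The first case gives 2/3 and the second gives 1/2, because under random sampling BB produces a boy with probability 1 while BG and GB each produce one with probability 1/2.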

Still another way of looking at it is that you have incorrectly
decomposed your uncertainty into atomic possible worlds. Atomic
possible worlds must slice reality as finely as possible, so an atomic
possible world includes, as background state, the values of 'random'
variables. "One white raven and one white nonraven" is not an atomic
possible world, but a set of possible worlds, because it fails to
specify the value of an important random variable. An *atomic* possible
world, relative to your problem space, is "One white raven and one white
nonraven exist, and 'random' sampling will produce a white raven". By
the definition of random sampling, this possibility contains half the
probability mass within the set of possible worlds "One white raven and
one white nonraven." This atomic possible world is ruled out if random
sampling of a white object produces a nonraven, and the probability mass
collectively within the set "One white raven and one white nonraven"
decreases accordingly.
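
The bookkeeping with atomic possible worlds can be sketched as follows. The second world set ("two white nonravens") and the equal priors are assumptions chosen for illustration, not states taken from the essay under discussion.

```python
from fractions import Fraction

# Hypothetical coarse world sets, each listing its white objects.
world_sets = {
    "one white raven and one white nonraven": ["raven", "nonraven"],
    "two white nonravens": ["nonraven", "nonraven"],
}
prior = Fraction(1, len(world_sets))  # assumed: equal prior on each set

# Atomic worlds: (world set, index of the object that sampling will produce,
# kind of that object). Random sampling splits each set's mass evenly.
atoms = {}
for name, objects in world_sets.items():
    for index, kind in enumerate(objects):
        atoms[(name, index, kind)] = prior / len(objects)

# Observing a nonraven rules out every atom in which a raven is sampled.
surviving = {atom: mass for atom, mass in atoms.items() if atom[2] == "nonraven"}
total = sum(surviving.values())

# Renormalize and add the surviving atoms back up into their world sets.
posterior = {}
for (name, _, _), mass in surviving.items():
    posterior[name] = posterior.get(name, Fraction(0)) + mass / total

print(posterior)
```

Under these assumptions the set "one white raven and one white nonraven" drops from 1/2 to 1/3, since half of its atoms were eliminated while the other set kept all of its atoms.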

2)

> "If the midget gives you a nonblack entity when asked, that means the bag must be in states 1, 2, 3 or 4."

The bag must be in states 2, 3, 4, 6, or 7.

3)

> "So if the ratio is unchanged via the observation of a white nonraven, i.e. if
>
> P_prior(black|raven) / P_prior(white|raven) =
>
> P(black|raven & I have chosen a random white entity and found it to be a nonraven) /
>
> P(white|raven & I have chosen a random white entity and found it to be a nonraven)
>
> (as we see from the fact that b_2/b_4 = a_2/a_4)"

?? This doesn't follow at all. Previously, you would have to calculate
your prior probability that a randomly sampled raven would be black or
white by taking into account the probability mass in every one of your
(sets of) possible worlds 1 through 7. You cannot calculate your prior
probability using worlds 2 and 4 alone.

4)

> "So there is no Hempel paradox in this case. Here we are able to observe evidence increasing our estimate of
>
> P(non-raven|non-black)
>
> without affecting our estimate of
>
> P(black|raven)"

Hempel's Paradox arises because the statements "All ravens are black"
and "All non-black objects are not ravens" are logically equivalent,
under the standard mathematical interpretation where "All ravens are
black" is vacuously true if no ravens exist.

However, it is not the case that p(black|raven) is always locked to
p(~raven|~black). Although you failed to construct such an example, it
can be done. For example, suppose there are ten objects: a black raven,
a white raven, and eight nonblack nonravens. Here p(black|raven) = 1/2
and p(~raven|~black) = 8/9. Now suppose a different set of ten objects:
two black ravens, two white ravens, and six nonblack nonravens.
p(black|raven) = 1/2, but p(~raven|~black) = 6/8 = 3/4. Suppose I am
not sure which of these two (sets of) possible worlds I am in, and I
randomly sample a nonblack object and it is a nonraven. This is
evidence that I occupy the first world, so (after renormalization)
probability mass shifts from the second (set of) possible worlds to the
first.

So sampling a random nonblack object which turns out to be a nonraven
increases p(~raven|~black) but leaves p(black|raven) constant, because,
*unlike* the statements "All ravens are black" and "All non-black
objects are not ravens", the two conditional probabilities are not
logically equivalent, nor do they always change in lockstep.
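
The ten-object example can be worked through numerically. This is a minimal sketch, assuming equal prior probability on the two worlds (the text above does not fix a prior); the function names are illustrative.

```python
from fractions import Fraction

# The two sets of ten objects described above.
worlds = {
    "first": {"black_ravens": 1, "nonblack_ravens": 1, "nonblack_nonravens": 8},
    "second": {"black_ravens": 2, "nonblack_ravens": 2, "nonblack_nonravens": 6},
}
prior = {"first": Fraction(1, 2), "second": Fraction(1, 2)}  # assumed

def p_black_given_raven(w):
    return Fraction(w["black_ravens"], w["black_ravens"] + w["nonblack_ravens"])

def p_nonraven_given_nonblack(w):
    nonblack = w["nonblack_ravens"] + w["nonblack_nonravens"]
    return Fraction(w["nonblack_nonravens"], nonblack)

def expected(prob, weights):
    """Average a within-world probability over uncertainty about the world."""
    return sum(weights[name] * prob(worlds[name]) for name in worlds)

# Update on: a randomly sampled nonblack object turned out to be a nonraven.
joint = {n: prior[n] * p_nonraven_given_nonblack(worlds[n]) for n in worlds}
total = sum(joint.values())
posterior = {n: joint[n] / total for n in joint}

before = expected(p_nonraven_given_nonblack, prior)
after = expected(p_nonraven_given_nonblack, posterior)
print(posterior, before, after)
print(expected(p_black_given_raven, prior), expected(p_black_given_raven, posterior))
```

With equal priors the first world moves from 1/2 to 32/59, the averaged p(~raven|~black) rises from 59/72 to 1753/2124, and p(black|raven) stays at 1/2 because it equals 1/2 in both worlds.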

At the start of your problem, you left the bounds of Hempel's
confirmation paradox entirely.

5)

> "To see why we have increased our estimate of P(non-raven|non-black), we need to look at the three other possible states of the universe, ignored above:"

?? Why did you ignore them?

6)

> "P(black|raven)
>
> is unchanged via the process of observing a random white entity and finding it to be a raven."

I think you mean "non-raven", but anyway...

In several of your possible cases, there are no nonblack objects. In
such cases the statement "All non-black objects are not ravens" is
vacuously true. But the conditional probability p(~raven|~black) is
undefined in standard Bayes, involving a literal division by zero -
after which anything can happen, as in the classic proof that 1=2. So
if you are trying to see what happens to p(~raven|~black), you had
better specify that at least one nonblack object exists.

Actually, I also need to specify that at least one nonblack object is
known to exist in every possible world; along with the requirement that,
in at least one possible world containing nonblack ravens, the ratio of
these nonblack ravens to all other nonblack objects does not approach
zero; and the requirement that the proposition "All ravens are black"
not initially have prior probability equal to zero; in order for my
general conclusion to hold that randomly sampling a nonblack object and
finding it to be a nonraven ALWAYS increases the probability assigned to
the proposition "All ravens are black." (Assuming I haven't missed any
other necessary assumptions.)
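
The general conclusion can be illustrated on a small case. This is a sketch under the stated assumptions: the three worlds and their equal priors are hypothetical, every world contains at least one nonblack object, and "All ravens are black" starts with nonzero prior probability.

```python
from fractions import Fraction

# Hypothetical worlds; each contains four nonblack objects.
worlds = [
    {"nonblack_ravens": 0, "nonblack_nonravens": 4, "prior": Fraction(1, 3)},
    {"nonblack_ravens": 1, "nonblack_nonravens": 3, "prior": Fraction(1, 3)},
    {"nonblack_ravens": 2, "nonblack_nonravens": 2, "prior": Fraction(1, 3)},
]

def all_ravens_black(w):
    # True exactly when the world contains no nonblack raven.
    return w["nonblack_ravens"] == 0

def likelihood_nonraven(w):
    """P(a randomly sampled nonblack object is a nonraven | world)."""
    nonblack = w["nonblack_ravens"] + w["nonblack_nonravens"]
    return Fraction(w["nonblack_nonravens"], nonblack)

prior_h = sum(w["prior"] for w in worlds if all_ravens_black(w))

# Bayes' rule after sampling a nonblack object and finding a nonraven.
total = sum(w["prior"] * likelihood_nonraven(w) for w in worlds)
posterior_h = sum(
    w["prior"] * likelihood_nonraven(w) for w in worlds if all_ravens_black(w)
) / total

print(prior_h, posterior_h)
```

Here the probability of "All ravens are black" rises from 1/3 to 4/9. The worlds where it holds assign likelihood 1 to the observation, while every world containing a nonblack raven assigns strictly less.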

If we allow for some possible worlds to contain no nonblack objects,
then the procedure of randomly sampling a nonblack object has a third
possibility besides "Raven" and "Non-Raven" which is "Empty". We then
have to take into account the likelihoods assigned to this third
possibility, which changes everything. For example, states 1 and 5
assign probability 1 to the result "Empty". If we assigned most of our
prior probability mass to state 1 or state 5, then the sample coming up
with a nonblack object at all, instead of "Empty", could drastically
decrease the total probability assigned to p(black|raven).
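
The effect of the "Empty" outcome can be sketched with two hypothetical worlds. These are not the numbered states from the essay; the counts and the 9:1 prior are assumptions chosen to make the effect visible.

```python
from fractions import Fraction

# Hypothetical worlds, described by their nonblack objects only.
worlds = {
    "no nonblack objects": {
        "nonblack_ravens": 0, "nonblack_nonravens": 0, "prior": Fraction(9, 10)},
    "one of each": {
        "nonblack_ravens": 1, "nonblack_nonravens": 1, "prior": Fraction(1, 10)},
}

def likelihood(world, outcome):
    """P(outcome | world) for outcome in {"empty", "raven", "nonraven"}."""
    nonblack = world["nonblack_ravens"] + world["nonblack_nonravens"]
    if nonblack == 0:
        # Nothing to sample, so the result is certainly "empty".
        return Fraction(1 if outcome == "empty" else 0)
    if outcome == "empty":
        return Fraction(0)
    key = "nonblack_ravens" if outcome == "raven" else "nonblack_nonravens"
    return Fraction(world[key], nonblack)

def p_all_ravens_black(outcome=None):
    """P("All ravens are black"), optionally conditioned on a sampling outcome."""
    weights = {
        name: w["prior"] * (likelihood(w, outcome) if outcome else 1)
        for name, w in worlds.items()
    }
    total = sum(weights.values())
    holds = sum(m for name, m in weights.items()
                if worlds[name]["nonblack_ravens"] == 0)
    return holds / total

before = p_all_ravens_black()
after = p_all_ravens_black("nonraven")
print(before, after)
```

In this setup the probability of "All ravens are black" falls from 9/10 to 0 on observing a nonblack nonraven, because the only world in which the statement holds predicted "Empty" with certainty.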

7)

> "There are seven distinguishable states for the interior of the bag, each of which one may assign a certain prior probability."

But which prior probability?

As seen above, the prior probabilities assigned to states 1-7 may
drastically change the effect of sampling a nonblack object and finding
it to be a nonraven.

One of the major benefits of training in probability theory a la Jaynes
is that you learn to stop sweeping critically necessary assumptions
under the carpet of "no information". If you have no information, sir,
do please tell us exactly what information you do not have.

> I wouldn't ever say something like "probability theory is wrong" -- it's a
> branch of math and is correct assuming its axiom systems, just like any
> other branch of math.... It's even a *very useful* branch of math which is
> why Novamente is substantially based upon it....

Ben Goertzel wrote on September 9th, 2005:

> If probability theory as standardly deployed states that an observation
> of a non-black non-raven provides a NON-ZERO amount of evidence toward
> the hypothesis that all ravens are black, then this shows there is
> something wrong with probability theory as standardly deployed.
>
> Of course, an approach that yields small errors may still be valuable
> for practical AI purposes.
>
> However, what frustrates me about the quote you cite, and your attitude,
> is that you seem to be denying that probability theory as standardly
> deployed is conceptually and logically erroneous in this case -- albeit
> the magnitude of its error is generally small.

I suppose the "as standardly deployed" leaves you an out. So, if you
like, I amend my request: Ben, stop dissing Bayesian probability theory
"as standardly deployed".

-- 
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence

