Re: Volitional Morality and Action Judgement

From: Eliezer Yudkowsky (sentience@pobox.com)
Date: Sun May 23 2004 - 22:11:50 MDT


Ben Goertzel wrote:
>
> We've had this discussion before, but I can't help pointing out once
> more: We do NOT know enough about self-modifying AI systems to estimate
> accurately that there's a "zero chance of accidental success" in
> building an FAI. Do you have a new proof of this that you'd like to
> share? Or just the old hand-wavy attempts at arguments? ;-)

Ben? Put yourself in my shoes for a moment and ask yourself the question:
"How do I prove to a medieval alchemist that there is no way to concoct
an immortality serum by mixing random chemicals together?" Bear in mind
that if the medieval alchemist swallows something lethal, you die and so
does the rest of the human species. Bear in mind also that the medieval
alchemist takes refuge in his ignorance for it lets him continue hoping,
the same way people take refuge in the ignorance that lets them believe in
creationism. And it is so easy, if you are a medieval alchemist, to say
that you don't know; for indeed you don't. It sounds so humble and
scientific, to admit one's ignorance, on those occasions when it is
politically convenient to be ignorant. The more ignorance the better, if
it lets you avoid changing your behaviors.

Our medieval alchemist died in the moment when he decided to be proud of
not knowing.

I know of many specific problems whose solutions are necessary but not
sufficient for FAI, all of which, supposedly, will be solved by accident.
There will be an
intrinsic (if humanly uncomputable) minimum complexity, and the chance of
this complexity materializing from nowhere will be 2^-Kolmogorov(FAI),
probably far less likely than your lottery ticket winning (but no one can
PROVE your numbers won't win; indeed, no one can PROVE the Sun will rise
tomorrow, and therefore all beliefs have equivalent credibility and you can
believe whatever you want). I can take proposals for AI morality and shoot
them down one by one - including CFAI, by the way. And every time, the
six-year-old with the chemistry set says, "Oh, but I would never do *that*,
I'm not stupid, I'd do this instead," and proposes another lethal thing. I
can point out how the proposed system appears to work just fine, just as
planned, providing plenty of positive reinforcement and wonderful
discoveries to the happy programmers, so long as the AI can't rewrite its
own code or is subject to threat of enforcement by the programmers. Which
system will then tile the solar system with tiny smiley faces as soon as
the AI begins operating in a new domain. And even after people acknowledge
the silent kill - oh, pardon me, the *possibility* of the silent kill -
they will go on talking about "working it out by experiment", which also
sounds very scientific except that the negative experimental result is a
thud followed by six billion other thuds.

No, I suppose it's not a "zero" chance, but there are laws of rational
reasoning that apply to such problems. When you specify a complex outcome,
you measure the amount of complex information in that outcome, and that
tells you the improbability of the specific complex miracle needed to
produce that outcome by accident, unless someone else comes up with a short
computer program showing that the Kolmogorov complexity is less than it
appears. And for basic information-theoretical reasons, one may not
*expect* or *hope* for a short computer program in an arbitrary case. But
I cannot PROVE that there is no Turing machine with fewer than 64 states
that outputs a Friendly AI design in Python with comments in Quenya poetry,
save by exploring the complete space of Turing machines with fewer than 64
states.
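
The arithmetic behind "measure the information, read off the improbability"
can be sketched as follows. This is only an illustration under stated
assumptions: Kolmogorov complexity is uncomputable, so the bit counts below
are hypothetical placeholders rather than estimates for FAI, and the
6-of-49 lottery format is assumed for the sake of a concrete comparison.

```python
from fractions import Fraction
from math import comb, log2


def chance_of_accident(description_bits: int) -> Fraction:
    """Chance that fair coin flips reproduce one specific bit string
    of the given length: 2^-description_bits."""
    return Fraction(1, 2 ** description_bits)


def bits_equivalent(p: Fraction) -> float:
    """Improbability of p expressed in bits, i.e. -log2(p).
    Computed from the integer parts so that tiny values do not underflow."""
    return log2(p.denominator) - log2(p.numerator)


# Assumed lottery format: pick 6 numbers out of 49, one winning combination.
LOTTERY_ODDS = Fraction(1, comb(49, 6))

if __name__ == "__main__":
    print("lottery win is worth about", round(bits_equivalent(LOTTERY_ODDS), 2), "bits")
    for bits in (10, 24, 1000):  # hypothetical description lengths
        p = chance_of_accident(bits)
        verdict = "less" if p < LOTTERY_ODDS else "more"
        print(bits, "bits:", verdict, "likely than the lottery ticket")
```

Under these assumptions, any outcome that needs more than about 24 bits of
specific information is already less likely to arise by accident than the
winning ticket, and each further bit halves the chance again.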

I am reminded of the story (presented as true; I have not verified it) that
when the Australian government decided to start a national lottery, the
reporters interviewed some guy on the street and asked him if he was going
to buy lottery tickets. "Yes," he said. "What do you think are your
chances of winning?" the reporters asked. "Fifty-fifty," he said, "either
I win or I don't." The moral being that you cannot apply the Principle of
Indifference to surface outcomes; you need to apply it to exchangeable
instances of the set of lottery ball sequences. Being completely ignorant
of something, as I am completely ignorant of the results of turning a
six-year-old loose in a biochemistry lab, or as I am completely ignorant of
the results of throwing together an AI at random, does not mean that the
chance of any surface description holding true is fifty-fifty. There is a
mathematics of complete ignorance that does not allow you to hope for
specific complex outcomes. But of course people *do* take refuge in their
complete ignorance of the lottery numbers, and say, "But you can't prove my
numbers won't win."
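
The difference between the two ways of applying the Principle of
Indifference can be made concrete with a toy lottery. The sketch below
assumes a hypothetical 3-of-10 draw (small enough to enumerate) and an
arbitrary ticket; neither figure comes from the story above.

```python
from fractions import Fraction
from itertools import combinations


def win_probability(pool_size: int, picks: int, ticket: frozenset) -> Fraction:
    """Apply indifference to the exchangeable instances: every possible
    draw of `picks` balls from `pool_size` is treated as equally likely."""
    draws = list(combinations(range(1, pool_size + 1), picks))
    wins = sum(1 for draw in draws if frozenset(draw) == ticket)
    return Fraction(wins, len(draws))


# The man on the street applies indifference to the two surface
# descriptions "I win" and "I don't win" instead.
SURFACE_GUESS = Fraction(1, 2)

if __name__ == "__main__":
    ticket = frozenset({2, 5, 7})  # hypothetical ticket
    print("indifference over surface outcomes:", SURFACE_GUESS)
    print("indifference over ball sequences:  ", win_probability(10, 3, ticket))
```

Counting over the draws gives one winning combination out of 120 for this
toy case, and the gap from fifty-fifty only widens as the pool grows.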

-- 
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence

