From: Tim Freeman (firstname.lastname@example.org)
Date: Sat Oct 24 2009 - 09:51:12 MDT
On Wed, Oct 21, 2009 at 9:59 AM, Tim Freeman <email@example.com> wrote:
> The PDF file at http://www.fungible.com/respect/talk-mar-2009.pdf is
> <1MB. A real definition would include training data that would
> probably be a few GB's. Start reading at
I keep quoting that so it's clear which proposed AI we're talking
about.
By the way, I've been interpreting "an FAI solution" to mean a
specification of what an FAI would do. This doesn't include a
practical implementation, so an FAI solution may not include an AI
solution. A practical AI solution might be huge, and I don't know
what it would take. My apologies to everyone if we've been talking
about different problems.
From: Thomas McCabe <firstname.lastname@example.org>
>Suppose I'm in a burning building, and my
>legs are crushed beneath a one-ton iron bar. My desires are, by the
>standards of such things, reasonably simple: I do not want to die a
>horrible, fiery death. My actions are to attempt to lift the bar off
>of me, but I cannot lift something which is more than ten times my
>weight, and so, without intervention, I *will* die a horrible, fiery
>death. If an FAI tried to mimic my actions, it would exert an upward
>force on the bar which is grossly insufficient to actually get it off
>of me, and I would *still* die a horrible, fiery death.
It's good to have concrete examples. Thanks for posting that.
However, I get a different result when I work through the scenario.
The proposed AI makes one model of what all humans want and believe,
so the fact that everyone else has consistently taken action to avoid
horrible fiery death when possible makes it much easier for the AI to
believe that you also want to avoid horrible fiery death. Otherwise
the AI has to think you're someone special, and the added code that
says "Thomas McCabe, unlike everyone else, wants to die a fiery death"
makes those explanations of human motivation require more code and
therefore have less a-priori probability than the ones where you, like
everyone else, don't want a fiery death. Thus the AI could decide to
lift the iron bar off of you without having to pay attention to your
behavior at all.
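
As a rough illustration of the "more code means less a-priori
probability" step, here is a minimal sketch. It assumes a prior
proportional to 2**(-description length in bits); the bit counts and
the prior_weights helper are made up for this example and are not
taken from the specification in the PDF.

```python
# Toy description-length prior: each hypothesis gets weight 2**(-bits).
# The bit counts below are hypothetical, chosen only to show the effect
# of adding a special-case clause for one person.

def prior_weights(code_lengths_bits):
    """Return normalized priors proportional to 2**(-length in bits)."""
    raw = {name: 2.0 ** (-bits) for name, bits in code_lengths_bits.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}

SHARED = "everyone avoids fiery death"
SPECIAL = "everyone avoids fiery death, except Thomas McCabe"

# Assumption: the special case costs 20 extra bits to write down.
hypotheses = {SHARED: 100, SPECIAL: 120}

priors = prior_weights(hypotheses)
ratio = priors[SHARED] / priors[SPECIAL]
print(f"shared model is favored by a factor of about {ratio:.0f}")
```

Under these assumed numbers, each extra bit of special-case code halves
the prior, so a 20-bit exception is disfavored by roughly a factor of a
million before any of the person's behavior is observed.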
The AI might get the right answer even if you were the only person it
knows about and the AI hasn't observed enough of the past to see that
you generally prefer to avoid fiery death. It infers your beliefs and
your motivation simultaneously. You attempted to lift the bar and
failed. The AI needs to explain that. Perhaps you thought the bar
was lighter than it really was (belief), and you wanted to lift the
bar and escape from the fire (motivation). If that explanation seems
more likely than the competing explanations, the AI would then try to
give you what you wanted (it would lift the bar, put out the fire, or
If, on the other hand, the AI has observed you to repeatedly attempt
and fail to lift large objects in the past, along with limited joyful
experiments with self-immolation, then the AI would be likely to
assume that you are doing what you want to do and leave you to your
pastimes as you burn alive.
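
The two cases above can be sketched as a toy Bayesian update over
(belief, motivation) pairs. Everything here is an assumption made for
illustration: the hypothesis names, the likelihood values, and the two
priors are invented, and this is not the inference procedure from the
PDF.

```python
from itertools import product

# Hypothetical hypothesis space: what the person believes and wants.
BELIEFS = ["bar is liftable", "bar is too heavy"]
MOTIVATIONS = ["escape the fire", "stay in the fire"]

def likelihood_of_attempt(belief, motivation):
    """P(person tries to lift the bar | belief, motivation); made-up values."""
    if motivation == "escape the fire":
        return 0.9 if belief == "bar is liftable" else 0.2
    return 0.05  # someone who wants to stay rarely bothers lifting

def posterior(prior):
    """Update a prior over (belief, motivation) on the observed lift attempt."""
    unnorm = {h: p * likelihood_of_attempt(*h) for h, p in prior.items()}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

def motivation_marginal(dist):
    """Sum out the belief to get a distribution over motivations."""
    out = {m: 0.0 for m in MOTIVATIONS}
    for (_, motivation), p in dist.items():
        out[motivation] += p
    return out

# Case 1: no history, so a uniform prior over the four pairs.
no_history = {h: 0.25 for h in product(BELIEFS, MOTIVATIONS)}

# Case 2: assumed history of joyful self-immolation shifts the prior.
odd_history = {(b, m): (0.49 if m == "stay in the fire" else 0.01)
               for b, m in product(BELIEFS, MOTIVATIONS)}

wants_no_history = motivation_marginal(posterior(no_history))
wants_odd_history = motivation_marginal(posterior(odd_history))
print(wants_no_history)
print(wants_odd_history)
```

With these invented numbers, the same observed lift attempt points to
"escape the fire" when there is no history, and to "stay in the fire"
when the prior has been shifted by the unusual past behavior.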
The AI disregards your actual beliefs, BTW. This leads to an
interesting experiment that I haven't done yet. Suppose the AI
determines that Christianity is unlikely; the AI is, pragmatically, an
atheist. If this AI encountered a fervent Christian who wanted to get
to Heaven, it would try to arrange for this person to get what he
would want if he had the same beliefs (atheism) as the AI. I don't
know what these people would want if atheism were true.
Here's the experiment: somebody please go find a friendly devout
Christian and ask them:
If God appeared before you and said "I am entitled to change the
rules, and I am doing so now. I quit. Now you are on your own. You
no longer have a soul -- your thoughts will henceforth be an
ordinary physical consequence of neurons firing in your brain,
and if the information there is lost, you are gone. There is no
Heaven or Hell" along with other tenets of atheism, and then God
kept his word and vanished, and furthermore you believed Him, what
would you want in that situation?
If they would want bizarre self-destructive things, we have a problem:
the AI might easily decide to murder all Christians. If they would
want the same things that other atheists want (decent food, health,
family, monster trucks, whatever) then at least this proposal doesn't have
that specific bug.
-- Tim Freeman http://www.fungible.com email@example.com