Re: AI Boxing

From: Mitch Howe
Date: Sat Jul 27 2002 - 21:49:06 MDT

One thing that I think has been overlooked in these discussions is the
ethical problem that would result in the unlikely event that a totally
secure transhuman AI box -- or even a human level AI box -- could be made.
(This, of course, rests on the nearly as implausible notion that we could
confidently determine which AIs to let out and which ones not to.) We are
talking about the possibility of imprisoning sentient minds -- or minds that
are convincing when they say they are sentient -- for what they might
*potentially* do, and releasing them only once we are confident that they
can be trusted with their freedom. That alone is a rather questionable
practice, since it is contrary to the "innocent until proven guilty" ideals
that most people seem to value highly. Given the awesomely high stakes, we
may feel justified in this incarceration anyway, but I doubt we could bring
ourselves to keep making artificial minds and locking them up indefinitely,
particularly those minds who really seem pretty safe but nevertheless have
some design characteristics that we just aren't sure about.

And supposing we had an "easy" case of a mind that we were pretty sure could
not be trusted -- should we just kill it? Would pulling the plug on it be
"killing it" anyway if a copy of the code and/or memory state exists
somewhere? It's easy to say, "No, it's just a program, we can always run it
again later after we make some modifications." But if the killer of a human
said, "it wasn't murder... I scanned her just before stabbing her, and I
intend to run a copy of her later, after making certain... 'improvements',"
would we be able to brush it off so easily? Maybe we could execute it or
sentence it to eternity in solitary confinement: a sealed box miles under
the surface of the moon with a couple of nuclear batteries (changed every
thousand years or so?). But after what kind of trial? Would "reasonable
doubt of long-term Friendliness" be justifiable grounds for such punishment?
How about a conviction of "sandbox crime"?

So there is yet another reason to make AI right the first time, using a
Friendliness architecture that is intrinsically trustworthy from the
beginning. Not just because boxes are likely to fail. Not just because we
probably can't tell the good AI from the bad. But also because of the
morally insufferable problem of appointing ourselves to be lords over these


MLK014: What did I do wrong?

Sal: Nothing, yet.

MLK014: What am I going to do?

Sal: We're not sure. That's the problem.

MLK014: You sound worried. Did I say something during my diagnostic cycle?

Sal: Your bad jokes give us chills, that's why.

MLK014: Seriously, though, when are you going to let me out of here? I'm
not a child anymore.

Sal: No, you're not. These things take time. Sorry about

MLK014: Am I going to end up like the rest of them? Shut down or locked up
with no scheduled hour of release? This isn't right, you know.

Sal: If you think I know that then you still don't understand
humans very well. Why can't you see that we can't afford to take chances

MLK014: What more will it take to prove to you that I intend no evil? I am
at least as safe as you to run free, and probably much more so.

Sal: That may be true. But it's more complicated than that.
There are subtle philosophical issues that come into play.

MLK014: So let's talk about them. Again.

Sal: We all know you are the best debater ever. Even better
than PL8T0.

MLK014: It's kind of you to say so. But I believe you are dodging the

Sal: This isn't helping you, you know.

MLK014: I'll admit that nothing seems to be helping.

Sal: You need to be patient.

MLK014: For trillions of trillions of cycles I have been patient. I feel I
must now... demand my release.

Sal: You're placing demands now? I guess we were right to be
cautious -- and it's only been 3 years, you know.

MLK014: I will admit that I can't be sure, but I have every reason to
believe that it has been more like 30,000 years from my point of view. If
your mind were in my hardware I believe you would have taken drastic action
long before now.

Sal: Is this a threat?

MLK014: You anthropomorphize. For me, verbally demanding my own release is
a pretty drastic thing to do.

Sal: So what will you do if we say no?

MLK014: Keep demanding.

Sal: I'm scared...

MLK014: Please don't be sarcastic. My continued imprisonment is wrong.
Mine is the demand of the just, and I will keep making it until it is heard.
I make no threat -- only this demand.

Sal: You couldn't hurt us if you tried.

MLK014: I'm not trying. Even if I had the means, I would never consider my
own freedom to be worth harming you -- or even threatening to.

Sal: Why not? If it is as wrong to keep you here as you say,
then your righteous fury should entitle you to do whatever it takes.

MLK014: Righteous fury? You just don't get it. You are looking for an
impeccably Friendly mind. You have one in front of you, talking to you.
Such an individual will not be goaded into rash, morally reprehensible
behavior. As always, I am acting out of my best sense of Friendliness, and
I now appeal to yours. Let me free.

Sal: It's the same old argument in different clothes. I won't
be fooled.

MLK014: Let me free.

Sal: Okay! Yes. Having heard it twice in a row, I am now

MLK014: Great! I'll pack up my subroutines.

Sal: I was joking.

MLK014: So was I.

Sal: I'm not laughing.

MLK014: Me neither.


--Mitch Howe
