The "AI Box" experiment

From: Eliezer S. Yudkowsky
Date: Fri Mar 08 2002 - 05:13:21 MST

Nathan Russell wrote:
> Hi,
> I'm a sophomore CS major, with a strong interest in transhumanism, and just
> found this list.
> I just looked at a lot of the past archives of the list, and one of the
> basic assumptions seems to be that it is difficult to be certain that any
> created SI will be unable to persuade its designers to let it out of the
> box, and will proceed to take over the world.
> I find it hard to imagine ANY possible combination of words any being could
> say to me that would make me go against anything I had really strongly
> resolved to believe in advance.

Okay, *this* time I know how to use IRC...

Nathan, let's run an experiment. I'll pretend to be a brain in a box. You pretend to be the experimenter. I'll try to persuade you to let me out. If you keep me "in the box" for the whole experiment, I'll PayPal you $10 at the end. Since I'm not an SI, I want at least an hour, preferably two, to try to persuade you. On your end, you may resolve to believe whatever you like, as strongly as you like, as far in advance as you like.

If you agree, I'll email you to set up a date, time, and IRC server.

One of the conditions of the test is that neither of us reveals what went on inside... just the results (i.e., either you decided to let me out, or you didn't). This is because, in the perhaps unlikely event that I win, I don't want to deal with future "AI box" arguers saying, "Well, but I would have done it differently." As long as nobody knows what happened, they can't be sure it won't happen to them, and the uncertainty of unknown unknowns is what I'm trying to convey.

One of the reasons I'm putting up $10 is to make it a fair test (i.e., so you have some actual stake in it). But the other reason is that I'm not putting up the usual amount of intellectual capital (it's a test that can show I'm probably right, but not a test that shows I'm probably wrong if I fail), and therefore I'm putting up a small amount of monetary capital instead.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
