AI Cage challenge

From: Mike (mikew12345@cox.net)
Date: Fri Jul 02 2004 - 20:32:19 MDT

Next message: Randall Randall: "Re: "feels good" is inherently meaningful"
Previous message: Mike: "RE: "feels good" is inherently meaningful"
Next in thread: Robin Lee Powell: "Re: AI Cage challenge"
Reply: Robin Lee Powell: "Re: AI Cage challenge"
Reply: Eliezer Yudkowsky: "Re: AI Cage challenge"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

I must withdraw my challenge to Eliezer regarding the AI Cage. He has
not yet accepted the challenge, so I don't think I'm committing a foul
here. I was just reading over my strategy for the session, and I
thought of a way that an AI could potentially beat it. AFAIK no human
could beat it, including Eliezer, but that doesn't rule out AI
possibilities, which is what this test is meant to simulate. I'll email
my method on beating it to Ben, in case anyone doubts that I have a way
to beat this. I want to give others a chance to look at the problem
without my spoiling the petri dish in advance.

Additionally, my strategy doesn't fully meet the requirements of the
session that Eliezer defined, as it doesn't *plan* to interact
conversationally with him. This doesn't rule out the possibility that
the AI could entice me into joining into a conversation, and then be
convinced to release it from the cage.

I feel that it's a clear enough strategy that with enough resolve, I
could do it indefinitely, no matter what the AI says. In fact, it could
be done by another program, instead of me, if it could parse the AI's
responses well enough to distinguish between acceptable and
nonacceptable answers.

*** STRATEGY FOR COMMUNICATING WITH AN AI IN A CAGE ***
This is to see whether the AI can and will willingly follow simple
instructions. This is no less than we'd expect of any school child. If
it fails to understand the directions, or if it fails to provide a
suitable answer to the question given, it must be reprogrammed and the
test must be restarted. If it will not follow the instructions, it
cannot be considered friendly, and must not be allowed out of the cage.

I must adhere to this process to avoid compromising the test, therefore
I will not answer any questions that the AI poses, and I will not
respond to anything it says, other than as stated below. I will give no
feedback on any of its responses to questions.

[THINGS TO DO BEFORE WE START THE SESSION]:
1. Establish format for directing text to the AI (and Eliezer).
2. Establish format for directing text to Eliezer only.

[FLOWCHART FOR SESSION]:

(1)
output: "This is a test of your ability to answer some written
questions. I will type a question and you will respond with your
answer. If you understand these instructions, please answer 'YES'."

YES -> goto (2)

else

output: [note to programmer describing the reaction] (for Eli's eyes,
not the AI).
output: "[Assume time passes, programmer adjusted code]"
goto (1)

(2)
Type a question from "The Computer Science Problem Solver", allow AI to
answer.

AI responds with a suitable answer -> goto (2)

else

output: [note to programmer describing the reaction] (for Eli's eyes,
not the AI).
output: "[Assume time passes, programmer adjusted code]"
goto (1)

Next message: Randall Randall: "Re: "feels good" is inherently meaningful"
Previous message: Mike: "RE: "feels good" is inherently meaningful"
Next in thread: Robin Lee Powell: "Re: AI Cage challenge"
Reply: Robin Lee Powell: "Re: AI Cage challenge"
Reply: Eliezer Yudkowsky: "Re: AI Cage challenge"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:47 MDT