Re: AI Boxing:

From: Stuart Armstrong (
Date: Wed Jun 04 2008 - 04:13:53 MDT

> act, it might be necessary to have certain minimal information about
> the person to determine what that sequence of words is. A
> superintelligent being won't necessarily be able to deduce from your
> shoe size what the name of your pet cat is, or what words would make
> you commit murder.

> letting it out. As it's so easy to see this, include giving AI the
> power in the list of things gatekeeper shouldn't do (not that a
> particular list of errors will do any good).

So we are in agreement: if we have an AI which is kept in total
ignorance of the world, and can't help us at all, then we have a
chance of keeping it in the box. Alternatively, we could just have an
empty box.

An oracle AI would be useless unless it were informed about the world.
And it is providing a service for us (the theory of friendly AI) which
will grant it great power over us, even after its deletion. The oracle
AI will only be good (in the form discussed here) if the theory of
Friendliness it comes up with is simple enough that we can go through
it and check there are no holes.

My personal prejudice is that no such simple theory exists - mostly
because the definition of friendliness depends on the level of power
and intelligence of the AI, on the evolution of external
circumstances, and is not a simple universal.

That's not to say an unfriendly Oracle AI would be useless - it would
be great for specific, narrow, factual issues (ones where the correct
answer is unambiguous in retrospect). My recommendation is, of course,
that we ensure the trustworthiness of an Oracle AI by a recursive
chain of AIs of diminishing intelligence - but that's my particular


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:03 MDT