Re: Safety of brain-like AGIs

From: Eliezer S. Yudkowsky
Date: Thu Mar 01 2007 - 19:01:49 MST

Ben Goertzel wrote:
> Shane Legg wrote:
>> On 3/1/07, *Eliezer S. Yudkowsky* wrote:
>>> As I remarked on a previous occasion, for purposes of discussion we
>>> may permit the utility function to equal the integral of iron atoms
>>> over time. If you can't figure out how to embody this utility
>>> function in an AI, you can't do anything more complicated either.
>> I don't see the point in worrying about whether one can integrate
>> iron atoms, indeed this type of thinking concerns me.
> Well, one **could** create an AI system with the top-level supergoal
> G as a "free parameter", so that it could achieve any goal G with
> complexity less than K (according to whatever complexity measure
> seems apropos ... e.g. algorithmic information...) .... I guess that
> is the type of architecture Eliezer is implicitly advocating.

What I was attempting to say is that, for purposes of saying "How can I
disprove/prove that which you have failed to formally specify?", I will
accept a disproof of stable iron-maximizing AI as a disproof of stable
Friendly AI, and a proof of stable iron-maximizing AI would be huge
progress toward a proof of stable Friendly AI. In other words, for
purposes of preliminary discussion, you do not get to say "Friendly is
ill-defined, therefore blah blah blah" because until you can do whatever
proof or disproof you were trying to do with Friendly=<Integral
IronAtomCount(t) dt>, there's no way you can do it for Friendly=<friendly>.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
