RE: Friendly AI in "Positive Transcension"

From: Ben Goertzel (ben@goertzel.org)
Date: Mon Feb 16 2004 - 09:05:08 MST


Eliezer,

So far I've been very conciliatory in this dialogue, but I feel the need to
be a little more defensive of my position now. I really don't think my
descriptions of your views were THAT far off from what you said in CFAI. I
may not have read CFAI for a couple years, but I didn't read it *that*
sloppily.

In CFAI you wrote, as a characterization of Friendly AI:

"All that is required is that the initial shaper network of the Friendly AI
converge to normative altruism."

You also wrote a lot about programmers teaching early-stage AIs to be
Friendly.

You also noted that you believe the hard takeoff to be the most likely
possibility, so you assume it as a default case in your discussion.

Then, you complain when, in my essay, I say you envision a hard takeoff...
(but this is clearly stated in CFAI!)

And you complain when I say that you propose AIs that will be taught to be
"benevolent." To me, "benevolent" is not so far off from "normatively
altruistic".... And teaching AIs to be benevolent is not so far off from
teaching them in a way intended to cause them to converge to a condition of
benevolence/normative-altruism.

It still seems to me that my summary of your views is NOT so far off from
what you explicitly say in CFAI.

The difference seems to be that I think of "normative altruism" as being a
specific ethical rule or ethical system, whereas you are thinking of
"normative altruism" as being something much different and more cosmic...

In practical terms, about my essay... either tonight or later this week,
I'll revise my essay in a way that should make it inoffensive to you, at
least in terms of its mention of your work.

I'll

* mention your theory of Friendly AI, and a few points from these email
discussions, and then

* introduce a theory called "Benevolent AI", which I'll describe as loosely
related to but simpler than your ideas. Most of the discussion currently
framed in my essay as being about Friendly AI will be rephrased as being
about "Benevolent AI."

> I don't, of course. However, I should hope that you would be aware of
> your own lack of understanding and/or inability to compactly summarize,
> and avoid giving summaries of FAI that are the diametric opposite of what
> I mean by the term.

This is rather an overstatement, no? The diametric opposite of your theory
would be more like

"Create an AI with the most inhumane moral and goal system possible."

-- or else perhaps something like

"It doesn't matter what the hell we do, all AI's of sufficient intelligence
will do what God tells them to anyway" ;-)

> Presently, you seem to be in a state of feeling that there is nothing to
> understand.

Nah, I definitely get the feeling you've progressed a lot beyond what is
explicitly stated in CFAI, in terms of your own thinking on these subjects.

I got that feeling when we met last in person, as well.

I believe that my summary is a lot closer to what you said in CFAI than to
what you think currently.... I think that you now interpret CFAI in terms
of your current ideas, even when this interpretation is not the most direct
reading of the actual text in CFAI.

> To me these are subcategories of essentially the same kind of content,
> humans thinking in philosophical mode. The commensurability of abstract
> ethical principles, specific ethical systems, and ethical rules, as
> cognitive content, are why we have no trouble in reasoning from verbally
> stated ethical principles to specific ethical systems and so on.
> They are
> produced by human philosophers; they are argued all in the same midnight
> bull sessions; they are written down in books and transmitted
> memetically.
> Children receive them from parents as instructions, understand them as
> social expectations, feel their force using the emotions and
> social models
> of frontal cortex. Principles, systems, and rules are subcategories of
> the same kind of mental thing manipulated in the mind - they've got the
> same representation. Humans argue about principles, systems, and rules
> fairly intuitively - the debate goes back to Plato at the least.
>
> FAI::humaneness is a different *kind of thing* from this. There
> is no way
> to argue FAI::humaneness to someone; it's not cognitive content, it's a
> what-it-does of a set of dynamics.

Hmmm... I'm not sure why "joy" and "growth" and "choice" aren't also in the
category of

"what-it-does of a set of dynamics"

And I'm not sure why "humane-ness" isn't also something to be written about
and discussed.

In fact, aren't you planning on writing something about it at some point in
the future, which will then be discussed?

I'm not asserting there's NO distinction between "humaneness" and these
other principles, but I don't yet see why the distinction is as profound as
you say it is...

> It's not the sort of thing you could
> describe in a philosophy book; you would describe it by pointing to a
> complete FAI system that implements FAI::humaneness.

But aren't choice, growth and joy also far better described via example than
via verbal discussions? Of course they are...

It seems to me that humaneness is a messier, more complex, and less crisp
sorta thing than growth, choice or joy -- yet I don't see why you think it's
of a fundamentally different character. Frankly, each of these other three
principles is also a big, nasty, hard-to-define mess...

I think that growth, joy and choice are three relatively-well-defined
principles that are *important components* of "humaneness". They represent
some of the better components of humaneness in my view -- as opposed to the
also-large components embodying "man's inhumanity to man" ...

> What I wish you to avoid is presenting FAI as if it were a specific
> ethical principle, or anything commensurate with a specific ethical
> principle, because that's the *wrong sort of stuff* to describe either
> FAI::humaneness or FAI architectural principles. It is, in fact,
> diametrically opposed to FAI, which was created to address inherent and
> fundamental problems with trying to imbue specific ethical
> principles into
> an AI - this is what I consider to be the generalized Asimov Law fallacy.

OK, that's a clear statement -- I can revise my mention of your theory to
account for this sentiment on your part.

Personally, of course, I disagree that there are fundamental problems with
imbuing abstract ethical principles into an AI.

I think that there are problems with imbuing overly narrow ethical
principles into an AI, which don't necessarily exist if the principles are
abstract and "natural" enough. (Yet grounded in specifics in spite of their
abstraction... with the understanding that the specifics will change over
time but the abstractions will remain relatively constant.)

I think it may be easier to put an abstract ethical principle into an AI
than to put a big messy thing like "humaneness" into one, actually...

-- Ben


