Re: friendly ai

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sun Jan 28 2001 - 00:29:46 MST


Ben Goertzel wrote:
>
> Again, no time for a thorough response to your paper, but here's a
> thought...

(Ben Goertzel is referring to a partial, interim version of "Friendly
AI". The full version - or *a* full version - should be published
sometime Real Soon Now.)

> You make a very good case that due to
[snip]
> and other related facts, AI's are probably going to be vastly mentally
> healthier than humans, without our strong inclinations toward aggression,
> jealousy, and so forth.

Actually, for purposes of Friendly AI we should say that they don't have
such innate inclinations (and certainly would never find themselves
expending "mental energy" to overcome such inclinations). Friendliness is
a separate issue, over and above the nonanthropomorphic background.

> But, the case is weaker that this is going to make AI's consistently and
> persistently friendly.

Well, yes, your version has the antianthropomorphic parts of the paper but
only a quickie summary of how the actual Friendship system works <smile>.

> There are 2 main points here
>
> 1)
> AI's may well end up ~indifferent~ to humans. My guess is that even if
> initial AI's are explicitly programmed to be warm & friendly to humans,
> eventually "indifference to humans" may become an inexorable attractor...

What forces create this attractor? My visualization of your visualization
is that you're thinking in terms of an evolutionary scenario with vicious
competition between AIs, such that all AIs have a finite lifespan before
they are eventually killed and devoured by nearby conspecifics; the humans
are eaten early in the game and AIs that expend energy on Friendliness
become extinct soon after.

> 2)
> There WILL be an evolutionary aspect to the growth of AI, because there
> are finite computer resources and AI's can replicate themselves
> potentially infinitely. So there will be a "survival of the fittest"
> aspect to AI, meaning that AI's with greater initiative, motivation,
> etc. will be more likely to survive.

You need two things for evolution: first, replication; second, imperfect
replication. It's not clear that a human-equivalent Friendly AI would
wish to replicate verself at all - how does this goal subserve
Friendliness? And if the Friendly AI does replicate verself, why would
the Friendship system be permitted to change in offspring? Why would
cutthroat competition be permitted to emerge? If either of these outcomes
were predictable, that would seem to rule out replication itself as
necessarily unFriendly, unless these problems can be overcome.
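
(A toy sketch of that distinction, with every detail invented for
illustration - the single "friendliness" number standing in for a genome,
the population size, the rates: with replication but perfect copying, the
population stays exactly where it started; only imperfect copying lets it
drift at all.)

import random

def run(generations=200, pop_size=100, mutation_rate=0.0):
    # Each "AI" is just a friendliness value in [0, 1]; start everyone at 1.0.
    population = [1.0] * pop_size
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            child = random.choice(population)    # replication
            if random.random() < mutation_rate:  # imperfect replication
                child = min(1.0, max(0.0, child + random.uniform(-0.1, 0.1)))
            new_pop.append(child)
        population = new_pop
    return sum(population) / pop_size

print(run(mutation_rate=0.0))   # stays exactly 1.0: copying alone is not evolution
print(run(mutation_rate=0.05))  # drifts away from 1.0 once copying is imperfect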

> Points 1 and 2 tie in together. Because all my experimentation with
> genetic algorithms shows that, for evolutionary processes, initial
> conditions are fairly irrelevant. The system evolves fit things that
> live in large basins of attraction, no matter where you start them. If
> 'warm & friendly to humans' has a smaller basin of attraction than
> 'indifferent to humans', then randomness plus genetic drift is going to
> lead the latter to dominate before long regardless of initial condition.

I guess you'd better figure out how to use directed evolution and
externally imposed selection pressures to manipulate the fitness metric
and the basins of attraction, so that the first AIs capable of replication
without human assistance are Friendly enough to want to deliberately
ensure Friendliness in their offspring. Frankly I prefer the Sysop
(singleton seed AI) scenario; it looks a *lot* safer, for reasons you've
just outlined.
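
(A toy sketch of what that manipulation could look like - the single-number
"friendliness" genome, the penalty weight, and all the rates here are
invented purely for illustration, nothing is taken from the paper: an
externally imposed fitness penalty on unFriendly variants pulls the
population into the Friendly basin even from an unFriendly starting point.)

import random

def fitness(friendliness, penalty_weight):
    # Baseline replicative fitness is the same for everyone; the externally
    # imposed selection pressure makes unFriendliness costly.
    return max(1.0 - penalty_weight * (1.0 - friendliness), 1e-6)

def evolve(generations=300, pop_size=200, mutation_rate=0.05, penalty_weight=0.0):
    # Start the population mostly indifferent to humans (friendliness near 0).
    population = [random.uniform(0.0, 0.2) for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(f, penalty_weight) for f in population]
        parents = random.choices(population, weights=weights, k=pop_size)
        population = []
        for p in parents:
            child = p
            if random.random() < mutation_rate:
                child = min(1.0, max(0.0, p + random.uniform(-0.1, 0.1)))
            population.append(child)
    return sum(population) / pop_size

print(evolve(penalty_weight=0.0))  # no imposed pressure: the population drifts near its start
print(evolve(penalty_weight=1.0))  # imposed pressure: friendliness climbs regardless of the start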

-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence


