RE: friendly ai

From: Ben Goertzel
Date: Sun Jan 28 2001 - 07:57:44 MST


> > But, the case is weaker that this is going to make AI's consistently and
> > persistently friendly.
> Well, yes, your version has the antianthropomorphic parts of the paper but
> only a quickie summary of how the actual Friendship system works <smile>.

OK, I'm waiting...

> > There are 2 main points here
> >
> > 1)
> > AI's may well end up ~indifferent~ to humans. My guess is that even if
> > initial AI's are explicitly programmed to be warm & friendly to humans,
> > eventually "indifference to humans" may become an inexorable attractor...
> What forces create this attractor? My visualization of your visualization
> is that you're thinking in terms of an evolutionary scenario with vicious
> competition between AIs, such that all AIs have a finite lifespan before
> they are eventually killed and devoured by nearby conspecifics; the humans
> are eaten early in the game and AIs that expend energy on Friendliness
> become extinct soon after.

Not much like that, no.

More like this: Just as most humans find other humans more interesting than
computers or nonhuman animals right now (members of this list may be
exceptions ;), similarly, most AI's will find other AI's more interesting
than humans. Not murder of other AI's, but success in the social network
they find most interesting (other AI's), will be a driving goal of an AI
system, and humans will become largely irrelevant to AI systems'
psychologies.

> > 2)
> > There WILL be an evolutionary aspect to the growth of AI, because there
> > are finite computer resources and AI's can replicate themselves
> > potentially infinitely. So there will be a "survival of the fittest"
> > aspect to AI, meaning that AI's with greater initiative, motivation,
> > etc. will be more likely to survive.
> You need two things for evolution: first, replication; second, imperfect
> replication. It's not clear that a human-equivalent Friendly AI would
> wish to replicate verself at all - how does this goal subserve
> Friendliness? And if the Friendly AI does replicate verself, why would
> the Friendship system be permitted to change in offspring? Why would
> cutthroat competition be permitted to emerge? Either of these outcomes,
> if predictable, would seem to rule out replication as necessarily
> unFriendly, unless these problems can be overcome.

First of all, evolution among AI's might not exactly mimic evolution among
humans. There may be many differences.

Among AI's there's another option besides replication: expansion of one mind
to assume all available processing resources. In expanding itself in this
way, a mind necessarily changes into something different.

Many of the world's AI's are probably going to be resource-hungry -- to want
to consume more and more processing resources. So there will be some

This is obvious in the case where different AI's serve different commercial
interests, and hence have competing goal sets carried over from the world of
human competition.

But it also will occur in the absence of spillover from the
human-competition domain. If several different AI's share a common goal of
creating the most possible knowledge, but each of them has a different
intuition about how to achieve this goal -- then the AI's will rationally
compete for resources, without any necessary enmity between them.

The possible source of an urge for imperfect replication in AI's is also
clear. It will come directly from the urge for self-improvement.
  "Perhaps," thinks AI #74, "if I changed myself in this way then I'd be a
little smarter and achieve my goals better. But I don't want to make this
change permanently -- I might fuck myself up. I've tried to rationally
assess the consequences of the change, but they're hard to predict in
detail. So I'll just try it -- I'll create a clone of myself with this
particular modification and see what happens." Hmm.... another way to use
up resources. Imperfect replication as a highly effective learning
strategy...
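
A minimal sketch of the clone-and-test strategy in the paragraph above,
assuming a toy setting that is not from the original post: the "mind" is a
short list of numbers, and goal_score is a hypothetical stand-in for "how well
do I achieve my goals". The original is only replaced when a modified clone
has actually done better, and every trial clone is counted as a resource cost.

```python
import copy
import random

def goal_score(params):
    # Hypothetical measure of goal achievement: higher is better,
    # with the best possible value at params == [3.0, -1.0].
    target = [3.0, -1.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def try_modification(params, rng, step=0.5):
    # Build a clone with one random tweak; the original is left untouched.
    clone = copy.copy(params)
    i = rng.randrange(len(clone))
    clone[i] += rng.uniform(-step, step)
    return clone

def self_improve(params, generations=200, seed=0):
    rng = random.Random(seed)
    current = list(params)
    clones_made = 0
    for _ in range(generations):
        clone = try_modification(current, rng)
        clones_made += 1  # every trial uses resources, even a failed one
        if goal_score(clone) > goal_score(current):
            current = clone  # adopt the change only after it has proven itself
    return current, clones_made

if __name__ == "__main__":
    start = [0.0, 0.0]
    final, clones = self_improve(start)
    print("score before:", goal_score(start))
    print("score after: ", goal_score(final))
    print("clones made: ", clones)
```

This is plain hill-climbing by trial copies. It says nothing about whether a
real AI would work this way; it only shows that "try it on a clone first" is
a form of imperfect replication with selection.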

In none of these aspects am I talking about "Nature, red in tooth and claw."
You do a great job of arguing that the aggressive, obsessive, jealous,
overemotional aspects of human nature won't be present in AI's, unless
foolish people make a special effort to implant them there.

I'm talking about AI's that are hungry to achieve their own goals according
to their own intuitions, that want to acquire as many resources as possible
to do so, and that as a consequence may have "friendliness to humans" as
number 5,347 on their priority list.

This, I guess, is one of the oddest things about the digital minds in
"Diaspora". After all those centuries, it's still optimal to have computer
memory partitioned off into minds roughly the size of an individual human
mind? How come entities with the memory & brain-power of 50,000 humans
weren't experimented with, and didn't become dominant? In that book, there
is so much experimentation in physics, and so little experimentation in
artificial, radically non-human digital psychology...

So, suppose that Friendliness to humans is one of the goals of an AI system,
probabilistically weighted along with all the other goals. Then, my guess is
that as AI's become more concerned with their own social networks and their
goals of creating knowledge and learning new things, the weight of the
Friendliness goal is going to gradually drift down. Not that a "kill humans"
goal will emerge, just that humans will gradually become less & less
relevant to their world-view...
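
The drift described above can be sketched with a toy model, under assumptions
that are not in the original post: goals are a dictionary of weights that sum
to 1, the goal names and starting numbers are invented, and goals the system
actively pursues get a small multiplicative boost each step. Nothing in the
code ever lowers the Friendliness weight directly; its share falls only
because the other weights grow and the total is renormalized.

```python
def normalize(weights):
    # Rescale the weights so that they sum to 1.
    total = sum(weights.values())
    return {goal: w / total for goal, w in weights.items()}

def drift(weights, reinforced, tracked, rate=0.05, steps=100):
    # Each step: boost the reinforced goals, renormalize, and record the
    # share held by the tracked goal.
    w = normalize(weights)
    history = []
    for _ in range(steps):
        boosted = {
            goal: (value * (1 + rate) if goal in reinforced else value)
            for goal, value in w.items()
        }
        w = normalize(boosted)
        history.append(w[tracked])
    return w, history

# Hypothetical starting weights, chosen only for illustration.
start = {"friendliness": 0.4, "ai_social_network": 0.3, "create_knowledge": 0.3}
final, history = drift(
    start,
    reinforced={"ai_social_network", "create_knowledge"},
    tracked="friendliness",
)

if __name__ == "__main__":
    print("friendliness share at start:", start["friendliness"])
    print("friendliness share at end:  ", final["friendliness"])
```

The share stays positive but keeps shrinking, which matches the claim that
no "kill humans" goal is needed for humans to become less and less relevant.
Whether real goal systems reinforce weights this way is an open assumption.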


> > Points 1 and 2 tie in together. Because all my experimentation with
> > genetic algorithms shows that, for evolutionary processes, initial
> > conditions are fairly irrelevant. The system evolves fit things that
> > live in large basins of attraction, no matter where you start them. If
> > 'warm & friendly to humans' has a smaller basin of attraction than
> > 'indifferent to humans', then randomness plus genetic drift is going to
> > lead the latter to dominate before long regardless of initial condition.
> I guess you'd better figure out how to use directed evolution and
> externally imposed selection pressures to manipulate the fitness metric
> and the basins of attraction, so that the first AIs capable of replication
> without human assistance are Friendly enough to want to deliberately
> ensure Friendliness in their offspring.

I strongly suspect that the first AI's capable of replication without human
assistance will have the property you describe.

But I sort of doubt that this will still be true of the 99th generation
after that...
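
To make the basin-of-attraction argument quoted above concrete, here is a
deliberately crude sketch. It is not a genetic algorithm and not the
experiments referred to in the quote. It assumes two equally fit basins, a
hypothetical relative width of 0.1 for "friendly", and a hypothetical 5% of
the population re-randomized per generation, landing in each basin in
proportion to its width.

```python
def step(share_friendly, basin_friendly, mutation_rate):
    # A fraction mutation_rate of the population is re-randomized and lands
    # in the friendly basin with probability basin_friendly (its relative
    # width). The rest stay where they were. No fitness difference assumed.
    return (1 - mutation_rate) * share_friendly + mutation_rate * basin_friendly

def run(initial_share, basin_friendly=0.1, mutation_rate=0.05, generations=99):
    share = initial_share
    for _ in range(generations):
        share = step(share, basin_friendly, mutation_rate)
    return share

if __name__ == "__main__":
    print("start all friendly:    ", run(1.0))
    print("start all indifferent: ", run(0.0))
```

Under these assumptions both starting points end up close to the basin width
after 99 generations, so the initial condition is mostly washed out. The
model builds in the premise that the friendly basin is the smaller one; it
illustrates the argument and does nothing to establish that premise.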

> Frankly I prefer the Sysop
> (singleton seed AI) scenario; it looks a *lot* safer, for reasons you've
> just outlined.

I strongly suspect that this scenario will stop looking so safe on more
careful analysis...

