RE: friendly ai

From: Ben Goertzel (ben@webmind.com)
Date: Sun Jan 28 2001 - 13:38:26 MST


> I can't visualize an AI incapable of learning making it out of the lab or
> even walking across the room, much less doing one darn thing towards
> bringing citizenship rights to the Solar System.

No, I was unclear. Honest Annie was an AI that got so smart that it just
shut up and stopped communicating with people altogether.... Radio silence.

> Remember, the hypothesis is that Friendliness is the top layer of a good
> design, and discovering and creation the subgoals; if you postulate an AI
> that violates this rule and see horrifying consequences, it should
> probably be taken as an argument in favor of _Friendly AI_. <grin>

I don't think I'm envisioning horrifying consequences at all. AIs getting
bored with humans isn't all that horrifying, is it? Especially if humans are
all uploading themselves... then most humans are going to be bored with
old-style flesher humans too...

> If the system isn't smart enough to see the massive importance of
> learning, use a programmer intervention to add the fact to the system that
> "Ben Goertzel says learning is massively important". If the system
> assumes that "Ben Goertzel says X" translates to "as a default, X has a
> high probability of being true", and a prehuman AI should make this
> assumption (probably due to another programmer intervention), then this
> should raise the weight of the learning subgoal.

Yeah, this is basically what we've done by explicitly making learning a
system goal.
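
Purely for concreteness, here's a toy sketch of what that kind of programmer
intervention amounts to -- a proposition asserted by a trusted programmer
gets a high default probability, which in turn pulls up the weight of the
corresponding system goal. The class names, the 0.9 default, and the weights
below are all made up for illustration; this is not Webmind code.

# Hypothetical sketch of a programmer assertion raising a goal's weight.
# None of these names come from Webmind; they are illustrative only.

PROGRAMMER_DEFAULT_CONFIDENCE = 0.9   # "Ben Goertzel says X" => X is probably true

class Goal:
    def __init__(self, name, weight=0.1):
        self.name = name
        self.weight = weight          # importance of the goal, in [0, 1]

class GoalSystem:
    def __init__(self):
        self.goals = {}

    def add_goal(self, name, weight=0.1):
        self.goals[name] = Goal(name, weight)

    def programmer_asserts(self, goal_name, claimed_importance):
        # A programmer intervention: assert that a goal is important.
        # The assertion is taken as true with a high default probability,
        # so the goal's weight moves toward the claimed importance in
        # proportion to that confidence.
        goal = self.goals[goal_name]
        p = PROGRAMMER_DEFAULT_CONFIDENCE
        goal.weight = (1 - p) * goal.weight + p * claimed_importance

if __name__ == "__main__":
    system = GoalSystem()
    system.add_goal("learning", weight=0.1)
    system.add_goal("friendliness", weight=0.1)

    # "Ben Goertzel says learning is massively important."
    system.programmer_asserts("learning", claimed_importance=1.0)
    print(system.goals["learning"].weight)   # ~0.91: learning is now heavily weighted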

> Speaking as one of the six billion people who gets toasted if you make one
> little mistake, would you *please* consider adding Friendliness to that
> list? I really don't think it will cost you anything.

Friendliness is indeed one of Webmind's goals ;>

> I've evolved from subgoal-driven to supergoal-driven over time. I can see
> this as possible, but I really can't see it as inevitable, not if the AI
> is on guard and doesn't want it to happen. Evolution has to happen in
> steps, and steps can be observed, detected, and counteracted. A failure
> of Friendliness in a seed AI vanishes as soon as the AI realizes it's a
> failure; it takes a catastrophic failure of Friendliness, something that
> makes the AI stop *wanting* to be Friendly, before errors can build up in
> the system.

I don't know. Can't "stopping wanting to be friendly" creep in gradually too?

Then it's "wanting to want to be friendly" that's supposed to stop this from
happening??

> If there's a society of Friendly AIs, they'll *notice* that new AIs are a
> little bit less Friendly than the originals,

Unless the society slowly drifts away from a human-centric focus...
gradual cultural drift is not exactly unknown...

> AIs who are dreadfully panicked
> about the prospect of drifting away from Friendship because Friendship is
> the only important thing in the world to them...
>

aha! Caught you!

Now you're proposing to make AIs neurotic and mentally unhealthy... to make
them fear becoming unfriendly.

But isn't this a recipe for backlash of some sort?? Fear breeds aggression,
no?

Tsk, tsk, tsk...

ben


