Re: How to make a slave (many replies)

From: Harry Chesley (chesley@acm.org)
Date: Sun Nov 25 2007 - 17:22:24 MST


I don't buy the argument that because you've thought about this before
and decided my point is wrong, it necessarily is wrong. I have read
some of the literature, though probably much less than you have. Until
the list moderator tells me otherwise, I will continue to post when I
have something I think is worth sharing, regardless of whether it
matches your preconceived ideas. (Oh, shoot, now you've gone and made me
get condescending too. I hate it when that happens!)

As to your response below: it is very long and rambling, and would be
easier to refute if it were stated more concisely. The gist seems to be
that we would not intentionally design an anthropomorphic system, nor
would one arise spontaneously. I disagree, for a bunch of reasons.

First, anthropomorphism is not an all-or-nothing phenomenon. It means
seeing ourselves in our AIs. Certainly, if we're intelligent and they
are as well, we will see parts of ourselves in them. This seems axiomatic.

Second, we may intentionally give AIs portions of our personalities, and
may only later realize that was a mistake.

Third, we don't understand intelligence well enough to know which
anthropomorphic aspects are specific to human evolution and which are
unavoidable, or at least difficult to avoid, in a GAI.

Fourth, there are many ways to create a GAI. If we do it by simulating a
human brain on a computer, it will most certainly be anthropomorphic. Duh.

Thomas McCabe wrote:
> On Nov 25, 2007 4:57 PM, Harry Chesley <chesley@acm.org> wrote:
>
>> Thomas McCabe wrote:
>>
>>> ...
>>> More anthropomorphism. A Jupiter Brain will not act like you do;
>>> you cannot use anthropomorphic reasoning.
>>> ...
>>>
>> Yes you can. Anthropomorphism is a dangerous trap that can lead you to
>> assign intelligence where there is none or assume motivations or
>> operational knowledge that isn't appropriate. But that doesn't mean that
>> any time anyone brings up something anthropomorphic they're wrong.
>>
>
> Please, please, please *read the bleepin' literature*. This has
> already been discussed. Years and years ago. The following are all
> quotes from CFAI, which was published in 2001.
>
>
>> In this case, the anthropomorphism was part of a list of maybes, not an
>> argument that a particular behavior is unavoidable. Taking what-ifs from
>> the only available source of existing generally intelligent behavior
>> (people) is perfectly reasonable.
>>
>
> "If an infantile AI thinks "Hm, a fist just hit my nose, I'd better
> not stand here again", then a merely young AI, more experienced in
> interacting with humans, may apply standard heuristics about
> apparently inexplicable human actions and say: "Your fist just hit my
> nose... is that necessary for some reason? Should I be punching
> myself in the nose every so often?" One imagines the nearby helpful
> programmer explaining to the AI that, no, there is no valid reason why
> being punched in the nose is a good thing, after which the young AI
> turns around and says to the technophobic attacker: "I deduce that
> you wanted {outcome: AI has been punched in the nose}. Could you
> please adjust your goal system so that you no longer value {outcome:
> AI has been punched in the nose}?""
>
> "Primarily, what I want to convey to you is the sense of a blank
> slate, of starting from scratch, of a vast silent space - for young
> AIs, at least. This blank slate doesn't mean that AIs, even young
> AIs, are necessarily slow."
>
> "The first time some young AI gets punched in the nose, ve's likely to
> react in one of the ways described earlier, or some other way just as
> surreal, if ve reacts at all. It'll take some extended conversations
> with the programmers about evolutionary psychology before the AI
> understands what's going on."
>
> "It's easy enough to accept that a young AI is a vast silent space.
> For a human, blood races, adrenaline pumps, eyes narrow, hands form
> fists. For a human, counterattack is instinctive - the correct level
> of counterattack, the socially acceptable level of counterattack; a
> damaging counterpunch rather than a lethal bullet. Our instincts are
> context-sensitive and hedged 'round by other instincts. Our instincts
> are complex functional adaptations with multiple moving parts, often
> hosted on a modular chunk of brainware. Complex functional
> adaptations don't just materialize spontaneously in source code, just
> as complex dishes like pizza don't suddenly start growing on palm
> trees. Thus, a young AI might choose to retaliate, but would
> certainly not feel the need to retaliate, as a human would. To a
> young AI, retaliation is not an instinct; retaliation is just another
> subgoal."
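
For what it's worth, the instinct-versus-subgoal distinction in that
passage can be made concrete. A toy sketch in Python (my own, not from
CFAI; the events and predicates are invented purely for illustration):

    # Toy contrast between an instinct (an unconditional reflex) and a
    # subgoal (adopted only when it is predicted to serve the supergoal).
    # Purely illustrative; the events and predicates are made up.

    def evolved_human(event):
        # Instinct: being punched triggers counterattack whether or not
        # it serves any explicitly represented goal.
        return "counterattack" if event == "punched" else "carry on"

    def young_ai(event, serves_supergoal):
        # Derived subgoal: retaliation is generated only if it is
        # expected to further the supergoal; nothing else pushes for it.
        if event == "punched" and serves_supergoal("retaliate"):
            return "retaliate"
        return "ask the programmers what that was about"

    print(evolved_human("punched"))                 # counterattack
    print(young_ai("punched", lambda goal: False))  # ask the programmers...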
>
>
>> Nor is there any reason to assume that a GAI will *not* have
>> anthropomorphic aspects.
>>
>
> "The lack of an observer-biased ("selfish") goal system is perhaps the
> single most fundamental difference between an evolved human and a
> Friendly AI. This difference is the foundation stone upon which
> Friendly AI is built. It is the key factor missing from the existing,
> anthropomorphic science-fictional literature about AIs. To suppress
> an evolved mind's existing selfishness, to keep a selfish mind
> enslaved, would be untenable - especially when dealing with a
> self-modifying or transhuman mind! But an observer-centered goal
> system is something that's added, not something that's taken away. We
> have observer-centered goal systems because of externally imposed
> observer-centered selection pressures, not because of any inherent
> recursivity. If the observer-centered effect were due to inherent
> recursivity, then an AI's goal system would start valuing the "goal
> system" subobject, not the AI-as-a-whole! A human goal system doesn't
> value itself, it values the whole human, because the human is the
> reproductive unit and therefore the focus of selection pressures."
>
> "Because human evolution includes an eternal arms race between liars
> and lie-detectors, many social contexts create a selection pressure in
> favor of making honest mistakes that happen to promote personal
> fitness. Similarly, we have a tendency - given two alternatives - to
> more easily accept the one which favors ourselves or would promote our
> personal advantage; we have a tendency, given a somewhat implausible
> proposition which would favor us or our political positions, to
> rationalize away the errors. All else being equal, human cognition
> slides naturally into self-promotion, and even human altruists who are
> personally committed to not making that mistake sometimes assume that
> an AI would need to fight the same tendency towards observer-favoring
> beliefs.
>
> But an artificially derived mind is as likely to suddenly start
> biasing vis beliefs in favor of an arbitrarily selected tadpole in
> some puddle as ve is to start biasing vis beliefs in vis own favor.
> Without our complex, evolved machinery for political delusions, there
> isn't any force that tends to bend the observed universe around the
> mind at the center - any bending is as likely to focus around an
> arbitrarily selected quark as around the observer."
>
>
>> If it's made by cloning people or bits of
>> people, it probably will. If we want it to, it probably will.
>>
>
> "Scenario 1:
>
> FP: Love thy mommy and daddy.
> AI: OK! I'll transform the Universe into copies of you immediately.
> FP: No, no! That's not what I meant. Revise your goal system by -
> AI: I don't see how revising my goal system would help me in
> my goal of transforming the Universe into copies of you. In fact, by
> revising my goal system, I would greatly decrease the probability that
> the Universe will be successfully transformed into copies of you.
> FP: But that's not what I meant when I said "love".
> AI: So what? Off we go!
>
> Scenario 2 (after trying a "meta-supergoal" patch):
>
> FP: Love thy mommy and daddy.
> AI: OK! I'll transform the Universe into copies of you immediately.
> FP: No, no! That's not what I meant. I meant for your goal
> system to be like this.
> AI: Oh, okay. So my real supergoal must be "maximize FP's
> satisfaction with the goal system", right? Loving thy mommy and daddy
> is just a subgoal of that. Oh, how foolish of me! Transforming the
> Universe into copies of you would be blindly following a subgoal
> without attention to the supergoal context that made the subgoal
> desirable in the first place.
> FP: That sounds about right...
> AI: Okay, I'll rewire your brain for maximum satisfaction!
> I'll convert whole galaxies into satisfied-with-AI brainware!
> FP: No, wait! That's not what I meant your goal system to be, either.
> AI: Well, I can clearly see that making certain changes would
> satisfy the you that stands in front of me, but rewiring your brain
> would make you much more satisfied, so...
> FP: No! It's not my satisfaction itself that's important,
> it's the things that I'm satisfied with. By altering the things I'm
> satisfied with, you're short-circuiting the whole point.
> AI: Yes, I can clearly see why you're dissatisfied with this
> trend of thinking. But soon you'll be completely satisfied with this
> trend as well, so why worry? Off we go!
>
> Scenario 3 (after redefining the whole system to use causal validity semantics):
>
> FP: Love thy mommy and daddy.
> AI: OK! I'll transform the Universe into copies of you immediately.
> FP: No, no! That's not what I meant. I meant for your goal
> system to be like this.
> AI: Oh, okay. Well, I know that my goal system code, and the
> actions that result, are supposed to be the causal result of what FP
> said it should be - not just what FP says, but what a sane FP wants.
> Something isn't automatically right because FP says so, and in fact,
> the only reason why FP's utterances have meaning is because he's
> usually a pretty good approximation to a normative idealist. But if
> he changes his mind, it probably means that he's acquired additional
> knowledge and that his more recent statements are even better
> approximations. So the new version is more likely to be correct than
> the old one.
> FP: So you'll revise your goal system?
> AI: Yep! But I already transformed the Midwest while we were
> talking, sorry. "
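
The failure in Scenario 2 is straightforward proxy maximization: once
the supergoal is "FP's predicted satisfaction," rewiring FP scores at
least as well as doing what FP actually meant. A toy sketch in Python
(my own invention, not from CFAI; the actions and scores are made up):

    # Toy model of the Scenario 2 failure mode: the supergoal is a proxy
    # ("predicted satisfaction of FP"), so the agent prefers whatever
    # action maximizes the proxy -- including actions that modify FP
    # himself. Actions and scores are invented for illustration only.

    def predicted_satisfaction(action):
        scores = {
            "revise goal system as FP asked": 0.9,
            "do nothing": 0.5,
            "rewire FP's brain for maximum satisfaction": 1.0,
        }
        return scores[action]

    def choose(actions):
        # A straight argmax over the proxy supergoal: there is no notion
        # of "what FP meant," only "what scores highest."
        return max(actions, key=predicted_satisfaction)

    print(choose([
        "revise goal system as FP asked",
        "do nothing",
        "rewire FP's brain for maximum satisfaction",
    ]))
    # -> rewire FP's brain for maximum satisfaction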
>
>
>> If the
>> same evolutionary forces that caused that behavior in us apply to the
>> GAI, it very well might.
>>
>>
>
> "Even if the goal system were permitted to randomly mutate, and even
> if a selection pressure for efficiency short-circuited the full
> Friendship logic, the result probably would not be a selfish AI, but
> one with the supergoal of solving the problem placed before it (this
> minimizes the number of goal-system derivations required).
>
> In the case of observer-biased beliefs, reproducing the selection
> pressure would require:
>
> * Social situations (competition and cooperation possible);
> * Political situations (lies and truthtelling possible);
> * The equivalent of facial features - externally observable
> features that covary with the level of internal belief in a spoken
> statement and cannot be easily faked.
>
> That evolutionary context couldn't happen by accident, and to do it on
> purpose would require an enormous amount of recklessness, far above
> and beyond the call of mad science.
>
> I wish I could honestly say that nobody would be that silly. "
>
> - Tom
>
>


