Re: All sentient have to be observer-centered! My theory of FAI morality

From: Marc Geddes
Date: Thu Feb 26 2004 - 21:49:13 MST

 --- Tommy McCabe <> wrote:
> --- Marc Geddes <> wrote:
> > My main worry with Eliezer's ideas is that I don't
> > think that a non observer-centered sentient is
> > logically possible. Or if it's possible, such a
> > sentient would not be stable. Can I prove this?
> > No.
> Maybe not, but you can provide some evidence beyond
> 'everyone says so'.

Actually, I do have some evidence, but the arguments
are long, complex, and philosophical. Trying to prove
it was not the purpose of my post. All I'm saying here
is this: ASSUMING that 100% non-observer-centered
sentients are impossible, what would the consequences
be? And then I'm reasoning out the consequences. It's
just an exploration of FAI morality IF the assumption
is correct. I'm just reasoning through a possible
scenario here.

> > But all the examples of stable sentients (humans)
> > that we have are observer centered.
> Humans are non-central special cases. Humans were
> built by Darwinian evolution, the worst possible case
> of design-and-test. Out in the jungle, it certainly
> helps to have a goal system centered around 'I' -
> that doesn't prove that it's necessary or even
> desirable.
> > I can only point to
> > this, combined with the fact that so many people
> > posting to sl4 agree with me.
> Yes, and if you lived 2000 years ago, most people
> would have agreed with you that the Earth was flat.
> The few that didn't believe that, however, had good
> reasons for it.
> > I can only strongly urge Eliezer and others working
> > on AI NOT to attempt the folly of trying to create
> > a non-observer-centered AI.
> Saying that something is 'folly' doesn't mean it's
> impossible- just look at how many achievements in
> human history were laughed at as being 'folly'!
> > For goodness sake don't try it! It could mean
> > the doom of us all.
> And so could brushing your teeth in the morning.
> (Really!)
> > I do agree that some kind of 'Universal Morality' is
> > possible. i.e. I agree that there exists a
> > non-observer-centered morality which all friendly
> > sentients would aspire to.
> Agreed.
> > However, as I said, I don't think that
> > non-observer-centered sentients would be stable, so
> > any friendly stable sentient cannot follow Universal
> > Morality exactly.
> Saying it doesn't make it so. You have offered no
> evidence for this besides the logically fallacious
> generalizing from a small, non-central sample and
> the
> argument from popularity.

Again, I just want to assume it for now. See what I
said above. All I want to do is say: ASSUMING it's
true, what would the consequences for FAI morality be?
And then I'm reasoning through the consequences. It's
just like someone exploring a given scenario.

> > If AI morality were just:
> >
> > Universal Morality
> >
> > then I postulate that the AI would fail (either it
> > could never be created in the first place, or else
> > it would not be stable and it would undergo
> > friendliness failure).
> Saying doesn't make it so. Evidence, please?

See above. I just want to assume it as an axiom, and
see what the consequences for FAI would be.

> > But there's a way to make AI's stable: add a small
> > observer-centered component. Such an AI could still
> > be MOSTLY altruistic, but now it would only be
> > following Universal Morality as an approximation,
> > since there would be an additional observer-centered
> > component.
> That's like taking a perfectly good bicycle and
> putting gum in the chain.

We don't know whether I'm right or not. Again, I'm
just considering the possibility and then looking at
the consequences.

> > So I postulate that all stable FAI's have to have
> > moralities of the form:
> >
> > Universal Morality x Personal Morality
> Saying it doesn't make it so, as much as humans are
> prone to believing something when it is repeated.
> Evidence?

See above. This is a consequence of my assumption.
It's just an exploration of a possible scenario. I'm
not saying it has to be so.

> > Now Universal Morality (by definition) is not
> > arbitrary or observer centered. There is one and
> > only one Universal Morality and it must be symmetric
> > across all sentients (it has to work if everyone
> > does it - positive sum interactions).
> This is quite possibly true (though many on SL4
> would
> argue against that)

It has to be true by definition. That's what
'normative altruism' means - IF there is a
non-observer-centered morality THEN all sentient FAI's
have to converge on this unique morality in the limit
that they thought about morality for long enough.

> > But Personal morality (by definition) can have many
> > degrees of freedom and is observer centered. There
> > are many different possible kinds of personal
> > morality, and the morality is subjective and
> > observer centered.
> Agreed.
> > The only constraint is that Personal Morality has
> > to be consistent with Universal Morality to be
> > Friendly. That's why I say that stable FAI's follow
> > Universal Morality transformed by (multiplication
> > sign) Personal Morality.
> Moralities can't be 'consistent' if they aren't
> identical.

Not true! Are you familiar with mathematics? Take a
very simple equation:

x^2 = 4 (x squared equals 4)

What values of x are consistent with this equation?

There are two different answers: -2 and 2

(-2)^2 = (-2) x (-2) = 4

2^2 = 2 x 2 = 4

Now, by analogy, there could be many different
personal moralities consistent with universal
morality. When I say that two moralities are
'Consistent with' each other, all I mean is that they
don't contradict each other.
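
As a trivial check of the arithmetic above (a throwaway sketch, just to make the point that a single constraint can admit more than one consistent solution):

```python
# One equation, two consistent values: the constraint x**2 == 4
# does not pin down a unique x.
solutions = sorted(x for x in range(-10, 11) if x ** 2 == 4)
print(solutions)  # [-2, 2]
```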

Here's an example:

Suppose Universal Morality just said: 'Thou shall not
kill'.
There are many different personal moralities
consistent with that. Here are 3 examples of
extremely simple personal moralities:

'Ice skating is good'
'Coke is good, Pepsi is evil'
'Mountain climbing is good'

Each of these 3 personal moralities is consistent with
'Thou shall not kill', provided that the people
following them don't kill anyone in the process.

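This notion of consistency can be sketched in a few lines of code (my own toy framing with made-up names, not a formalism from the post): a personal morality is consistent with Universal Morality if it never marks a universally prohibited act as good.

```python
# Toy model: Universal Morality is a set of prohibited acts;
# each personal morality is a set of acts it calls 'good'.
universal_prohibitions = {"kill"}  # 'Thou shall not kill'

personal_moralities = {
    "skater": {"ice skating"},          # 'Ice skating is good'
    "coke_fan": {"coke"},               # 'Coke is good, Pepsi is evil'
    "climber": {"mountain climbing"},   # 'Mountain climbing is good'
}

def consistent(goods, prohibitions):
    # Consistent = the personal morality never calls a universally
    # prohibited act good (the two don't contradict each other).
    return not (goods & prohibitions)

for name, goods in personal_moralities.items():
    print(name, consistent(goods, universal_prohibitions))  # all True
```

All three pass, while a personal morality containing 'killing is good' would fail the check.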
> > Now an FAI operating off Universal Morality alone
> > (which I'm postulating is impossible or unstable)
> Saying, even repeated saying, doesn't make it so. I
> need evidence!

All I'm doing here is exploring the scenario, not
trying to prove it.

> > would lead to one and only one (unique) Singularity.
> Non sequitur. AIs, even if they all have the same
> morality, can be quite different.

They wouldn't be 'quite different' if they were
entirely non-observer-centered. That was the very
point I was making! FAI's which were 100%
non-observer-centered would all be converging on the
same unique morality. It is only if we added an
observer-centered component (a 'personal morality' as
I explained above) that the FAI's would be different.

> > There would be only one possible form a successful
> > Singularity could take. A reasonable guess (due to
> > Eliezer) is that:
> >
> > Universal Morality = Volitional Morality
> Quite possibly true.

O.K, so all the FAI's would be going around fulfilling
volitional requests, so long as these requests didn't
hurt anyone else. That's a unique outcome for the
Singularity.

> > That is, it was postulated by Eli that Universal
> > Morality is respect for sentient volition (free
> > will). With no observer centered component, an FAI
> > following this morality would aim to fulfil sentient
> > requests (consistent with sentient volition). But I
> > think that such an AI is impossible or unstable.
> Repeating it doesn't make it so. Where is the
> evidence?

I'm exploring the scenario, not trying to prove it.

> > I was postulating that all stable FAI's have a
> > morality of the form:
> >
> > Universal Morality x Personal Morality
> Repeating it doesn't make it correct. Where is the
> evidence?

I'm exploring the scenario. The equation follows IF
an observer centered (personal morality) component has
to be added.

> > If I am right, then there are many different kinds
> > of successful (Friendly) Singularities.
> Agreed.
> > Although Universal Morality is unique, Personal
> > Morality can have many degrees of freedom.
> Agreed.
> > So the precise form a successful Singularity takes
> > would depend on the 'Personal Morality' component of
> > the FAI's morality.
> This is like the statement 'Have you stopped beating
> your wife?' - it implies which has not been proven,
> or
> even strongly suggested by evidence.

It's just a consequence of my original axiom. I'm just
exploring the scenario to see what it would imply.

> > Assuming that:
> >
> > Universal Morality = Volition based Morality
> >
> > we see that:
> >
> > Universal Morality x Personal Morality
> >
> > leads to something quite different.
> Agreed.
> > Respect for sentient volition (Universal Morality)
> > gets transformed (multiplication sign) by Personal
> > Morality. This leads to a volition based morality
> > with an Acts/Omissions distinction (see my previous
> > post for an explanation of the Moral Acts/Omissions
> > distinctions).
> >
> > FAI's with morality of this form would still
> > respect sentient volition, but they would not
> > necessarily fulfil sentient requests.
> Neither would a Yudkowskian FAI, for example, if
> Saddam Hussein wants to kill everybody.

Well yeah, true, a Yudkowskian FAI would of course
refuse requests to hurt other people. But it would aim
to fulfil ALL requests consistent with volition (all
requests which don't involve violating other people's
rights). But the point I was making was that there are
many such requests consistent with this. For instance,
'I want to go ice skating', 'I want a Pepsi', 'I want
some mountain climbing equipment', and so on. A
Yudkowskian FAI can't draw any distinctions between
these, and would see all of them as equally 'good'.

But an FAI with a 'Personal Morality' component would
not necessarily fulfil all of these requests. For
instance, an FAI that had a personal morality component
'Coke is good, Pepsi is evil' would refuse to fulfil a
request for Pepsi. That is NOT saying that the FAI
would stop anyone from drinking Pepsi if they wanted
to. Remember, any Personal Morality has to be
consistent with Universal Morality. If Universal
Morality said that people should be allowed to do what
they want so long as they are not hurting anyone, then
the FAI is not allowed to stop people drinking Pepsi,
even though the FAI's personal morality doesn't agree
with it. You see? The 'Personal Morality' component
would tell an FAI what it SHOULD do; the 'Universal
Morality' component is concerned with what an FAI
SHOULDN'T do. A Yudkowskian FAI would be unable to
draw this distinction, since it would have no
'Personal Morality' (remember, a Yudkowskian FAI is
entirely non-observer-centered, and so it could only
have Universal Morality). You could say that a
Yudkowskian FAI just views everything that doesn't
hurt others as equal, whereas an FAI with an extra
observer-centered component would have some extra
personal principles.
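
The SHOULD/SHOULDN'T split described above can be sketched as a toy decision rule (my own hedged illustration with hypothetical names, not anything from Eliezer's actual design): Universal Morality vetoes harmful requests outright, while Personal Morality merely declines to assist disfavoured ones without preventing them.

```python
def decide(request, harms_others, personal_evil):
    # Universal Morality: what the FAI SHOULDN'T do -- never assist harm.
    if harms_others:
        return "refuse"
    # Personal Morality: what the FAI SHOULD do -- it won't help with
    # disfavoured requests, but it also won't stop anyone pursuing them.
    if request in personal_evil:
        return "decline to assist"
    return "fulfil"

evil = {"pepsi"}  # personal morality: 'Coke is good, Pepsi is evil'
print(decide("coke", False, evil))      # fulfil
print(decide("pepsi", False, evil))     # decline to assist
print(decide("hurt Bob", True, evil))   # refuse
```

A purely Yudkowskian FAI in this toy model would have `personal_evil` empty, and so would fulfil every harmless request without distinction.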

> > Sentient requests would only be fulfilled when such
> > requests are consistent with the FAI's Personal
> > Morality.
> A good reason had better be supplied along with the
> rejections.

See above. I'm just exploring the scenario. This would
be a consequence of my assumption that all FAI's have
to have a 'Personal Morality' component.

> > So the 'Personal Morality' component would act like
> > a filter stopping some sentient requests from being
> > fulfilled. In addition, such FAI's would be pursuing
> > goals of their own (so long as such goals did not
> > violate sentient volition).
> So would a Yudkowskian or entirely volition-based
> AI-
> it would form goals that affected itself instead of
> humans, as long as the goals would lead to helping
> humanity (or sentients in general, after the
> Singularity).

Yeah, yeah, true, but an FAI with a 'Personal
Morality' would have some additional goals on top of
this. A Yudkowskian FAI does of course have the goal
'aim to do things that help with the fulfilment of
sentient requests'. But that's all. An FAI with an
additional 'Personal Morality' component would also
have the Yudkowskian goals, but with some additional
goals on top. For instance, the additional personal
morality 'Coke is good, Pepsi is evil' would lead the
FAI to personally support 'Coke' goals (provided such
goals did not contradict the Yudkowskian goals).

> > So you see, my form of FAI is a far more
> > interesting and complex beast than an FAI which
> > just followed Universal Morality.
> 'Interesting' doesn't mean better, or even possible.

My equation is actually the general solution to FAI
morality. See what I say below.

> > Eliezer's 'Friendliness' theory (whereby the AI is
> > reasoning about morality and can modify its own
> > goals
> > to try to close in on normalized 'Universal
> > Morality')
> > is currently only dealing with the 'Universal
> > Morality' component of morality.
> True- and is there any reason why it shouldn't?

No, but I'm just exploring the consequences of FAI's
with an extra 'Personal Morality' component.

My own theory is actually the more general solution of
FAI morality, because my equation has Eliezer's FAI as
a special case.

Take the equation:

Universal Morality x Personal Morality

and set Personal Morality equal to unity (1)

Then Universal Morality x 1 = Universal Morality

Such an FAI would be equivalent to the Yudkowskian one
(it only has Universal Morality). So you see, my
solution has the Yudkowskian FAI as a special case.
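
This special-case argument can be made concrete with toy scoring functions (again my own hedged sketch; the 'x' in the post is only an analogy, read here literally as multiplying scores over requests). Setting the personal score identically to 1 recovers Universal Morality alone:

```python
def universal(request):
    # Universal Morality: score 0 for requests that hurt others, 1 otherwise.
    return 0.0 if request == "hurt others" else 1.0

def personal_unity(request):
    # Personal Morality set to unity: no personal preferences at all.
    return 1.0

def personal_coke_fan(request):
    # A non-trivial personal morality: 'Coke is good, Pepsi is evil'.
    return 0.0 if request == "pepsi" else 1.0

def composed(u, p):
    # The 'Universal Morality x Personal Morality' composition.
    return lambda r: u(r) * p(r)

fai_yudkowskian = composed(universal, personal_unity)
fai_coke = composed(universal, personal_coke_fan)

print(fai_yudkowskian("pepsi"))  # 1.0 -- identical to universal alone
print(fai_coke("pepsi"))         # 0.0 -- the personal component filters it
```

With unity as the personal component, the composed morality is indistinguishable from Universal Morality, which is the sense in which the Yudkowskian FAI falls out as a special case.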

> > But if I am right, then all stable FAI have to have
> > an observer-centered (Personal Morality) component
> > to their morality as well.
> Why?

At some point I will try to prove it. Here I am just
giving the general solution to the problem of FAI
morality, and postulating that an FAI with no Personal
Morality component (Personal Morality set to unity in
my equation) would be unstable.

> > So it's vital that FAI programmers give
> > consideration
> > to just what the 'Personal Morality' of an FAI
> > should
> > be.
> Another statement based on an unproven assumption.

I've given the general solution to the problem of FAI
morality. We don't know that 'Personal Morality' set
to unity would be stable. Therefore we have to
consider the case where FAI's have to have a
non-trivial 'Personal Morality' component.

> > The question of personal values cannot be evaded if
> > non-observer-centered FAI's are impossible. Even
> > with Universal Morality, there would have to be a
> > 'Personal Morality' component which would have to be
> > chosen directly by the programmers (this 'Personal
> > Morality' component is arbitrary and
> > non-renormalizable).
> Why, again?

What I said was: 'the question of personal values
cannot be evaded' IF 'non observer centered FAI's are
impossible'. Personal morality would not be normative
(there would be no unique personal morality
'solution', so an AI could not use reason to get a
solution). We have to consider the possibility.

> > To sum up: my theory is that all stable FAI have
> > moralitites of the form:
> Evidence? You have provided no evidence.

My equation is the general solution to FAI morality,
with the Yudkowskian FAI represented by the special
case : Personal Morality set to unity (1).

> > Universal Morality x Personal Morality
> >
> > Only the 'Universal Morality' can be normalized.
> >
> >


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:46 MDT