Re: Friendliness SOLVED!

From: Thomas McCabe (
Date: Wed Mar 12 2008 - 20:01:34 MDT

On Wed, Mar 12, 2008 at 8:02 PM, Mark Waser <> wrote:
> >> Lay off the drugs.
> No drugs involved. Just a *very* complicated problem with a surprisingly
> simple solution that takes some time and effort to convey.
> The way to clearly disprove my theory (and the ability to be disproven is
> the key to any good theory) is to do one of the two following things:
> Show me how I (or an AGI) can stay true to the declaration and still perform
> a horrible *and* unethical act OR
> Show me a set of circumstances where my Friendliness declaration prevents me
> (or an AGI) from protecting myself

This is a false dichotomy. Neither I, nor other Singularitarians, nor
the AI, nor reality is obligated to choose between your two
predefined options.

> >> You are just talking about rational choice theory, which neither says
> much of anything about human action which is frequently irrational, nor does
> it ensure that an AGI would choose to be friendly.
> I am not *just* talking about rational choice theory. I am proposing a
> formulation that I argue
> will prevent an entity who implements it from performing any horrible and
> unethical act AND
> once an entity understands this formulation, that entity will see via
> rational choice theory that it is in its own self-interest to implement it

How could this possibly be in the self-interest of, say, a paperclip
optimizer? It will obviously be able to create many more paperclips if
it ignores your (to the UFAI) funny-sounding pulses in fiberoptic
cables and just turns the Earth into a big pile of microscopic
paperclips.

> >> At the very least you could have tried to prove this with symbolic logic
> or something. If you had tried, I'm sure you would find the formulas don't
> add up.
> I disagree. Prove me wrong by doing one of the two things above. That
> should be easy if my theory is as laughably wrong as you believe.

By my count, four different people have now challenged your theory, so
there are plenty of other things to say. Stop repeating this
unreasonable demand; it isn't getting anyone anywhere.

> >> Take one of Eliezer's examples of an AGI that loves smiley faces
> Trust me. I started with that example. The AGI starts with my Friendly
> supergoal of "Do not act contrary to someone's/anyone's goals unless
> absolutely necessary for the fulfillment of a reasonable/rational personal
> goal (explicitly not including generic sub-goals like money, power,
> pleasure, religion, etc.)".

Why does it even have this supergoal in the first place?

> It then recognizes that filling the universe
> with smiley faces, as awesome as it is, is also going to be to the detriment
> of all of its other goals

What other goals? An AI's goal system can be much, much simpler than a
human's. There's no reason why it has to have any other goals.

> since *every* other entity in the universe is
> going to resist rather than assist it. For a powerful enough,
> single-goal entity that is sure that it *can* overcome all other entities,
> this is not going to stop it -- but this is a fantasy edge-case that we
> should be able to easily avoid.

Actually, it is probably the default case, and a large number of us
are operating off that assumption (it's the conservative scenario).

> *Any* sufficiently intelligent multi-goal
> entity (or any entity that realizes that it is not powerful enough to take
> on *the entire universe*) is going to recognize that this path is probably
> seriously sub-optimal for it fulfilling the vast majority of its goals.
> >> Acid test over, you lose.
> I don't concur. Please try again. The easiest path is to disprove one of
> the two things above.
> And thank you for spending the time to answer.

 - Tom
