Defining friendliness (was Re: Can't afford to rescue cows)

From: Matt Mahoney
Date: Mon Apr 28 2008 - 16:56:02 MDT

--- Stuart Armstrong wrote:

> But we still have to solve the friendliness problem long before we
> begin to worry about the details of that uploaded world...

Before we can solve it, we have to define it.

I am aware that CEV is a definition. However, it is human-centered.
Should AI grant the extrapolated wishes of animals? If so, which
species? What about embryos? What about convicts?

In the future we will have to ask much harder questions like these about
machines that are "sort of" human. What about the "original" you when
you step into a teleportation booth? What about robot slaves who look
and act human except that they only want to serve us? Does it matter
if the robot has a copy of the memories of one individual, or a blend
of many people? Should AI "free" the robot by reprogramming its
motivational system because afterwards it will be "happier" that it

I am not looking for answers to these questions, because there are
thousands more like them, and it gets tedious. I know this has been
discussed before. The usual answer is that the AI will be thousands of
times smarter than us, so it will just figure out all the right answers.
I don't buy it. The trend is in the opposite direction. In the ancient
past there was no dispute about abortion. It did not exist. Now we
dispute stem cell research and cloning. In the ancient past there was
no dispute about animal rights. They had none. Now we have PETA. In
the present there is no dispute about machine rights. They have none.
Do you really expect future machines to agree?

Friendliness is defined in the context of ethical beliefs, for example,
the "average ethical beliefs of all humans currently alive". This is
reasonable in our current world where there is a sharp line between
human and nonhuman. As the line gets fuzzier, those on the inside will
make decisions about whom to include or exclude. Do we include
uploads? Do multiple copies of uploads have more rights than a single
copy? How is this decision making process stable against growing to
include insects and nanobots or shrinking to include just a single
godlike AI?

How is volition defined when goals are programmable? I know that an AI
should not want to modify its own goals, because if it could it would
program itself to be a happy idiot. Evolution has eliminated this
capability in ourselves. But our ethics allow us to program the goals
of others. We want our children to not want to cheat, steal, or lie.
We want drug addicts to not want drugs. So what's wrong with an AI
reprogramming you to be a happy idiot?

I know about apotheosis.

I don't buy it. We feel fear, pain, and suffering because our
ancestors who didn't did not pass on their DNA. Happiness is
increasing utility, dU(x)/dt. It is mathematically bounded over finite
x, sorry. We could eliminate the source of our fears and suffering but
then we would not appreciate how much better the world is. Do you
really think you would be happier if you could have everything you
want? A simulated world with a magic genie? In the real world you are
just a happy idiot, just a program with no I/O.

-- Matt Mahoney
