Re: guaranteeing friendliness

From: sam kayley
Date: Tue Nov 29 2005 - 19:03:09 MST

----- Original Message -----
From: "Richard Loosemore" <>
To: <>
Sent: Wednesday, November 30, 2005 1:13 AM
Subject: Re: guaranteeing friendliness

> The short summary of my responses (laid out in detail below) is that you
> have only repeated your assertion that a very smart AGI would
> "obviously" be able to convince us to do anything it wanted. You have
> given no reason to believe in this other than you, personally, declaring
> it to be a rejected idea.
> I repeat: why is extreme smartness capable of extreme persuasion?
> This is not even slightly obvious.

Are there flaws in the human mind such that some sequence of sensory inputs
will cause a person to do something they would not otherwise intend to do?

These could range from epilepsy triggered by flashing lights, which requires
timing control and neural-level knowledge of humans, to deceptive arguments
that take advantage of the kinds of heuristics humans use to evaluate the
plausibility of arguments, to emotional manipulation.

There is quite a range here, and the tricks require different types and
amounts of information about humans (not all of which has to come from
conversation, since the initial architecture of the AI says something about
human thought processes), information about the particular victim, access to
sensory channels, and ways of thinking on the part of the AI to exploit them.
It only takes one trick that works, whether or not it is known to us yet, and
the AI has won.

I am not convinced that a persuasive trick discoverable and usable from
inside an AI box exists. I am also not sure that one doesn't. Given the
stakes, erring on the side of caution seems sensible.

And if discovering and using such a trick is possible in theory, by
definition a superintelligent AI can do it.
