RE: guaranteeing friendliness

From: H C
Date: Wed Nov 30 2005 - 16:02:00 MST

"By the way, I believe that we will create friendly AI, but
we will also (eventually) create unfriendly AI, either by
accident or by design. "

I'm just curious where you came up with that.

By my understanding, these two are mutually exclusive... in the sense that
if you actually create an FAI, it will quickly ensure that nobody creates a
UFAI (major existential risk), and if you create a UFAI first, then that
tends to imply humans are, well, screwed.


>From: "Herb Martin" <HerbM@LearnQuick.Com>
>To: <>
>Subject: RE: guaranteeing friendliness
>Date: Wed, 30 Nov 2005 14:04:44 -0800
> > -----Original Message-----
> > From: Christian Rovner
> > To:
> >
> > Richard Loosemore wrote:
> > >
> > > I repeat: why is extreme smartness capable of extreme persuasion?
> >
> > Persuasion is a special case of reality-optimization (aka
> > goal-achieving).
> >
> > If you are asking why extreme smartness is capable of
> > achieving goals,
> > then I really don't know--otherwise I would be programming
> > AI. Of course
> > this is not obvious at all.
>Persuasion is a teachable skill (to human level intelligences.)
>Much of the information for teaching and learning this skill set
>is documented in books and online.
> > If you are asking why extreme smartness is capable of achieving this
> > kind of goal in particular, I'll ask in return: Why not? Is there
> > something special about a human mind that makes it unpredictable, no
> > matter how detailed and accurate a (causal) model we use?
>Persuasion is a "statistical skill" that is best applied to
>individuals by using feedback and changes in strategy based
>on reaction to previous tactics.
>But make no mistake, persuasion is at least as teachable as
>an athletic sport skill set -- this analogy is picked
>because if you teach boxing or basketball no technique will
>be effective against an arbitrary opponent 100% of the time,
>but improvements are real and measurable. The same is
>true for persuasion.
>Even persuasion in the mass media is usually applied (today)
>by measuring feedback and adjusting the message where feasible.
>It's cheaper for advertisers to target their persuasive messages
>this way than to blindly blast the same (less than optimal)
>message, unless a threshold success rate is reached without need
>for adjustment. Those who don't reach such thresholds go out
>of business in most cases.
>The most surprising persuasion technique [to me] is that of
>"giving a reason" since it apparently does not require that the
>reason make much sense or even have any substance.
>The classic experiment was to ask to be allowed to "jump
>the line" at a copier using, (separately) "no reason", "because
>[good reason goes here]", and "because I am in a hurry" or
>"because it's important that I go first" type of 'reasons'.
>'Reasons' worked MUCH better. 'Nothing reasons' were just about
>as effective as giving real information.
>Someone mentioned hypnosis (earlier in this thread I believe),
>and this is also a teachable skill (again to human level
>intelligences.)
>[And lest anyone think that hypnosis is fantasy, its
>effectiveness has been documented scientifically for
>such uses as controlling bleeding -- a quite repeatable
>and testable result.]
>Hypnosis CAN be used to obtain behavior against the interests
>or morals of the subject but doing so is VERY difficult and
>generally requires misrepresenting the situation rather than
>directly ordering counter-interest behavior (if one expects
>reasonably reliable results.)
>We cannot even guarantee our children will grow up to be
>responsible human beings.
>We can follow generally accepted guidelines, and teach
>our children moral or ethical behavior, but we cannot
>guarantee that behavior completely; we only know that
>generally such parenting leads to children who become
>good human beings more often than to the opposite result.
>It is not likely that a "programmer" could even review
>enough of a (truly) human level intelligence to understand
>where things go wrong.
>It's not possible to create bug free software; imagine
>trying to just FIND the bugs in a large program like Microsoft
>Word, or even Windows itself (and this is true for Linux too
>but notice that Linux has the advantage of Open Source which
>is precisely what you CANNOT do if you must guarantee
>friendliness which includes guaranteeing that no one modifies
>the code in unfriendly ways.)
>When you couple this with the likelihood that human level
>intelligence will likely have neural nets (or similar nets)
>and genetic algorithms learned through training and
>adaptation and having no direct high level language
>representation then it is unlikely that the programmer
>can either read the source code OR even review ALL of it.
>Current computer programs run on hardware with approximately
>10^9 memory locations (4 x 10^9 is the current limit for most
>PCs, but most don't have all the memory that is possible nor
>can the programs use that much.) The operating systems
>alone use around one tenth (10^8) of that and it is unlikely
>that any one programmer could review just that.
>Current estimates expect that human level intelligence will
>require IN EXCESS of 10^15 memory locations -- about ten
>million times (10^7) more than current operating systems.
>Other estimates suggest it might take a thousand or more
>times as much for such intelligence levels so it is unlikely
>that anyone could ever review such a large body of code once
>it is made self-improving.
>It is practically impossible to "guarantee friendly behavior"
>OVER TIME -- to the extent that we are successful, our guard
>will tend to drop.
>Human beings are lazy -- taking security precautions against
>imaginary threats is seldom maintained. (Part of the reason
>our current security precautions against terrorists are
>doomed to failure if we don't remove the terrorists through
>offensive and strategic actions rather than purely defensive
>No rules will be 100% safe if the program learns and adapts.
>Human beings probably cannot even agree on what is friendly
>behavior -- to the religious fanatic killing you to save the
>world or praise some deity may even constitute "friendly
>behavior" from this world view. Total non-interference to
>the point of allowing suicide and other self-destructive
>behavior is likely acceptable to (most) Libertarians.
>The point here is (of course) NOT the particular beliefs used
>as examples but the fact that different programmers could
>not even agree on correct "friendly behavior".
>Doctors don't always agree on the behaviors that comply with
>the Hippocratic Oath, and that one is quite straightforward
>as human creeds go.
>And how many people would include allowing doctors to assist
>death if it relieves greater suffering, while others would
>read "Do no harm" literally and insist euthanasia is ALWAYS
>Pick any serious moral or ethical belief and you will likely
>find someone who would disagree under some particular set
>of circumstances.
>Killing is not always murder (e.g., defense of a child or other
>defenseless person from the criminally violent).
>But, notice that a Quaker might disagree with the above
>sentence -- and do so honestly and consistently.
>Allowing someone to live can constitute torture. How many
>terminal bone cancer patients are quietly helped to die?
>Should a truly friendly AI prevent human beings from
>engaging in ANY dangerous behavior (including passive
>or long term behavior like failure to take vitamins or
>overeating), or should it absolutely refuse to interfere
>with self-determination to the point of allowing suicide
>and other clearly destructive behavior?
>Most of us would expect the answer lies somewhere between
>the two extremes but few of us could agree where that
>We might even find that our answers to this question change
>over time or even day to day (and on an absolute basis, i.e.,
>separately from the context).
>By the way, I believe that we will create friendly AI, but
>we will also (eventually) create unfriendly AI, either by
>accident or by design.
>Herb Martin
