Re: Two draft papers: AI and existential risk; heuristics and biases

From: Bill Hibbard
Date: Mon Jun 05 2006 - 12:01:05 MDT


> These are drafts of my chapters for Nick Bostrom's forthcoming edited
> volume _Global Catastrophic Risks_. I may not have much time for
> further editing, but if anyone discovers any gross mistakes, then
> there's still time for me to submit changes.
> The chapters are:
> . . .
> _Artificial Intelligence and Global Risk_
> The new standard introductory material on Friendly AI. Any links to
> _Creating Friendly AI_ should be redirected here.

In Section 6.2 you quote my ideas written in 2001 for
hard-wiring recognition of expressions of human happiness
as values for super-intelligent machines. I have three
problems with your critique:

1. Immediately after my quote you discuss problems with
neural network experiments by the US Army. But I never said
hard-wired learning of recognition of expressions of human
happiness should be done using neural networks like those
used by the army. You are conflating my idea with another,
and then explaining how the other failed.

2. In your section 6.2 you write:

  If an AI "hard-wired" to such code possessed the power - and
  [Hibbard, B. 2001. Super-intelligent machines. ACM SIGGRAPH
  Computer Graphics, 35(1).] spoke of superintelligence - would
  the galaxy end up tiled with tiny molecular pictures of
  smiley-faces?

When it is feasible to build a super-intelligence, it will
be feasible to build hard-wired recognition of "human facial
expressions, human voices and human body language" (to use
the words of mine that you quote) that exceeds the recognition
accuracy of current humans such as you and me, and will
certainly not be fooled by "tiny molecular pictures of
smiley-faces." You should not assume such a poor
implementation of my idea that it cannot make
discriminations that are trivial to current humans.

3. I have moved beyond my idea for hard-wired recognition of
expressions of human emotions, and you should critique my
recent ideas where they supersede my earlier ideas. In my
2004 paper:

  Reinforcement Learning as a Context for Integrating AI Research,
  Bill Hibbard, 2004 AAAI Fall Symposium on Achieving Human-Level
  Intelligence through Integrated Systems and Research

I say:

  Valuing human happiness requires abilities to recognize
  humans and to recognize their happiness and unhappiness.
  Static versions of these abilities could be created by
  supervised learning. But given the changing nature of our
  world, especially under the influence of machine
  intelligence, it would be safer to make these abilities
  dynamic. This suggests a design of interacting learning
  processes. One set of processes would learn to recognize
  humans and their happiness, reinforced by agreement from
  the currently recognized set of humans. Another set of
  processes would learn external behaviors, reinforced by
  human happiness according to the recognition criteria
  learned by the first set of processes. This is analogous
  to humans, whose reinforcement values depend on
  expressions of other humans, where the recognition of
  those humans and their expressions is continuously
  learned and updated.
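
The two interacting processes in the passage above can be sketched
roughly as follows. This is only an illustration, not code from the
2004 paper: the class names, the linear scorer, the epsilon-greedy
learner, and the toy world (three actions, each producing a fixed
"expression" vector that humans label as happy or not) are all
assumptions made for the example.

```python
import random


class HappinessRecognizer:
    """First process: learns to score happiness from expression features.

    Hypothetical stand-in: a linear scorer trained online, where the
    training signal is agreement or disagreement from recognized humans.
    """

    def __init__(self, n_features, lr=0.1):
        self.weights = [0.0] * n_features
        self.lr = lr

    def score(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, human_label):
        # human_label: +1.0 if humans agree "this is happiness", else -1.0.
        error = human_label - self.score(features)
        self.weights = [w + self.lr * error * x
                        for w, x in zip(self.weights, features)]


class BehaviorLearner:
    """Second process: learns which action to take, rewarded by the
    recognizer's current happiness score rather than a fixed one."""

    def __init__(self, n_actions, lr=0.1, epsilon=0.1, seed=0):
        self.values = [0.0] * n_actions
        self.lr = lr
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        # Epsilon-greedy: mostly exploit, occasionally explore.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        self.values[action] += self.lr * (reward - self.values[action])


def run(steps=500):
    # Toy world (assumed): each action yields a fixed expression vector,
    # and humans label only action 1's expression as happy.
    expressions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    human_labels = [-1.0, 1.0, -1.0]
    recognizer = HappinessRecognizer(n_features=3)
    learner = BehaviorLearner(n_actions=3)
    for _ in range(steps):
        action = learner.choose()
        features = expressions[action]
        # Behavior is reinforced by happiness as currently recognized.
        learner.update(action, recognizer.score(features))
        # Recognition is reinforced by agreement from humans.
        recognizer.update(features, human_labels[action])
    return recognizer, learner
```

Calling run() returns both learners; in this toy setup the behavior
learner's reward changes over time because the recognition criteria
it depends on keep being updated, which is the "dynamic" property the
passage contrasts with static supervised learning.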

And I further clarify and update my ideas in a 2005
on-line paper:

  The Ethics and Politics of Super-Intelligent Machines

Please adjust your discussion of my ideas to:

  1. Not conflate my ideas with others.
  2. Not assume a poor implementation of my ideas.
  3. Not critique my old ideas when they have been
     replaced by newer ideas in my publications.

Thank you,