From: Stuart Armstrong (firstname.lastname@example.org)
Date: Tue Mar 11 2008 - 06:46:38 MDT
I've recently finished a paper on a possible method for designing the
moral system of a developing AI. The paper is at
http://www.neweuropeancentury.org/GodAI.pdf , and it is too long to copy
People recommended that I put it on this list, and beg for comments on
it. Any such comments are most welcome, including recursive comments
(i.e. comments on where to go to get people's comments). Especially
needed would be advice to put in section 7, which lists the ways in
which the paper could be wrong.
The basic assumptions are: self-improving AI is possible. These
machines will be far beyond what we can imagine, and will have a basic
understanding of humans and human language.
The paper argues that simple solutions, such as a single wish, or even
a more complicated coherent extrapolated volition goal, will not be
enough to build a friendly AI.
The entire idea of limited humans guiding a supremely intelligent AI
(a GodAI, to use the term in the paper) seems ridiculous.
Nevertheless, the paper argues that it can be done. The
trustworthiness of the AI will be checked at each step by the AI
itself, and by a "chain" of progressively lesser intelligences stretching
back down to us.
The interactions between the GodAI and humans will be very
complicated, but will mainly consist of the GodAI figuring out the
likely consequences of particular moral systems, and humans figuring
out whether this is an acceptable result. This plays to the asymmetry
in human cognitive abilities: our complete cluelessness at
predicting the consequences of certain decisions, but our skill at
judging whether the consequences are sound. The GodAI will complement
our deficiencies, and, with open and imaginative minds, we can guide
The procedure does not fail catastrophically if the intelligence of
the top AI turns out to be limited; in fact, it is rather suited to
such an eventuality. It is also robust.
The procedure is, however, both political and imperfect: political
because the end result will depend on decisions made by humans along
the way, and on the manner in which these decisions are made, and
imperfect because the aim is not to find the single most ideal
solution, but just a suitably great one (and, not incidentally, a