Re: Chaining God

From: Rolf Nelson (rolf.h.d.nelson@gmail.com)
Date: Tue Mar 11 2008 - 20:08:02 MDT


I can think of definitions of trustworthy that are
useful-to-have-implemented (like "won't kick off a line of descendants that
will eventually kill me") and definitions of trustworthy that are
practical-to-measure (like "won't stab me in the next 30 seconds"). Which do
you mean when you use the word "trustworthy" in the paper?

Same thing with "intelligence": is intelligence "the ability to create
answers that human questioners will express approval of, assuming these
questioners are given the instructions to evaluate the answers as
objectively as possible", or is it the ability to accomplish long-term
goals? (Do these AIs have any long-term goals in the first place?)

As you know, the power of an induction approach usually comes from an
invariant: if the invariant is true at step k, it is also true at step
k+1. For example, each AI in the chain passes its initial Intelligence Test
at least as well as its predecessor did, so "Measured IQ > x" is an
invariant. Are there supposed to be other such invariants in the
system that I'm missing, hidden among the words like 'honest',
'trustworthy', and 'safe'? In other words, why do you believe the proposed
system wouldn't take a random walk away from "humanity lives" scenarios and
towards "humanity dies" scenarios?
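
To make the worry concrete, here is a minimal toy sketch of my own (not
anything from the paper). It assumes each AI has one measured property, `iq`,
and one unmeasured property, `safety`, and that a successor is accepted only
if its measured IQ is at least its predecessor's. The names `run_chain`, `iq`,
and `safety`, the Gaussian step sizes, and the acceptance rule are all
hypothetical choices for illustration.

```python
import random


def run_chain(steps, seed=0):
    """Toy chain: only the measured property is checked at each step."""
    rng = random.Random(seed)
    iq, safety = 100.0, 0.0
    history = [(iq, safety)]
    for _ in range(steps):
        # A candidate successor differs from its parent in both properties.
        cand_iq = iq + rng.gauss(0.0, 5.0)
        cand_safety = safety + rng.gauss(0.0, 1.0)
        # Acceptance rule (assumed): look only at the measured property.
        if cand_iq >= iq:
            iq, safety = cand_iq, cand_safety
        history.append((iq, safety))
    return history


if __name__ == "__main__":
    for step, (iq, safety) in enumerate(run_chain(20, seed=1)):
        print(f"step {step:2d}  iq={iq:7.2f}  safety={safety:6.2f}")
```

In this toy, "measured IQ never decreases" holds by construction, because the
acceptance rule enforces it. Nothing enforces anything about `safety`, so it is
free to wander in either direction with each accepted successor.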

The chaining system looks isomorphic to a subset of self-improving systems,
where the step

A -> A + A' (AI A creates AI A' and continues running)

maps to

A -> [C + A + A'] (AI A rewrites its code to become a different AI, which is
itself a composite of A, A', and a control module C that gives A veto power
over A').

Framed this way, is there a short explanation of why this limitation (on how
an AGI can modify itself) is helpful? If this limitation is required to keep
some kind of invariant, it might be better to specify the invariant
directly, and then show how this limitation is guaranteed to preserve the
invariant in a way that no larger subset of self-modifications can be
guaranteed to preserve it.
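
As a rough sketch of what I mean by the composite [C + A + A'], under my own
assumptions: A' proposes an action, A gets to veto it, and the control module
simply arbitrates. The names `Composite`, `old_agent`, `old_veto`, and
`new_agent` are hypothetical, and falling back to A's own choice after a veto
is just one possible design, not something the paper specifies.

```python
from dataclasses import dataclass
from typing import Callable

Action = str
Agent = Callable[[str], Action]        # maps a situation to a proposed action
Veto = Callable[[str, Action], bool]   # True means "block this action"


@dataclass
class Composite:
    """[C + A + A']: A' proposes, A may veto, and act() plays the role of C."""
    old_agent: Agent   # A
    old_veto: Veto     # A's veto power over A'
    new_agent: Agent   # A'

    def act(self, situation: str) -> Action:
        proposal = self.new_agent(situation)
        if self.old_veto(situation, proposal):
            # Assumed fallback: a vetoed proposal is replaced by A's choice.
            return self.old_agent(situation)
        return proposal


if __name__ == "__main__":
    composite = Composite(
        old_agent=lambda s: "wait",
        old_veto=lambda s, a: a == "harm",
        new_agent=lambda s: "harm" if s == "conflict" else "help",
    )
    print(composite.act("calm"))
    print(composite.act("conflict"))
```

Seen this way, the composite is just one particular self-modification, and the
only property it obviously preserves is "A never sees an action it would have
vetoed get executed".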

-Rolf


