Re: Recipe for CEV (was Re: Morality simulator)

From: Matt Mahoney (matmahoney@yahoo.com)
Date: Sat Nov 24 2007 - 18:40:18 MST


--- "Eliezer S. Yudkowsky" <sentience@pobox.com> wrote:

> Matt Mahoney wrote:
> > I agree it will wipe out the human race, but I disagree that compression is
> > not CEV. CEV is a definition of friendliness, not a solution.
> >
> > Using http://www.overcomingbias.com/2007/11/complex-wishes.html as an
> > example, a good data compressor will know what you mean when you say "get
> > my mother out of the burning building", in the sense that if
> >
> > s1 = "'Do you mean to get her out alive?' 'Yes'"
> > s2 = "'Do you mean to get her out alive?' 'No'"
> >
> > then string s1 has a higher probability, and therefore a shorter code
> > length than s2, and likewise for related questions with common sense
> > answers, because common sense knowledge improves compression of dialog
> > with common sense answers.
>
> If you define the extrapolation of a 'Yes' answer in dialogue as your
> utility function, that creates instrumental pressure to emit extremely
> persuasive arguments or even to simply reprogram brains to answer
> 'Yes', there being no hard line between these two options.

The utility function I propose is the negative of the size of a program D that
outputs a fixed data set X, where X is a large amount of data of human origin,
such as 1 GB of text, or the Internet, or 10^11 years of video.

Without loss of generality we may implement D as a model P and an arithmetic
coder, and write a corresponding compressor C using the same model P, such
that D(C(X)) = X. In this case the utility function is U(X) = -(|D| + |C(X)|),
the negative of the size of the decompressor D plus the compressed data C(X),
where |C(X)| is within about one bit of -log2(P(X)).
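
To make the arithmetic concrete, here is a minimal sketch in Python (my
illustration, not part of the proposal above). It assumes an order-0 adaptive
byte model with add-one smoothing as the model P; an ideal arithmetic coder
would produce output within about one bit of -log2(P(X)):

  import math

  def code_length_bits(X: bytes) -> float:
      """Ideal arithmetic-coded length of X in bits, i.e. -log2 P(X),
      under an order-0 adaptive byte model with add-one smoothing."""
      counts = [1] * 256   # add-one smoothing: every byte starts with count 1
      total = 256
      bits = 0.0
      for b in X:
          bits += -math.log2(counts[b] / total)  # -log2 P(next byte | bytes so far)
          counts[b] += 1                         # update the model after coding
          total += 1
      return bits

  def utility(X: bytes, decompressor_size_bits: int) -> float:
      """U(X) = -(|D| + |C(X)|), taking |C(X)| as -log2 P(X)."""
      return -(decompressor_size_bits + code_length_bits(X))

A stronger model P, one that captures grammar, facts, and common sense,
assigns higher P(X) to data of human origin, shortens C(X), and so raises
U(X).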

My claim is that the probability that P(s1|X) > P(s2|X), or equivalently that
|C(X,s1)| < |C(X,s2)| (the compressed length of X followed by s1 versus X
followed by s2), increases as U(X) increases, as |X| increases, or as the
"quality" of X increases, and that the same holds for similar questions
requiring common sense. Quality is hard to define, but roughly I mean data
encoding information important to humans, for example the text of high quality
books and articles as opposed to random noise.
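
As a toy illustration of the comparison (my addition, not from the original
post), the sketch below uses zlib as a crude stand-in for the model P and a
hypothetical file "corpus.txt" standing in for X. zlib obviously lacks the
common sense knowledge the claim requires, but any compressor defines an
implicit model in the same way:

  import zlib

  def preferred_continuation(X: bytes, s1: bytes, s2: bytes) -> bytes:
      """Return whichever continuation compresses better when appended to X.
      A smaller |C(X,s)| corresponds to a larger implicit P(s|X)."""
      len1 = len(zlib.compress(X + s1, 9))
      len2 = len(zlib.compress(X + s2, 9))
      return s1 if len1 <= len2 else s2

  X = open("corpus.txt", "rb").read()   # hypothetical corpus standing in for X
  s1 = b"'Do you mean to get her out alive?' 'Yes'"
  s2 = b"'Do you mean to get her out alive?' 'No'"
  print(preferred_continuation(X, s1, s2))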

The model P will distinguish between descriptions (in words or pictures) of
friendly and unfriendly behavior by assigning higher probabilities to the
friendly descriptions. This is different from distinguishing between friendly
and unfriendly behavior itself; I don't claim that such a thing is possible.

-- Matt Mahoney, matmahoney@yahoo.com


