Re: [sl4] Just how coherent does CEV have to be?

From: Stuart Armstrong (dragondreaming@googlemail.com)
Date: Fri Oct 24 2008 - 09:33:09 MDT


The coherence problem will be a practical one: does the extrapolated
volition of humanity actually converge under realistic constraints
and models? We'll need more theory to settle that question.

My objection has always been to the "extrapolated" aspect of it. It
seems entirely credible that a CEV constructed from me would conclude
that humanity should be killed off for some reason. I wouldn't follow
it down that path, and I don't see why I should.

Most CEV advocates claim that simple caveats like "don't kill off all
humans" should be added. Eliezer mentioned a "final judge" who would
decide whether or not to implement the CEV, a conceptually similar
idea (though much better in practice).

But if the CEV can, in theory, reach the "wrong" decision in ways we
can guard against, it can also reach the "wrong" decision in ways we
cannot imagine. If I were an advanced AI, I could certainly come up
with a model of CEV that sounds irresistible when presented but is
actually terminally flawed (from the human perspective). How can we
be confident that the CEV will not converge on such a model? How
common are such flawed models in the space of potential CEVs? And why
should we trust a CEV any more than we trust a random
friendly-looking AI?

NB: Obviously, I don't object to medium-distance CEVs:
Medium distance: An extrapolated volition that would require extended
education and argument before it became massively obvious in
retrospect.
Only to long-distance ones:
Long distance: An extrapolated volition your present-day self finds
incomprehensible; not outrageous or annoying, but blankly
incomprehensible.

Stuart


