Re: Domain Protection

From: Sebastian Hagen (sebastian_hagen@gmx.de)
Date: Tue May 10 2005 - 06:01:53 MDT


Russell Wallace wrote:
>>Either 'sentient life' matters to humanity in general, in which case I'd
>>expect the Extrapolated Collective Volition of humanity to ensure that
>>it continues to exist (or fail harmlessly in case of nonconvergence,
>>computational intractability, etc)
>
>
> Or humanity cares about the continued existence of sentient life and
> takes decisions it _thinks_ will ensure that; and the RPOP flashes a
> picture on the screen that lets the Last Judge convince himself it
> will do so; and it turns out otherwise.
That's a possible outcome. But assuming that ensuring the infinitely
continued existence of sentience is even physically possible (our
universe may or may not allow for an infinite amount of computation to
be carried out), I'd guess that a CV that doesn't clearly and explicitly
fail would have a better chance of making it work than unupgraded humans
implementing a Domain Protection scheme. (Yes, guessing is unfortunately
what I'm doing here; I'd like to use math, but the empirical data about
these cases is rather limited.) More precisely, I consider the expected
utility in the first case to be higher.

>>or it doesn't, in which case I don't
>>have any particular reason to care if it is optimized away.
>
>
> So you don't personally care if everything of value is destroyed?
No; I don't personally care if everything that a technically valid CV of
humanity decides is not worth preserving is destroyed. I'm in no
position to determine whether 'sentience' matters; I don't even really
understand what it is. If the CV decided that it doesn't matter (not
that I consider that especially likely), and the last judge agreed, I'd
accept these decisions as most likely right.

>>The conventional assumption in this case is that unupgraded humans, even
>>with some predictive help from a RPOP, are not capable of reliably
>>making good decisions about rules that are supposed to stay in force
>>forever.
>
>
> Conventional assumption? What convention would this be?
The one of assuming the worst reasonably likely possibility. To quote
<http://homepage.eircom.net/~russell12/dp.html>: "We don't really know
what we're doing - we're fallible human beings, we don't have the gift
of precognition." - Indeed we are, and the assumption that leads to safe
failures in this case is that our cognitive deficiencies are sufficient
to prevent us from reliably making good decisions about rules supposed
to stay valid indefinitely (we may still make them by accident, though
the probability of that happening in any individual case isn't exactly
great). Making any other assumption about our current abilities creates
the possibility of an unsafe failure on this point.

>>This applies to the rules for individual domains, but it also
>>applies to the whole idea of domain protection.
>
> If you have a better idea, it's not too late.
About rules supposed to stay valid indefinitely? Heck, no - I'm just an
ordinary human being; whatever I come up with would likely lead to
catastrophic results in the long term.

>>> If there are critical differences in the values of different subsets of
>>> the population the CV still has the option to implement an individual or
>>> domain protection system with permanently fixed rules; it would be in a
>>> much better position to make that decision than present-day humans.
>
>
> On what basis do you claim that?

Quoted from <http://sl4.org/wiki/CollectiveVolition>:
"In poetic terms, our collective volition is our wish if we knew more,
thought faster, were more the people we wished we were, had grown up
farther together; where the extrapolation converges rather than
diverges, where our wishes cohere rather than interfere; extrapolated as
we wish that extrapolated, interpreted as we wish that interpreted.
...
I arbitrarily declare the poetic term "think faster" to also cover
thinking smarter, generalizing to straightforward transforms of existing
cognitive processes to use more computing power, more neurons, et cetera."

This specifies general enhancements to the cognitive abilities of
humans, an increase in their knowledge about reality, and an iterated
improvement of whatever other attributes the extrapolated humans would
like to have improved.
According to widely available empirical data, both better cognitive
abilities and more available knowledge lead to better decisions. A
transhuman can manipulate reality more effectively and efficiently than
a being with human-level intelligence. A human expert (i.e. a human with
more knowledge) on a specific subject will usually outperform a novice
of similar general intelligence. There are examples of negative
correlation between knowledge and effective ability, but those are
caused by fixable bugs in the human cognitive architecture.

Generalizing from a lot of real evidence, I'd expect the output from a
(working) CV process to be significantly better than that of any group
of unupgraded humans. If Domain Protection is the implementation with
the highest expected utility, a CV implementation will either fail
safely (assuming the friendliness architecture works, as has been
mentioned) or decide on implementing Domain Protection.

Sebastian Hagen


