Re: Fundamentals - was RE: Visualizing muddled volitions

From: Eliezer Yudkowsky
Date: Thu Jun 17 2004 - 11:26:49 MDT

Samantha Atkins wrote:

> So, if you fuck it up and the EV is grossly inaccurate then humanity is
> basically eternally screwed.

Yes, that is correct.

> And this is supposed to be safer than
> creating a fully conscious SAI how?

You suppose it is not possible to fuck up the task of creating an
independent humane SAI?

> The Judge of Last Resort must
> somehow be the future persons who must live under this thing.

Yes, that is one major *difference* between an independent humane SAI and a
collective volition. You can always shut off a collective volition if your
desire to do so is not muddled according to a collective volition that has
no built-in tendency to rationalize or self-protect on that type of
extrapolation. The question is one of simple fact, and it is just, "Are
the people who say 'No!' still going to agree with that decision in a few

> Else it is the biggest one time gamble in history.

I expect it's the biggest one-time gamble in the history of any given
intelligent species that makes it that far. Whichever road you choose,
it's still the biggest gamble, ever. That is why I take it seriously, and
why I think that certain others are not taking it seriously.

>> Including human infants, I assume. I'll expect you to deliver the
>> exact, eternal, unalterable specification of what constitutes a
>> "sentient" by Thursday. Whatever happened to keeping things simple?
> This is not required. Being able to shut down the optimization if it
> gets wildly out of hand is required.

Current plans call for around three off-switches:

1) The programmer verification process, viewing the dynamics;
2) The Last Judge, viewing the outcome;
3) The ability of our collective decision to shut down the collective
volition, provided our collective decision is not defined by the collective
volition as muddled.

> If this can't be done why on
> earth would anyone trust you or a hundred persons equally bright and
> with excellent intentions to not fuck it up drastically?

It has to be reduced to a pure technical issue. It's not a question of who
deserves trust. It's a question of what safeguards you can take. The
former question is not solvable and not helpful; the latter question is
solvable and helpful.

>> Could you please elaborate further on all the independent details you
>> would like to code into eternal, unalterable invariants? If you add
>> enough of them we can drive the probability of them all working as
>> expected down to effectively zero. Three should be sufficient, but
>> redundancy is always a good thing.
> Not at all. Make sure everyone is backed up at all times and give them
> free choice but with nudges/reminders/more grown-up advice at whatever
> level each one is willing/desirous of taking.

So, you and Brent Thomas can fight it out about which of these is the One
Fundamental Right we all need, and then I'll take on the winner?

> Set things up so they
> can't do themselves in ultimately (although it may look like they can
> to them). Keep it going long enough for each to grow up.

I do not think we are so wise, as to decide this thing. Too many
unforeseen consequences. We can't evaluate the real effect of our actions,
as opposed to the imagined effect.

>> It's not about public relations, it's about living with the actual
>> result for the next ten billion years if that wonderful PR invariant
>> turns out to be a bad idea.
> Instead we live with an original implementation of EV extraction and
> decision making for the next 10 billion years without any possibility of
> a reset or out? Hmm.

No, hence the whole distinction between initial dynamic, successor dynamic,

>> Not under your system, no. I would like to allow your grownup self
>> and/or your volition to object effectively.
> But that being exists only within the extrapolation which may in fact be
> erroneously formulated.

Right. If you fuck up the extrapolation, you're screwed. If you try to
build in a separate emergency exit, and you fuck up the emergency exit,
you're screwed. If you build an independent humane mind and you fuck that
up, you're screwed. Hence, keep it simple, and keep it technical, because
on moral problems you can fuck up even if everything works exactly as planned.

>> I suppose that if that is the sort of solution you would come up with
>> after thinking about it for a few years, it might be the secondary
>> dynamic. For myself I would argue against that, because it sounds
>> like individuals have been handed genie bottles with warning labels,
>> and I don't think that's a good thing.
> Genie bottle with warning labels AND the ability to recover from errors
> wouldn't be so bad.

It would be a total shift in human society and an enormous impact on every
individual. Maybe we want to take it slower than that?

>> The title of this subject line is "fundamentals". There is a
>> fundamental tradeoff that works like this: The more *assured* are
>> such details of the outcome, even in the face of our later
>> reconsideration, the more control is irrevocably exerted over the
>> details of the outcome by a human-level intelligence. This holds
>> especially true of the things that we are most nervous about. The
>> more control you take away from smarter minds, for the sake of your
>> own nervousness, the more you risk damning yourself. What if the
>> Right of Withdrawal that you code (irrevocably and forever, or else
>> why bother) is the wrong Right, weaker and less effective than the
>> Right of Withdrawal the initial dynamic would have set in place if you
>> hadn't meddled?
> Not unless you design the system so that eternal damnation is possible!
> Without the ability to opt-out or try something different or recover
> from any and all errors eternal damnation will always be possible.

Collective volition has this, and it is one of the primary motivations.
And yes, this requires that the initial dynamic work satisfactorily, just
as any other method requires that the technical solutions work satisfactorily.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
