From: Ben Goertzel (firstname.lastname@example.org)
Date: Sat Feb 18 2006 - 21:46:02 MST
> If you're going to charge straight ahead and develop an unsafe system in
> the hopes you can bolt on a module that makes it safe, I've got to ask
> you, just how exactly does this module work?
This is a very fair and good question, and unfortunately I am not
going to answer it right now. The answer is a somewhat complex one,
involving ideas that have been worked out at varying levels of detail,
and I am now facing a big backlog in terms of writing down and working
out details of my AGI-related ideas. I will get to writing this
aspect up in detail, but probably not till early 2007.
I can see you might argue that it makes sense to write this aspect up
BEFORE the others, but, in terms of the particular approaches I'm
taking, the reverse approach is actually more sensible....
I will note, however, that there are many cases in computer science
where VERIFYING a solution to a problem is computationally much
simpler than FINDING the solution, and is addressable by entirely
different methods.
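A toy illustration of this verify-versus-find gap (my own example, not from the original post) is subset-sum: checking a proposed solution takes one linear pass, while finding one by exhaustive search is exponential in the input size.

```python
from itertools import combinations

def verify_subset_sum(numbers, target, certificate):
    """Cheap check: each certificate element must come from `numbers`
    (respecting multiplicity) and the certificate must sum to `target`."""
    pool = list(numbers)
    for x in certificate:
        if x in pool:
            pool.remove(x)
        else:
            return False
    return sum(certificate) == target

def find_subset_sum(numbers, target):
    """Expensive search: tries up to 2^n subsets."""
    for r in range(len(numbers) + 1):
        for combo in combinations(numbers, r):
            if sum(combo) == target:
                return list(combo)
    return None

nums = [3, 34, 4, 12, 5, 2]
cert = find_subset_sum(nums, 9)          # exponential-time search
print(verify_subset_sum(nums, 9, cert))  # linear-time verification
```

The same asymmetry is what the NP class formalizes: a solution certificate can be checked quickly even when no quick way to produce one is known.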
> What kind of module are you visualizing that's simpler than a
> full FAI and can check the output of an evolutionary programmer, and
> does this trick require constraints built into the EP module?
I am not sure if they require constraints or not.
If a constraint were required, it would simply be that the EP module
log all its intermediary results ...
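To make the logging constraint concrete, here is a hypothetical sketch (my own, not Novamente's actual EP module) of an evolutionary loop that records every intermediary population, so that a separate verifier can later audit each step without re-running the search:

```python
import random

def evolve_with_audit_log(fitness, init_pop, generations=20, seed=0):
    """Toy evolutionary-programming loop that appends every intermediary
    population to an audit log for later inspection by a verifier."""
    rng = random.Random(seed)
    population = list(init_pop)
    log = [("init", list(population))]
    for gen in range(generations):
        # Mutation: perturb each candidate slightly.
        mutants = [x + rng.uniform(-1.0, 1.0) for x in population]
        # Selection: keep the fittest half of parents plus mutants.
        pool = sorted(population + mutants, key=fitness, reverse=True)
        population = pool[: len(population)]
        log.append((f"gen{gen}", list(population)))
    return population, log

# A verifier can then scan the log rather than re-derive the search:
fitness = lambda x: -abs(x - 5.0)   # maximize closeness to 5
pop, log = evolve_with_audit_log(fitness, [0.0, 1.0, 2.0, 3.0])
```

The design choice here is simply that checking a logged trajectory step by step can be much cheaper than reproducing the search that generated it, which is the point of the verify/find asymmetry above.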
> Since the safety of the whole project depends on this verifier being
> practical, and otherwise it ends up being literally worse than nothing,
> maybe you ought to build the verifier first - just to make sure it works?
The verifier will be built before Novamente is given strong
self-modification abilities. But building the kind of Friendliness
verifier I'm thinking of is almost surely harder than building a
toddler-level AGI. So, for now, we are working toward the latter from
a practical implementation-and-testing point of view, while
approaching Friendliness verification and other more advanced topics
from a purely theoretical perspective.
The reason we do not consider this unsafe is basically that we are
quite sure our architecture will not permit a toddler-level AGI to
undergo any kind of hard takeoff. We have not formally proved this,
but then, neither have I (nor you) formally proved that my toaster
will not undergo a hard takeoff...
This archive was generated by hypermail 2.1.5 : Wed Jun 19 2013 - 04:01:11 MDT