RE: ITSSIM (was Some new ideas on Friendly AI)

From: Ben Goertzel (ben@goertzel.org)
Date: Tue Feb 22 2005 - 13:52:40 MST

Next message: Ben Goertzel: "RE: ITSSIM (was Some new ideas on Friendly AI)"
Previous message: Eliezer S. Yudkowsky: "A Bay Area transhumanist get-together at 5PM, Sunday March 6th?"
In reply to: David Hart: "Re: ITSSIM (was Some new ideas on Friendly AI)"
Next in thread: Ben Goertzel: "RE: ITSSIM (was Some new ideas on Friendly AI)"
Reply: Ben Goertzel: "RE: ITSSIM (was Some new ideas on Friendly AI)"
Reply: David Hart: "Re: ITSSIM (was Some new ideas on Friendly AI)"
Reply: Thomas Buckner: "HTML E-mail"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hi,

Your last paragraph indicates an obvious philosophical (not logical)
weakness of the ITSSIM approach as presented.

It is oriented toward protecting against danger from the AI itself, rather
than other dangers. Thus, suppose

-- there's a threat that has a 90% chance of destroying ALL OF THE UNIVERSE
except for the AI itself, which it will almost certainly leave intact
-- the AI could avert this attack but in doing so it would make itself
slightly less safe (slightly less likely to obey the ITSSIM safety rule)

Then, following the ITSSIM rule, the AI will let the rest of the universe
get destroyed, because every action that would avert the threat decreases
its amount of safety.
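
To make the scenario concrete, here is a minimal sketch in Python. It
assumes a simplified reading of the rule (an action is allowed only if it
does not lower the AI's estimated safety), which is not the full ITSSIM
formulation. The names Action, permitted and choose are hypothetical, and
every number except the 90% figure is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    ai_safety: float       # assumed: estimated chance the AI keeps obeying its safety rule
    world_survival: float  # assumed: estimated chance the rest of the universe survives


def permitted(actions, baseline_safety):
    """Keep only the actions that do not lower the AI's own safety estimate."""
    return [a for a in actions if a.ai_safety >= baseline_safety]


def choose(actions, baseline_safety):
    """Among the permitted actions, pick the best outcome for everything else."""
    allowed = permitted(actions, baseline_safety)
    return max(allowed, key=lambda a: a.world_survival)


BASELINE = 0.999  # illustrative current safety level

ACTIONS = [
    # Doing nothing keeps the AI exactly as safe as before; the threat then
    # has its 90% chance of destroying everything else.
    Action("do_nothing", ai_safety=0.999, world_survival=0.10),
    # Averting the threat makes the AI slightly less safe.
    Action("avert_threat", ai_safety=0.998, world_survival=0.99),
]

chosen = choose(ACTIONS, BASELINE)
print(chosen.name)
```

Under these assumptions the filter throws out avert_threat before outcomes
for the rest of the universe are ever compared, so the only candidate left
is do_nothing.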

Unfortunately, I can't think of any clean way to get around this problem --
yet. Can you?

-- Ben

  -----Original Message-----
  From: owner-sl4@sl4.org [mailto:owner-sl4@sl4.org]On Behalf Of David Hart
  Sent: Tuesday, February 22, 2005 1:12 AM
  To: sl4@sl4.org
  Subject: Re: ITSSIM (was Some new ideas on Friendly AI)

Hi Ben,

I understand how ITSSIM is designed to "optimize for S", and also how it
might work in practice with one of many possible qualitative definitions
of "Safety": the concept that if we [humans] desire that our
mind-offspring respect our future "growth, joy and choice", the next N+1
incrementally improved generation should want the same for themselves and
their mind-offspring.

In such a system, supergoals (e.g., CV) and their subgoals,
interacting with their environments, generate A (possible actions), to which
R (safety rule) is applied.

I'm very curious to learn how S and SG might interact -- might one
eventually dominate the other, or might they become co-attractors?

Of course, we're still stuck with quantifying this and other definitions
for "Safety", including acceptable margins.

NB: I believe we cannot create an S or an SG that is provably invariant,
but that both should be cleverly designed with the highest probability of
being invariant in the largest possible |U| we can muster computationally
(to our best knowledge for the longest possible extrapolation, which may,
arguably, still be too puny to be comfortably "safe" or "friendly").

Perhaps the matrix of SB, SE, SN and SGB, SGE, SGN should duke it out in
simulation. Although, at some point, we will simply need to choose our S and
our SG and take our chances, taking into account the probability that Big
Red, True Blue, et al., may not have terribly conservative values for S or SG
slowing their progress. :-(

David


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:50 MDT