Re: ITSSIM (was Some new ideas on Friendly AI)

From: David Hart (dhart@atlantisblue.com.au)
Date: Mon Feb 21 2005 - 23:12:09 MST


Hi Ben,

I understand how ITSSIM is designed to "optimize for S", and also how
it might work in practice with one of many possible qualitative
definitions of "Safety": the concept that if we [humans] desire
that our mind-offspring respect our future "growth, joy and choice",
then each incrementally improved generation N+1 should want the same
for itself and its own mind-offspring.

In such a system, supergoals (e.g., CV) and their subgoals,
interacting with their environments, generate A (the set of possible
actions), to which R (the safety rule) is applied, as sketched below.
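
A minimal toy sketch of that goal -> A -> R pipeline, in Python
(everything here -- SAFETY_MARGIN, estimate_safety, the World and
Action types -- is a placeholder of my own invention, not anything
from Ben's proposal):

    from dataclasses import dataclass
    from typing import List, Optional

    SAFETY_MARGIN = 0.01  # placeholder: acceptable drop in estimated S

    @dataclass
    class Action:
        name: str
        utility: float        # contribution toward the supergoal SG
        safety_delta: float   # simulated effect on S

    @dataclass
    class World:
        safety: float  # toy stand-in; a real S is far harder to estimate

        def simulate(self, action: Action) -> "World":
            # hypothetical one-step simulation of taking the action
            return World(self.safety + action.safety_delta)

    def estimate_safety(world: World) -> float:
        return world.safety

    def safety_rule_R(action: Action, world: World) -> bool:
        """R: permit an action only if simulation says it will not
        reduce estimated safety S by more than the margin."""
        return (estimate_safety(world.simulate(action))
                >= estimate_safety(world) - SAFETY_MARGIN)

    def choose_action(candidates: List[Action],
                      world: World) -> Optional[Action]:
        # A -> filter by R -> pick the SG-best survivor
        safe = [a for a in candidates if safety_rule_R(a, world)]
        return max(safe, key=lambda a: a.utility, default=None)

For example, choose_action([Action("a1", 1.0, -0.5), Action("a2", 0.3,
0.0)], World(1.0)) rejects the higher-utility a1 because simulation
predicts too large a drop in S, and returns a2 instead.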

I'm very curious to learn how S and SG might interact -- might one
eventually dominate the other, or might they become co-attractors?

Of course, we're still stuck with quantifying this and the other
candidate definitions of "Safety", including acceptable margins.

NB: I believe we cannot create an S or an SG that is provably
invariant, but that both should be cleverly designed to have the
highest probability of being invariant over the largest possible |U|
we can muster computationally (to our best knowledge, for the longest
possible extrapolation, which may, arguably, still be too puny to be
comfortably "safe" or "friendly").

Perhaps the matrix of S_B, S_E, S_N and SG_B, SG_E, SG_N should duke
it out in simulation (a toy harness is sketched below). At some point,
though, we will simply need to choose our S and our SG and take our
chances, taking into account the probability that Big Red, True Blue,
et al., may not have terribly conservative values for S or SG slowing
their progress. :-(
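
Something like the following could pit every candidate S against every
candidate SG (run_simulation and its scoring are pure placeholders of
mine; the real content of each S and SG is the whole open problem):

    import itertools
    import random

    S_variants = ["S_B", "S_E", "S_N"]
    SG_variants = ["SG_B", "SG_E", "SG_N"]

    def run_simulation(s_name: str, sg_name: str, seed: int = 0) -> float:
        """Placeholder harness: score how well a given S holds up when
        paired with a given SG over a simulated history."""
        rng = random.Random(f"{s_name}|{sg_name}|{seed}")
        return rng.random()  # stand-in for a real outcome metric

    # every S duking it out with every SG
    for s_name, sg_name in itertools.product(S_variants, SG_variants):
        print(s_name, sg_name, round(run_simulation(s_name, sg_name), 3))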

David


