RE: Humane-ness

From: Ben Goertzel (ben@goertzel.org)
Date: Tue Feb 17 2004 - 14:39:54 MST


> We'd better make damn sure that the AGI we end up with does not lack
> the structural capability to represent those "highly complex and messy
> networks of beliefs", and renormalize them. If we can't guarantee
> that the AGI will remain humane under both state-of-ethics scenarios,
> then we should continue to improve our design. The aforementioned
> messy belief network itself is not necessarily the goal, but its
> structural accessibility to the AGI as a fundamental consideration is
> a goal.
...
> -Chris Healey

My contention is that there is almost surely NO WAY to guarantee that an AGI
will remain "humane" under radical iterated self-modification, at least
according to Eliezer's definition of "humane" (which is not that clear to me).

This is a conjecture about WHAT SORT OF PROPERTIES are going to be
potentially stable under radical iterated self-modification.

I contend that

-- specific moral injunctions like "Be good to humans"
-- messy, complex networks of patterns like "humane-ness"

will probably NOT be stable in this sense, but more abstract and crisp
ethical principles like

-- Foster joy, growth and free-choice

probably will be.

This conjecture is based on my own mathematical intuition about
self-modifying dynamical systems, and it could certainly be wrong.

I think that playing with simple self-modifying, goal-driven AI systems will
teach us enough about the dynamics of cognition under self-modification that
we'll get a much better idea of whether or not my conjecture is correct.
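To make the flavor of that kind of experiment concrete, here is a crude toy
sketch in Python (purely my own illustration, not anything from Chris's or
Eliezer's posts): a "value system" is just a unit vector, one round of
"self-modification" is a random perturbation followed by renormalization, and
we compare how well a low-dimensional "crisp" vector and a high-dimensional
"messy" vector retain their original direction. The dimensionalities, noise
level and drift measure are all arbitrary assumptions; a real experiment
would of course use an actual goal-driven AI architecture rather than a
random walk.

import numpy as np

def drift_after_self_modification(dim, steps=1000, noise=0.01, seed=0):
    """Cosine similarity between the original unit 'goal' vector and the
    vector after `steps` rounds of random perturbation + renormalization."""
    rng = np.random.default_rng(seed)
    goal = rng.normal(size=dim)
    goal /= np.linalg.norm(goal)          # start from a random unit vector
    original = goal.copy()
    for _ in range(steps):
        goal = goal + noise * rng.normal(size=dim)   # crude stand-in for one self-modification
        goal /= np.linalg.norm(goal)                 # the system "renormalizes" its values
    return float(np.dot(original, goal))

print("crisp (dim=3):   ", drift_after_self_modification(3))
print("messy (dim=3000):", drift_after_self_modification(3000))

Under these (arbitrary) settings the low-dimensional vector stays close to
where it started while the high-dimensional one wanders essentially anywhere
on its sphere, which is the sort of qualitative difference the conjecture
predicts; whether anything like that holds for real self-modifying cognitive
systems is exactly what the experiments would have to show.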

Please note, I am not arguing about whether the goal of making humane-ness
survive the Transcension in a self-modifying AI is DESIRABLE. I am arguing
that it is probably IMPOSSIBLE. So I am focusing my attention on goals that I
think are POSSIBLE, along with doing the research needed to better assess
what really is possible and what is not.

-- Ben G
