The Chinese finger trap

From: Rebecca (
Date: Sat Apr 26 2008 - 14:02:50 MDT

Remember Rolf's idea from a while back, the theory about cooperating with
hypothetical simulators via Newcomb's paradox?

It recently occurred to me that if this thinking is true, it may actually be
better NOT to attempt friendly AI.

As was noted earlier, if friendliness is something that will crop up
spontaneously through cooperation with a hypothetical average of possible
simulators, some of which are friendly via being correctly programmed, then
it naturally follows that FAI isn't necessary. What went unnoticed is that it may
actually be more dangerous to attempt it than not to.

If we make an unfriendly AI that wants to make pancakes, and doesn't care
about humanity, then it will (theoretically) cooperate with possible
simulators by being friendly anyway. It's no skin off its nose if humanity's
problems get solved and they live in blissful ascension and get free
t-shirts, as long as they don't get in the way of pancake production much.

But if we try to make a friendly AI, and we get it a bit wrong, such that,
say, it's built to eliminate suffering but its definition of suffering is
too extensive and includes virtually all forms of thought, then it might
very well NOT cooperate, because letting humanity survive is in direct
violation of its primary goal.

So, it seems to me we have a Chinese finger trap situation, where it's safer
to make an indifferent AI that will have nothing against cooperating with
possible higher friendly AIs than it is to try to make an actual friendly AI
and fail.

As things currently stand, I think this is something that should be taken
very seriously.
