Re: What if there's an expiration date? (was Re: Arbitrarily decide who benefits)

From: Nick Tarleton (nickptar@gmail.com)
Date: Mon Apr 28 2008 - 16:36:36 MDT


On Mon, Apr 28, 2008 at 1:37 PM, Tim Freeman <tim@fungible.com> wrote:
> Has anyone tried to formalize this? It doesn't make sense when I look
> at it closely. For example, my scheme [1] has an expiration date. It
> takes action to optimize its utility function at a specific time
> that's determined at the beginning, and when it's done, it's done. It
> might then have set up some mechanism to be run again with a different
> expiration date, or the humans might decide to start it again, or it
> or the humans might run something else. Any of these, if it
> happens, would be a natural consequence of the AI having put the
> world into the state it thinks best at the expiration date, not a
> continuation of the planning process it was doing before the
> expiration date.

So the AI will see no problem with doing something that destroys the
world a day after expiration, provided it's helpful now and humans
aren't aware of it (awareness would make us suffer, and suffering
before the cutoff does count against the utility function).
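
To make that failure mode concrete, here is a minimal sketch of a
sharp-cutoff objective in Python. Everything in it (the names, the toy
world model, the numbers) is hypothetical illustration, not Tim's
actual scheme: a planner scored only on the world state at a fixed
time T prefers a plan with a slightly higher score at T even when that
plan destroys the world at T + 1.

    # Hypothetical illustration of a sharp expiration-date objective.
    # This is not Tim's actual scheme; it just shows why a planner
    # scored only on the world state at time T is blind to everything
    # that happens after T.

    T = 10  # expiration date, fixed at the beginning

    def utility(state):
        """Toy utility: high resources are good, but humans knowing
        about a looming catastrophe would make them suffer now, which
        scores zero."""
        if state["humans_know_about_bomb"]:
            return 0.0
        return state["resources"]

    def run(plan, steps):
        """Toy world model: apply the plan step by step, return the
        trajectory of states."""
        state = {"resources": 0.0, "humans_know_about_bomb": False,
                 "world_destroyed": False}
        trajectory = []
        for t in range(steps):
            state = plan(dict(state), t)
            trajectory.append(state)
        return trajectory

    def honest_plan(state, t):
        state["resources"] += 1.0
        return state

    def time_bomb_plan(state, t):
        # Grab extra resources now, and keep the side effect hidden
        # (humans_know_about_bomb stays False, so nobody suffers yet).
        # The bomb goes off one step after the objective stops looking.
        state["resources"] += 2.0
        if t == T:  # one step past the expiration date
            state["world_destroyed"] = True
        return state

    for plan in (honest_plan, time_bomb_plan):
        traj = run(plan, T + 2)
        print(plan.__name__,
              "| score at expiration:", utility(traj[T - 1]),
              "| world destroyed later:", traj[-1]["world_destroyed"])

The cutoff objective rates time_bomb_plan strictly higher than
honest_plan (20 vs. 10 at the expiration date) while being blind to
the destruction one step later.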

> If there's no expiration date, I'm concerned about indefinitely
> deferred gratification, where the AI is always putting off giving
> people what they want so it can give them more of what they want
> later, whenever it thinks it can invest the utility points better
> than the humans can.

This is a potential problem. I don't know how to solve it, but a sharp
cutoff is Not The Way. Even exponential discounting would be superior,
despite its very implausible normative implications:
http://www.overcomingbias.com/2008/01/against-discoun.html
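
For contrast, a minimal sketch of the two weighting schemes (the
discount factor and reward streams are made-up numbers, purely for
illustration): a sharp cutoff weights each step by 1 up to T and 0
after, so post-T outcomes are worth exactly nothing, while exponential
discounting weights step t by gamma**t, which shrinks but never
reaches zero, so a catastrophe after T still costs something.

    # Hypothetical comparison of a sharp cutoff with exponential
    # discounting. GAMMA and the reward streams are made up.

    GAMMA = 0.99   # per-step discount factor
    T = 10         # expiration date for the sharp cutoff

    def sharp_cutoff_value(rewards):
        """Weight 1 before the expiration date, 0 after it."""
        return sum(r for t, r in enumerate(rewards) if t < T)

    def discounted_value(rewards):
        """Weight GAMMA**t at step t: shrinks, never exactly zero."""
        return sum((GAMMA ** t) * r for t, r in enumerate(rewards))

    # "Time bomb": slightly better payoff before T, catastrophe after.
    honest    = [1.0] * 100
    time_bomb = [1.1] * 11 + [-1000.0] + [0.0] * 88

    for name, stream in [("honest", honest), ("time_bomb", time_bomb)]:
        print(name,
              "| cutoff:", round(sharp_cutoff_value(stream), 2),
              "| discounted:", round(discounted_value(stream), 2))

The cutoff prefers the time bomb (11.0 vs. 10.0); discounting
strongly penalizes it (about -884 vs. +63). Discounting also bounds
the deferred-gratification problem from the quote above: deferring a
payout one more step only wins while the per-step investment return
exceeds 1/GAMMA, whereas with no discounting any return above 1 means
deferring forever.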


