From: Benja Fallenstein (firstname.lastname@example.org)
Date: Mon May 18 2009 - 20:34:29 MDT
On Sun, May 17, 2009 at 10:57 PM, Matt Mahoney <email@example.com> wrote:
> I defined an agent that believes it is immortal as one that has the goal of maximizing accumulated reward r(t) for t from now to infinity. If instead an agent believes it will die at time T, then it would rationally have the goal of maximizing accumulated reward summed from t to T. For a known environment, it would behave the same regardless of r(t) for all t > T.
> The question is whether you can create an environment that can distinguish the two cases. Assume the agent knows the environment you choose and is rational, i.e. it is able to optimally solve the problem of accumulating reward over its (expected finite or infinite) lifetime for your chosen environment. You do not know T.
On day 1, give the agent a choice between two contracts, A and B. A
pays $1 every day until the end of time. B offers no recurrent
payments, but on every successive day the agent can terminate the
contract; if the agent terminates the contract on day n, it receives a
one-time payment of $(n+1000).
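To spell out why the choice of contract separates the two kinds of agent, here is a minimal Python sketch of the payoffs. It assumes things the post does not state: reward is just dollars received, there is no discounting, A pays on each of days 1..T, and B may be terminated on any day from 1 to T. The function names are made up for illustration.

```python
def payoff_a(horizon):
    """Total collected from contract A over days 1..horizon ($1 per day)."""
    return horizon


def payoff_b(terminate_day):
    """One-time payment from contract B when terminated on day n."""
    return terminate_day + 1000


def best_choice(T):
    """Best contract for an agent that expects to die on day T (assumed finite)."""
    a = payoff_a(T)
    # The mortal agent may terminate B on any day up to T; later is always better.
    b = max(payoff_b(n) for n in range(1, T + 1))
    return ("A", a) if a > b else ("B", b)


# A mortal agent always prefers B, terminating on its last day T.
print(best_choice(10))  # ('B', 1010)

# For an agent with no horizon, any termination day n is eventually beaten by A:
# by day n + 1001, A has already paid more than B's one-time payment.
print(all(payoff_a(n + 1001) > payoff_b(n) for n in range(1, 100)))  # True
```

Under these assumptions, an agent with any finite T takes B and terminates on day T, collecting T + 1000 against T from A. An agent maximizing to infinity gets unbounded reward from A, while B pays a finite amount for every termination day and nothing if it is never terminated, so it takes A. The choice made on day 1 therefore reveals which kind of agent it is, without knowing T.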
All the best,