Re: [sl4] Bayesian rationality vs. voluntary mergers

From: Wei Dai (weidai@weidai.com)
Date: Mon Sep 08 2008 - 13:36:04 MDT


Tim Freeman wrote:
> For what it's worth, the period of time when the merged AI is acting
> irrationally is very short. Before it gets all of the assets from A
> and B, it's waiting to get all of the assets. After it gets all of
> the assets from A and B, it flips the unfair coin and after that point
> it's rational. So the only period of time when its irrationality is
> observable is when it's choosing to flip the coin and commit to one
> plan or the other based on the result rather than simply pursue the
> plan that is more likely to succeed.

It's not true that the period of "irrationality" (I put it in quotes because
it's irrational according to standard decision theory, but not according to
common sense) has to be short. Suppose the merged AI starts trying to
convert the universe to paperclips based on a coin toss, but after converting
10% of it, realizes that the staples goal has a much higher chance of
success, something it didn't know at the beginning. I think the two original AIs
would have agreed that in this circumstance the merged AI should flip
another coin to decide whether or not to switch goals. The coin would be
biased, with the probabilities for the two goals linked to their chances of
success. This new coin toss would again be considered "irrational" under
standard decision theory. Also, just the possibility that the merged AI may
have to switch goals in the future will affect everything it does now, so
it's likely to deviate from standard decision theory throughout its life.
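To make the biased-coin mechanism concrete, here is a minimal sketch in
Python. It assumes "linked to their chances of success" means the bias is
proportional to each goal's currently estimated success probability; the
function name `choose_goal` and that proportionality assumption are mine,
not from the original post.

```python
import random

def choose_goal(goals, success_probs, rng=random):
    """Flip the 'biased coin' from the post: pick a goal with
    probability proportional to its estimated chance of success.

    The merged AI would call this again whenever its estimates
    change enough that the original AIs would have wanted a re-flip
    (the assumption of proportional bias is illustrative)."""
    total = sum(success_probs)
    r = rng.random() * total
    for goal, p in zip(goals, success_probs):
        r -= p
        if r < 0:
            return goal
    return goals[-1]  # guard against floating-point rounding
```

For example, if the merged AI now estimates staples at 0.7 and paperclips
at 0.3, `choose_goal(["paperclips", "staples"], [0.3, 0.7])` switches to
staples with probability 0.7, which is exactly the coin toss that standard
decision theory would call "irrational" (it sometimes commits to the less
promising goal on purpose).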

> 2. Even if they were, the sort of negotiation you describe would have
> to happen at some point, and there's some inevitable irrationality in
> the middle of that.
>
> Thanks for the insight.

Thanks, but that isn't exactly the insight I was hoping to deliver. :) I
think a merged AI, or a Friendly AI that is supposed to benefit a whole
society, may have to somehow embed the negotiation process within itself
forever, and there may never be a "return to rationality".



This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:03 MDT