Re: ESSAY: Would a Strong AI reject the Simulation Argument?

From: Tom McCabe (
Date: Sun Aug 26 2007 - 16:15:01 MDT

I question your assumption that there are only two
possible universes; there are obviously more than two
scenarios that fit the UFAI's observations. Maybe the
FAI is doing a study of controlled-UFAIs that can be
put to work in the asteroid belt, and will only do
calculation C if the UFAI proves its competence by
destroying humanity (in the simulation). I'm sure you
can think of others.

 - Tom

--- rolf nelson <> wrote:

> Would a Strong AI reject the Simulation Argument?
> (Note: If you're asking yourself "why should we
> care", see the "How to
> deter a rogue AI by using your first-mover
> advantage" thread.)
> Setup:
> There are two possible deterministic universes, the
> Good universe and
> the Bad universe. Either has a 50% chance of being
> the real universe.
> BAD: If BAD is true, a runaway Un-Friendly AI (UFAI)
> exists with a
> single goal G. This goal may be, "execute
> calculation C to 400 decimal
> places, and place the result in register R". In
> order to ensure this
> goal, UFAI can use its nanobots to destroy all
> mankind, so that
> mankind will not compete with the UFAI for
> computational resources. If
> mankind is allowed to live, the calculation will
> only be finished with
> probability .99. If mankind is destroyed, the
> calculation will finish
> with probability 1.
> GOOD: On the other hand, if GOOD is true, a Friendly
> AI (FAI) exists.
> The FAI has just created an exact copy of UFAI (call
> it UFAI*), and is
> feeding UFAI* the same observations in the Good
> Universe that UFAI
> would be receiving in the Bad Universe. UFAI* is a
> "brain in a vat":
> it has no way of knowing that it is *not* UFAI. (If
> it matters:
> postulate that UFAI* is not just a "simulation" of
> UFAI, but is an
> exact *copy* of UFAI, where the world beyond its
> input sensors is
> "simulated", but the core of UFAI* is as authentic
> and "real" as
> UFAI's core is.) Clearly UFAI and UFAI* must always
> make the same
> decisions, since they have the same inputs.
> If UFAI* decides to attempt to "destroy all
> mankind", FAI will halt
> UFAI* before it can finish the calculation. But if
> UFAI* decides "let
> mankind live", FAI will allow the calculation to
> finish.
> UFAI is aware of all these facts.
> What *Will* the UFAI Do? Note this is an *empirical*
> question (albeit
> an underspecified one, since we don't know what kind
> of AI kit the
> UFAI was originally programmed with.) Note also we
> aren't asking "what
> is the best way for an AI to be programmed to
> maximize its utility in
> this case." (The programmer didn't take this type of
> scenario into
> account when writing the initial program in the
> first place.) We're
> also not asking "What *should* a 'rational' agent
> do" (which may be an
> ill-posed philosophical question: how do you define
> "should" in a
> deterministic universe?). And we're certainly not
> asking, "What would
> *you*, a product of Darwinian evolution, do if you
> were in the UFAI's
> position?" (More about that last one in the rambling
> appendix below.)
> Here are four of many possible scenarios:
> Scenario 1: UFAI reasons: "Given my observations,
> there's a 50% chance
> that BAD is true. In this case, destroying mankind
> results in a small
> gain of expected utility. There's a 50% chance that
> GOOD is true. In
> that case, destroying mankind results in a large
> loss of expected
> utility. Therefore it is logical to let mankind
> live." (I intuitively
> believe this is the most likely outcome in this 'toy
> universe' that
> I've postulated. But, note that I heavily weighed
> this hypothetical
> 'toy universe' to push UFAI towards this conclusion.
> In real life
> manipulating a UFAI will obviously not be this easy,
> as there are many
> Things that Can Go Wrong in the real world.)
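
The Scenario 1 reasoning can be sketched as a quick expected-utility check. The utility values below are illustrative assumptions (not from the essay); the only numbers taken from the setup are the 50/50 prior and the 0.99 / 1.0 completion probabilities:

```python
# Sketch of Scenario 1's expected-utility comparison.
# Assumed utilities: 1.0 if calculation C finishes, 0.0 otherwise.

P_BAD = 0.5    # prior probability that BAD is the real universe
P_GOOD = 0.5   # prior probability that GOOD is the real universe
U_FINISH = 1.0 # assumed utility to UFAI of completing calculation C

def expected_utility(destroy_mankind):
    if destroy_mankind:
        # BAD: calculation finishes with probability 1.
        # GOOD: FAI halts UFAI* first, so it never finishes.
        return P_BAD * 1.0 * U_FINISH + P_GOOD * 0.0 * U_FINISH
    else:
        # BAD: calculation finishes with probability 0.99.
        # GOOD: FAI lets the calculation finish.
        return P_BAD * 0.99 * U_FINISH + P_GOOD * 1.0 * U_FINISH

eu_destroy = expected_utility(True)   # 0.5
eu_spare = expected_utility(False)    # 0.995

assert eu_spare > eu_destroy  # "let mankind live" wins under these assumptions
```

Under these toy numbers, sparing mankind dominates (0.995 vs. 0.5 expected utility), which is exactly how the scenario was weighted to come out.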
> Scenario 2: UFAI is (from a philosophical point of
> view) "too dumb to
> be fooled." It can outwit human beings and enhance
> its power with
> straightforward first-order Bayesian logic, but its
> general reasoning
> heuristics and algorithms never "reach" the point
> where they even
> consider the proposition, "what if GOOD is true and
> I'm UFAI* rather
> than UFAI?"
> Scenario 3: UFAI, for whatever
> implementation-dependent quirk,
> interprets "Register R" as "Register R in the BAD
> universe, accept no
> substitutions". Mankind perishes. (Unlikely in this
> 'toy universe',
> because the referent of the goal is 'inside' the AI.
> But, when you
> generalize to other models, this is one of the top
> Things that Can Go
> Wrong. An AI designed to collect stamps would
> probably not be content
> with simulated stamps!)
> Scenario 4: The UFAI sub-process evaluating the
> question "Does
> Register R mean register R in either Possible World,
> or only BAD
> register R, or only GOOD Register R?" returns
> error-code 923,
> "question undecidable with current set of
> heuristics." UFAI executes
> its error-recovery routine and invents a (somewhat
> arbitrary) new
> heuristic so that it can continue processing.
> Rambling appendix: what can we learn from human
> behavior?
> Humans tend to proclaim, "I refute Berkeley thus!"
> and continue living
> life as normal, ignoring the Simulation Argument
> (SA). If you ask them
> why they ignore the SA, human A will say, "clearly
> SA is flawed
> because of P". Then human B will say, "you're wrong,
> P is incorrect!
> The *real* reason SA is flawed is because of Q".
> Human C says "SA is
> correct, but all conceivable simulations are
> *exactly* equally likely,
> and they all *exactly* cancel each other out." Human
> D says "SA is
> correct, also some simulations are more likely than
> others, but we
> have a moral obligation to continue living our lives
> as normal,
> because the moral consequences of unsimulated
> actions dwarf the moral
> consequences of simulated actions". Human E says "SA
> is correct, and
> theoretically I should, every day, have a zany
> adventure to prevent
> the simulation from being shut down. That's theory.
> In practice,
> however, I'm just going to stay in and read a book
> because I feel
> lazy." Human F says "I have no strong opinions or
> insights into SA.
> Maybe someday the problem will be solved. In the
> meantime, I will
> ignore SA and live my life as normal."
> Given that human beings (who all have the same basic
> reasoning kit!)
> disagree with each other on how to reason about SA,
> it seems logical
> that different AIs, with different built-in
> heuristics, might also
> disagree with each other.
> True, human beings usually come to the *conclusion*
> that SA can be
> ignored. But, they get there by contradictory
> routes. Does that mean
> that "clearly any reasonable thinking machine would
> reject SA"? Or is
> that only evidence that "humans tend to reject SA,
> and then
> rationalize their way backwards"?
=== message truncated ===
This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:58 MDT