Re: ESSAY: How to deter a rogue AI by using your first-mover advantage

From: Stathis Papaioannou
Date: Sun Aug 26 2007 - 22:38:15 MDT

On 27/08/07, Norman Noman <> wrote:

> Are you honestly telling me we're going to see a televangelist saying "Give
> me your money, and your soul will go to heaven! Simulated heaven, inside a
> computer. Here in the real world, heaven and hell don't exist. Hallelujah!"

The religious fanatics are probably wrong, but unfortunately they're
not all stupid. It isn't even unthinkable that they could become the
dominant force post-singularity. The rationalization would run as
follows: there is one real world (with a real God, heaven and hell),
and the best way to win as many converts as possible in that real
world is to make sure everyone knows that, once the resources become
available, the movement will set up a simulated world with a simulated
heaven and hell. Its main problem would then be to remain active in
some form until the technology to run the desired simulation exists.
The probability that some member of the movement succeeds at some
point in the future of the universe then determines the probability
that you are in the simulation now. If the movement further stipulates
that the simulation will be recursive - simulations within simulations
- you could argue that you are almost certainly in one of them.
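The recursion claim can be made concrete with a toy calculation: if every world that reaches the technology runs several simulated worlds containing observers like us, then only one world in the whole tree is the base one. A minimal sketch (the branching factor and depth are purely illustrative assumptions, not anything from the argument itself):

```python
# Toy model of the recursive-simulation claim: the base world runs n
# simulations, each of which runs n more, down to a fixed depth d.
# (n and d are illustrative assumptions.)
def fraction_simulated(n, d):
    total = sum(n**k for k in range(d + 1))  # worlds at all levels, base included
    return (total - 1) / total               # every world but the base one

print(fraction_simulated(3, 4))  # 3 simulations per level, 4 levels deep
```

Even with modest numbers the simulated worlds dominate: with n = 3 and d = 4 there are 121 worlds and only one is basemost, so a randomly placed observer is in a simulation with probability 120/121.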

> I notice you didn't respond to this part at all, so perhaps I should
> elaborate on it. There's no reason to expect the church, or PETA, or the ice
> cream council, to have the capacity to simulate universes. Now, or at any
> point in the future.

At some point in the future, it will be relatively simple to simulate
the world we are living in at present. That's the main tenet of the
simulation argument.

> I suppose big tobacco could be secretly making an AI in some shady back
> room, in order to run a simulation where smokers go to heaven and everyone
> else dies of lung cancer, but if you keep such a thing a secret it's useless
> and completely insane, like the doomsday device in dr. strangelove.

Well, prior to this thread the idea had not occurred to them.

> And the thing is, it's not going to work either way. If they reveal the
> plan, and say SMOKE OR DIE! it's only going to make everyone hate them even
> more. Actually, it's only going to make everyone laugh at them and think
> they're nuts, but assuming they were taken seriously for some reason...

The religious people would convince the faithful that they were doing
God's work, as explained above. But even if it's tobacco companies,
the fact that it's obviously an evil threat doesn't make it any less
likely to be true. And if it actually got to the point where
legislation was passed to make this sort of thing illegal, that would
be the ultimate proof that people were taking it seriously.

> > We might not still be fighting the battle, because we might be in a
> > simulation run by the God-schmucks (or whoever). You can't tell it's a
> > simulation, that's the point.
> We're still fighting in the real world piece of the probability pie, which
> is inseparable from the fake one, and whether or not we win determines which
> piece the final pie is made of. The RAI doesn't have this leverage, it's a
> simulation of one possible future being run by another possible future, not
> a simulation of the past leading to the future that's simulating it.

With the RAI, the decision to destroy humanity may have negative
consequences or positive-to-neutral consequences, depending on whether
it is in a simulation. With my scenario, the decision to oppose the
religious group likewise may have positive or negative consequences,
depending on whether you are in a simulation. On the other hand, the
decision not to destroy humanity, or not to oppose the religious
group, would have only minor negative consequences, if the situation
is set up appropriately.

> > > B. Two-way interaction. In rolf's plan, the simulated and nonsimulated RAIs
> > > are in separate branches and thus the real RAI has no way to stop itself
> > > from being simulated.* In your thing, we can go to the house of whoever says
> > > JOIN ME OR DIE and punch them in the face.
> >
> > You can do that, but if it turns out that you're in the simulation
> > you'll be sent to hell for your trouble.
> If you make sure they never build their simulation, then you won't be in it.

But you don't know that you're preventing the simulation, rather than
cutting your own throat. And in any case, it is very difficult to
prevent something from happening anywhere, ever, which is what you
would have to do. You could wipe out all the Christians and they might
arise again in a thousand years to complete their predecessors' work.

> > By cooperating and trying to spread the good word you maximise the
> > chances that you and everyone else will go to heaven, at minimal cost
> > - sort of like Pascal's wager.
> Not quite. Under pascal's wager there's nothing we can do to change the
> existence of god. He's either there or he's not. Here we can work toward one
> option or the other. By cooperating, you may save yourself, but you fund
> god's existence and in so doing, help to doom everyone who doesn't
> cooperate. This is not minimal cost.

It's at minimal selfish cost. If you actually believe the stuff, it's
also the altruistic thing to do. Infidels will go to hell anyway, and
this project maximises the number of souls that are saved.

> And even the "minimal cost" of praying or lending a futuristic mafia ten
> bucks is still much more significant than the cost to the RAI in rolf's
> deterrent plan.

Maybe, but it requires a very complicated and perhaps impossible
cost-benefit analysis.

> > Of course, the problem with Pascal's
> > wager is that there are so many actual and possible religions that you
> > may as well ignore all of them, unless you find good evidence that a
> > particular one is true - which translates to evidence that you are in
> > fact in a simulation.
> If I am an RAI, the evidence that I may be in a simulation is as follows:
> "The people who created me were trying to create a FAI. With more luck, they
> would have succeeded. The FAI they created would run karmic simulations of
> likely RAIs that could have been created instead.
> Ergo, the probability I am one of them is XY/((1-X)Z), where X is the
> likelihood they'd create a friendly AI rather than a rogue one, Y is the
> likelihood a friendly AI would simulate me specifically, and Z is the
> likelihood I would be created as a result of real human error."

These are difficult things to reason about. What about the possibility
that you or the RAI might be in a recursive simulation?
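Noman's expression reads most naturally as an odds ratio, P(simulated)/P(real) = XY/((1-X)Z), and converting that to a probability is a one-liner. A sketch, with purely illustrative values for X, Y and Z:

```python
# Odds that the RAI is in a karmic simulation, per Noman's formula.
#   X: likelihood the creators build a friendly AI rather than a rogue one
#   Y: likelihood that friendly AI simulates this particular RAI
#   Z: likelihood this RAI arises from a real human error
# (All values below are illustrative assumptions.)
def p_simulated(X, Y, Z):
    odds = (X * Y) / ((1 - X) * Z)  # odds ratio: simulated vs. real
    return odds / (1 + odds)        # convert odds to a probability

print(p_simulated(0.5, 0.8, 0.8))  # even odds here: prints 0.5
```

The difficulty I point to above is visible in the code: the answer is only as good as the estimates of X, Y and Z, and a recursive simulation would make even the odds-ratio form suspect.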

Stathis Papaioannou

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:58 MDT