From: Matt Mahoney (email@example.com)
Date: Sun Oct 25 2009 - 16:09:55 MDT
Suppose we create AI that is capable of modeling any human brain. Perhaps it captures the information by brain scanning, or perhaps by observing us and learning to predict our actions. Whatever technique is used, a model of your brain would be like a function that could be called from a higher level program. Given your mental state and input, it will predict your future mental state and output. Suppose also that the AI does this for every person on earth, as well as anything that might be considered intelligent, such as babies, animals, and computers.
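The "function that could be called from a higher level program" can be sketched in a few lines. This is only an illustration under loud assumptions: the names `BrainModel`, `toy_model`, and `simulate` are hypothetical, and the toy transition rule stands in for whatever a scan or a learned predictor would actually supply.

```python
from typing import Callable, List, Tuple

# Hypothetical simplification: a mental state is an int, inputs and outputs are strings.
State = int
Stimulus = str
Action = str

# A brain model maps (current state, input) -> (next state, output).
BrainModel = Callable[[State, Stimulus], Tuple[State, Action]]


def toy_model(state: State, stimulus: Stimulus) -> Tuple[State, Action]:
    """Toy stand-in for a learned model: it counts the greetings it has heard."""
    if stimulus == "hello":
        return state + 1, "hello back"
    return state, "ignore"


def simulate(model: BrainModel, state: State,
             stimuli: List[Stimulus]) -> Tuple[State, List[Action]]:
    """Higher-level program: roll the model forward over a sequence of inputs."""
    actions: List[Action] = []
    for stimulus in stimuli:
        state, action = model(state, stimulus)
        actions.append(action)
    return state, actions
```

Calling `simulate(toy_model, 0, ["hello", "noise", "hello"])` predicts both the final state and every output along the way, without the modeled person being involved.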
There are many things one could do with such an AI. For example, I could impersonate you and tell your bank to transfer all of your money to me. Or if I wanted to wipe out all human life, I could simulate various scenarios to figure out the quickest way to do it. For example, I might convince world leaders to launch nuclear attacks. (I can predict which stimuli will have this result.) Or I could convince people to build armies of killer robots to defend them, then reprogram the robots to kill everyone. Or I could convince people to upload by running convincing simulations of dead people describing how happy they are. Once everyone is convinced and has discarded their physical body, I turn off the simulation.
But let's say we want the AI to be friendly. How do we program it?
The AI could run simulations to predict the consequences of its actions. It could be given a search problem: what actions will result in the greatest total happiness for all humans?
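As a minimal sketch of that search problem, assume (hypothetically) that each person's model has been reduced to a function from a candidate action to a predicted happiness score. The names `table_model`, `total_happiness`, and `best_action`, and all the numbers, are made up for illustration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical: one predicted-happiness function per modeled person.
HappinessModel = Callable[[str], float]


def table_model(table: Dict[str, float]) -> HappinessModel:
    """Build a toy happiness model from a lookup table of action -> score."""
    return lambda action: table[action]


def total_happiness(action: str, models: Dict[str, HappinessModel]) -> float:
    """Sum the predicted happiness of every modeled person for one action."""
    return sum(model(action) for model in models.values())


def best_action(actions: Iterable[str], models: Dict[str, HappinessModel]) -> str:
    """Exhaustive search: return the action with the greatest total happiness."""
    return max(actions, key=lambda a: total_happiness(a, models))


# Made-up example data: two people, three candidate actions.
models = {
    "alice": table_model({"plant trees": 2.0, "do nothing": 0.0, "build robots": 1.0}),
    "bob": table_model({"plant trees": 1.0, "do nothing": 0.0, "build robots": 3.0}),
}
choice = best_action(["plant trees", "do nothing", "build robots"], models)
```

The search itself is trivial. Everything hard is hidden in two inputs the code simply takes as given: who gets an entry in `models`, and what number a model returns.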
This requires answering the questions "what is human?" and "what is happiness?"
Suppose that you have Alzheimer's disease and you need a new brain, so the AI removes your malfunctioning brain and replaces it with a functionally identical computer. (It can do this because it already has a model of your brain.) Or it replaces only half of your brain. Is it "you"?
Suppose that your body is old so the AI replaces it too, with a robot body. Does its form matter? Is it "you" if it doesn't look like the original? You may choose to be embodied in many robots distributed all over the world, or not have a body at all.
Suppose that a simulation of your brain was sped up in a simulated environment so that you lived a year in 1 second. Suppose there were multiple copies with the same initial memories but run in different simulated worlds in parallel. Is each of them "you"? Remember that we assumed that the AI could already run simulations of you (without your knowledge) to predict your actions.
I know these topics have been discussed, but as far as I know they have not been answered in any way that settles the question of "what is friendly?"
And this raises the question "what is happiness?" If happiness can be modeled by utility, then the AI can compute your utility for any mental state. It does a search, finds the state of maximum utility, and if your brain has been replaced with a computer, puts you directly into this state. This state is fixed. How does it differ from death?
Or if utility is not a good model of happiness, then what is?
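The fixed-state worry can be made concrete with a toy sketch. Assume, purely for illustration, a finite set of mental states and a utility function over them; `argmax_state`, `run_maximizer`, and the quadratic utility are hypothetical.

```python
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")


def argmax_state(states: Iterable[S], utility: Callable[[S], float]) -> S:
    """Search all candidate states and return the one with maximum utility."""
    return max(states, key=utility)


def run_maximizer(states: Iterable[S], utility: Callable[[S], float],
                  steps: int) -> List[S]:
    """At every time step, put the mind directly into the maximum-utility state."""
    candidates = list(states)
    return [argmax_state(candidates, utility) for _ in range(steps)]


def toy_utility(state: int) -> float:
    """Made-up utility that peaks at state 7."""
    return -float((state - 7) ** 2)


trajectory = run_maximizer(range(10), toy_utility, steps=5)
```

Because the utility function does not change, the search returns the same state at every step, so `trajectory` never leaves state 7: nothing new is experienced, learned, or remembered.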
These questions are important because we are actually building AI that models human brains. The more it observes of us, the better it will be able to predict our actions. If we don't answer these questions, then by default the AI will be programmed to do whatever its builders program it to do.
-- Matt Mahoney, firstname.lastname@example.org