Re: UCaRtMaAI paper

From: Tim Freeman (tim@fungible.com)
Date: Sat Nov 24 2007 - 14:15:04 MST


From: "Wei Dai" <weidai@weidai.com>
>My conjectured-to-be-better scheme is to not build an AGI until we're more
>sure that we know what we are doing.

Who is this "we"? By chance, I know of four AGI projects that seem to
be making reasonable progress without any concerns about friendliness.
I stumbled across four without making any effort to search for them,
so there are surely more out there. For different values of "we", you
might be proposing to stop them all, or just the hypothetical one
based on the ideas in my paper, or perhaps some other subset.

>I made the suggestion of giving each person a fixed quota of resources. Is
>that something you've considered already?

That's a plausible idea. It seems suboptimal, since if person A is in
one aisle of the grocery store choosing apples, and person B in the
next aisle is having a heart attack, I would prefer a scheme in which
person B gets more resources than person A. Offhand, I don't know
how to define "resources" so that the idea is implementable, but there
might be something doable there.

Another idea that came up recently is that the FAI might require each
person to indicate what they want by doing some fixed percentage of
the work to get it, say 10%, then the AI does the other 90%. As the
AI gets more powerful, it can do a larger fraction of the work, and as
it gets to know more about human nature, it will be able to guess
accurately what people want while requiring less effort from them. The
UCaRtMaAI proposal can, in principle, choose to do experiments, so it
might figure out something like this itself. Making someone show
their enthusiasm by working toward the goal before you help them isn't
different in kind from paying attention to what they say they want --
in both cases some symbolic action is interpreted as a clue about the
utilities.
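
To make the idea concrete, here is a toy sketch in Python. The names,
numbers, and the 10% threshold are all made up for illustration;
estimating how much effort someone has put in, and how much total work
the goal requires, is the hard part, and this sketch just takes those
as given:

    # Toy sketch of the "do 10% of the work yourself" rule.
    def should_help(effort_observed, total_work_required,
                    threshold=0.10):
        """Help only if the person has already done at least
        `threshold` of the work needed for the goal they claim
        to want."""
        if total_work_required <= 0:
            return True  # trivial goals cost nothing to grant
        return effort_observed / total_work_required >= threshold

    # As the AI gets more capable, the threshold can be lowered, so
    # the person contributes less while their effort still serves as
    # evidence about their utilities.
    print(should_help(12.0, 100.0))  # True
    print(should_help(3.0, 100.0))   # False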

>...You might want to walk through an example step by step...

That's a good idea. When I've done it, I'll add it to the paper and
post a pointer.

>To take the simplest example, suppose I get a group of friends together and
>we all tell the AI, "at the end of this planning period please replace
>yourself with an AI that serves only us." The rest of humanity does not know
>about this, so they don't do anything that would let the AI infer that they
>would assign this outcome a low utility.

I'll make a separate post for this, since I'll have a question there
that might merit a wider audience.

>Among the infinite number of algorithms for averaging people's utility
>functions, you've somehow picked one. How did you pick it?

I didn't have any better ideas, and this one was implementable and
seemed to work well in the cases I could imagine.

>Given that the vast majority of those algorithms are not among the
>best known alternatives, what makes you think that the algorithm you
>picked *is* among the best known alternatives?

I'm sure it's among the best alternatives known *to me*, simply because
I'm ignorant of better alternatives.

I know that's not worth much. The main point of the paper isn't "this
is an excellent algorithm we should use". The main point is "here's
an example of how to use Solomonoff induction to specify AI systems
that interact constructively with the real world". If someone finds
something better to plug in there, that's great. It's an unambiguous
specification of a better-than-nothing algorithm, and we didn't have
that before, to my knowledge.

>For example, consider explicit calibration as an alternative. Design a
>standard basket of goods and services, and calibrate each person's utility
>function so that his utility of obtaining one standard basket is 1, and his
>utility of obtaining two standard baskets is 2.

That's a plausible idea, if it could be implemented. How could it be
implemented?
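
For what it's worth, the arithmetic half of the calibration is easy:
the two basket valuations pin down an affine rescaling of each
person's utility function. A minimal sketch in Python, assuming we
could somehow elicit a person's raw utilities for one and two baskets
(which is exactly the part I don't see how to do):

    # Rescale a raw utility function affinely so that one standard
    # basket is worth 1 and two standard baskets are worth 2.  The
    # raw utilities below are made-up inputs.
    def basket_calibration(u_one_basket, u_two_baskets):
        scale = 1.0 / (u_two_baskets - u_one_basket)
        shift = 1.0 - scale * u_one_basket
        return lambda u_raw: scale * u_raw + shift

    calibrate = basket_calibration(u_one_basket=5.0,
                                   u_two_baskets=8.0)
    print(calibrate(5.0))   # 1.0, one basket
    print(calibrate(8.0))   # 2.0, two baskets
    print(calibrate(11.0))  # 3.0, some other outcome, same scale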

I imagine some odd mechanical man with a basket containing a tube
of toothpaste, a loaf of bread, and a one-gallon jug of crude oil
walking up to a hut in rural Uganda and saying "What would you be
willing to give me in exchange for this?" The inhabitants have never
seen any of those things before, so the subsequent conversation
doesn't seem likely to go well. There must be a better idea than
measuring their value of a basket of goods & services by making them
trade for it, but I don't know what it is.

In general, measurement by trading doesn't work for people who aren't
hooked into the world economy, and it doesn't work well when
transaction costs are large compared to the value of the basket. If
you control transaction costs by having a very large basket, it seems
you're likely to misunderstand the low-wealth end of the curve, such
as starving people and kidnapped children. (Think of the children! :-)

>To me, this seems a lot more likely to be at least somewhat fair than
>an algorithm that relies on the side effects of integer overflow.

You may be right about the calibration scheme being better, if it can be
implemented. However, let's be fair about my scheme -- it doesn't
play with integer overflow. If the utility is too big, we discard the
explanation rather than truncating the high bits and using the low
bits.
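
Here is a sketch of that distinction in Python. The 16-bit bound is
arbitrary; it stands in for whatever maximum utility the real scheme
uses:

    MAX_UTILITY = 2**16 - 1

    def overflow_style(claimed_utility):
        # What I am *not* doing: wrap around and keep the low bits.
        return claimed_utility % (MAX_UTILITY + 1)

    def discard_style(claimed_utility):
        # What the scheme does: an explanation claiming an
        # out-of-range utility is thrown away, not silently mangled.
        if 0 <= claimed_utility <= MAX_UTILITY:
            return claimed_utility
        return None  # reject the explanation

    print(overflow_style(70000))  # 4464, garbage posing as a utility
    print(discard_style(70000))   # None, the explanation is discarded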

My scheme imposes a maximum utility and finite resolution. If utility
is a model of some neurological event, this is plausible because the
human brain is a physical system of finite complexity. (This might be
wrong if Penrose is right about human consciousness using quantum
entanglement, but I'm not worried about that. If we were quantum
computers, we'd be smarter.)

-- 
Tim Freeman               http://www.fungible.com           tim@fungible.com

