From: Johnicholas Hines (firstname.lastname@example.org)
Date: Sun Oct 25 2009 - 17:35:13 MDT
Dr. Mahoney makes a solid argument that "friendly" is not definable
via the vague terms "human" or "happiness". However, there might be a
distinction to be made between "definable" and "indexable".
For example, suppose that someone has the ability to perform a
particular computation. (In fact, the computation is squaring.)
However, they do not have the ability to name or define what they are
doing. They write down a large set of input-output pairs and say
"Here, I believe there is a concise, formal definition of this
computation that I am doing. Please tell me what it is."
They have not defined the operation of squaring. However, with enough
datapoints, and the criterion that the solution should be concise and
formal, the definition of squaring becomes extremely salient. The
interesting thing is that with enough datapoints, even if the human
sometimes makes mistakes in computation, the squaring definition still
becomes salient. Presumably, shown the definition and its qualities,
the human would agree that yes, those datapoints were mistakes.
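
As a rough illustration of the squaring example, here is a minimal
sketch in Python. It is not a real induction system. The candidate
list, the token-count measure of conciseness, and the exception_cost
penalty are all assumptions invented for this illustration. Each
candidate definition is scored by its length plus a fixed cost for
every datapoint it fails to reproduce, and the lowest score wins.

```python
# Hypothetical hypothesis space: a few short candidate definitions.
CANDIDATES = {
    "x + x": lambda x: x + x,
    "x * x": lambda x: x * x,
    "x * x * x": lambda x: x * x * x,
    "x * x + 1": lambda x: x * x + 1,
    "2 ** x": lambda x: 2 ** x,
}


def description_length(expr):
    # Crude stand-in for "concise": count whitespace-separated tokens.
    return len(expr.split())


def score(expr, pairs, exception_cost=4):
    # Length of the definition plus a penalty for each datapoint that
    # must be listed as an exception to it (exception_cost is assumed).
    fn = CANDIDATES[expr]
    misses = sum(1 for x, y in pairs if fn(x) != y)
    return description_length(expr) + exception_cost * misses


def most_salient(pairs):
    # The candidate with the lowest total score.
    return min(CANDIDATES, key=lambda expr: score(expr, pairs))


def mistakes(expr, pairs):
    # Datapoints the chosen definition disagrees with.
    fn = CANDIDATES[expr]
    return [(x, y) for x, y in pairs if fn(x) != y]


# Twenty input-output pairs written down by the human, two with slips.
pairs = [(x, x * x) for x in range(1, 21)]
pairs[3] = (4, 15)     # should have been 16
pairs[11] = (12, 145)  # should have been 144

best = most_salient(pairs)
print(best)                   # x * x
print(mistakes(best, pairs))  # [(4, 15), (12, 145)]
```

With only two slips among twenty datapoints, every rival definition
has to pay for nineteen exceptions, so "x * x" wins easily, and the
two disagreeing pairs are exactly the ones the human would be asked to
reconsider.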
My understanding is that Friendliness is supposed to be analogous, and
that we believe that there is a way to resolve the vague notions of
"human" and "happiness" into alternative precise notions that we would
agree capture the original intent. (E.g. we don't actually care about
"human" but rather "foo" and "bar", which some but not all computer
programs or aliens have.) One of the problems is that we anticipate
that those alternative precise notions will be quite large, possibly
larger than any human presently alive could know.
Note: I'm alluding to the field of "Inductive Logic Programming" here.