**From:** Matt Mahoney (*matmahoney@yahoo.com*)

**Date:** Mon Mar 10 2008 - 15:00:55 MDT

**Next message:** John K Clark: "Re: Is a Person One or Many?" **Previous message:** Heartland: "Re: Is a Person One or Many?" **In reply to:** Thomas McCabe: "Re: A formal measure of subjective experience" **Messages sorted by:** [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

--- Thomas McCabe <pphysics141@gmail.com> wrote:

> A great deal needs to be said here, but I'll just hit the high points.
>
> On Sun, Mar 9, 2008 at 4:15 PM, Matt Mahoney <matmahoney@yahoo.com> wrote:
> > I propose the following formal measure of subjective experience. The
> > experience of an agent observing event X is K(S2|S1), where S1 is the
> > state of the agent before observing X, S2 is the state afterwards, and
> > K is Kolmogorov complexity. In other words, the subjective experience
> > is measured by the length of the shortest program that inputs a
> > description of S1 and outputs a description of S2.
>
> "Subjective experience" is an ill-defined concept (see
> http://www.overcomingbias.com/2008/03/wrong-questions.html), and we
> could argue about it for thousands of years and never get anywhere.
> Isn't this exactly what philosophers have been doing, ever since the
> days of ancient Greece?

Your concerns are valid. That is what happens when I try to formalize something
that doesn't exist.
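Kolmogorov complexity is uncomputable, but the proposed measure K(S2|S1) can at least be approximated by substituting a real compressor for the shortest program. A minimal sketch, assuming zlib as the compressor (my choice for illustration, not part of the original proposal):

```python
import os
import zlib

def approx_conditional_complexity(s1: bytes, s2: bytes) -> int:
    """Approximate K(S2|S1) as len(compress(s1 + s2)) - len(compress(s1)).

    zlib is a crude stand-in for the (uncomputable) shortest program,
    so this is only an upper-bound-style estimate.
    """
    return len(zlib.compress(s1 + s2, 9)) - len(zlib.compress(s1, 9))

# An observation that barely changes the agent's state scores low...
same = approx_conditional_complexity(b"x" * 1000, b"x" * 1000)
# ...while a novel, incompressible observation scores high.
novel = approx_conditional_complexity(b"x" * 1000, os.urandom(1000))
assert same < novel
```

On this reading, "how much subjective experience" an observation carries is just how many bits of the new state cannot be predicted from the old one.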

> > Conditional Kolmogorov complexity is therefore one possible measure.
>
> K complexity is hardly a sufficient metric for nontrivial properties
> of Turing machines! Consider all Turing machines with n or fewer
> states acting on a blank tape. The number of possible Turing machines
> increases with C^N, so the number of possible nontrivial properties of
> Turing machines increases with C1^C2^N (see
> http://www.overcomingbias.com/2008/02/superexp-concep.html). K
> complexity, meanwhile, increases with N. The amount of information
> conveyable with K goes with log(N); the amount of information needed
> as a metric for an arbitrary nontrivial property goes with
> log(C1^C2^N) = C^N.

This is why we need an inductive bias. A complexity measure is simple and
therefore appealing (as justified by AIXI), and it lends itself to
mathematical analysis.
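The counting point above can be made concrete: over a binary alphabet there are fewer than 2^(k+1) descriptions of length at most k, while a set of m machines has 2^m subsets (properties), so no description-length measure can index every property. A toy count, with illustrative constants standing in for C^N:

```python
# Descriptions of length <= k over a binary alphabet:
def num_descriptions(k: int) -> int:
    return 2 ** (k + 1) - 1

# Subsets (i.e. arbitrary properties) of a set of m machines:
def num_properties(m: int) -> int:
    return 2 ** m

machines = 2 ** 20  # stand-in for the ~C^N machines with N states

# Even absurdly long descriptions cannot index every property:
assert num_descriptions(10 ** 6) < num_properties(machines)
```

This is exactly why a simple measure must be paired with a bias toward the properties we actually care about, rather than all properties equally.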

> > Applications.
> >
> > Some people believe that it is unethical to harm (kill or decrease the
> > utility of) agents that have subjective experience. I do not take a
> > position on this issue, but if we assume it is true, then:
>
> Beware using one Really Great Idea to explain absolutely everything
> (http://www.overcomingbias.com/2007/12/affective-death.html). Human
> morality is much more complex than this
> (http://www.overcomingbias.com/2007/11/thou-art-godsha.html).
*

As I said, I don't assert that this model of ethics is correct.

> > A data compression program like zip has subjective experience in all 3
> > modes that humans do. A compressor accepts a sequence of symbols from
> > an unknown source and has the task of predicting future symbols so
> > that it can assign shorter codes to the most likely symbols. It has
> > procedural memory because after each event (observing symbol X in some
> > context), it raises the probability that X will occur next time the
> > same context is observed. It has episodic memory because decompression
> > recalls the exact sequence of events. It undergoes reinforcement
> > learning with a utility function equal to the negative of the length
> > of the compressed output.
>
> I haven't studied compression algorithms extensively, but you seem to
> be playing fast and loose with the idea of "memory" and "utility
> function" here. Compression algorithms, so far as I know, have no
> explicit utility functions, and at least one (LZW) has no explicit
> representation of probability.
*

LZW represents probability implicitly. It maintains a dictionary of phrases

which are used to replace matches in the input with dictionary codes. Its

implicit model, which includes a policy for adding and removing phrases, is

that all phrases are equally likely. You could still write an equivalent (but

less efficient) algorithm in the style of more advanced compressors that

explicitly separate modeling from coding.
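A toy LZW encoder makes the point visible: the only "model" is the growing phrase dictionary, and every dictionary code costs the same regardless of how likely the phrase is. This is a minimal sketch, not production LZW (real implementations emit variable-width codes and bound the dictionary):

```python
def lzw_compress(data: bytes) -> list[int]:
    """Toy LZW encoder.

    The dictionary is the implicit model: phrases seen in the input are
    added as they occur, and every phrase is coded at equal cost, i.e.
    implicitly treated as equally likely.
    """
    dictionary = {bytes([i]): i for i in range(256)}
    w, out = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc  # extend the current match
        else:
            out.append(dictionary[w])
            dictionary[wc] = len(dictionary)  # learn the new phrase
            w = bytes([byte])
    if w:
        out.append(dictionary[w])
    return out

# Repetitive input ends up shorter than its symbol count because
# ever-longer phrases enter the dictionary:
codes = lzw_compress(b"ab" * 20)
assert len(codes) < 40
```

Nothing in the loop mentions probability, yet the dictionary-update policy is exactly where the probability model is hiding.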

A reinforcement learner stops looking like it has a utility function once you
understand the algorithm. For example, a thermostat looks like it has a goal
of keeping the room at a set temperature, until you look inside it.
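The thermostat example spelled out: what reads as goal-directed behavior from outside is a single comparison inside. A deliberately minimal sketch:

```python
def thermostat(temp_c: float, setpoint_c: float = 20.0) -> bool:
    """Return True to run the heater.

    From the outside this 'wants' the room at setpoint_c; inside there
    is only a comparison, and no utility function being maximized.
    """
    return temp_c < setpoint_c

assert thermostat(15.0)       # cold room: heater on
assert not thermostat(25.0)   # warm room: heater off
```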

I use data compression as an example because it is AI-complete; it makes the
similarities to human intelligence easier to see.

http://cs.fit.edu/~mmahoney/compression/rationale.html

> There's a big distinction between K complexity and bits of memory in
> the normal sense. Assuming that the universe is a closed
> Turing-computable system, it cannot have a K complexity significantly
> higher than the K complexity at the time of the Big Bang.
*

Parts of the universe can be more complex than the whole, because you also
have to specify the boundaries of the part.

For example, "enumerate all universes until intelligent life is found" is a
very short program. Specifying the particular universe we live in takes a few
hundred more bits.

-- Matt Mahoney, matmahoney@yahoo.com


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:02 MDT