RE: Novamente goal system

From: Ben Goertzel
Date: Sat Mar 09 2002 - 22:23:37 MST

> Just to make sure
> we're all on the same wavelength, would you care to describe briefly what
> you think a transhuman Novamente would be like?

Eliezer, this is a hard question, which would require a long and complex
answer to do it any justice.

The short answer is that a transhuman Novamente could look like a *lot* of
different things, depending on a lot of different factors.

My "default" vision of a transhuman Novamente assumes a Novamente that
achieves transhuman intelligence without having a physical robot body to
control, or humanlike sensors (camera eyes, etc.). However, I don't know
how to estimate the probability that this "default" vision will be the one
that comes about. I like this default vision emotionally, because it fits
in with a more optimistic timeline, in that robotics and humanlike sensors
are not things I'm currently working on.

Let me roughly describe my idea of the FIRST transhuman AI, in this default
vision.

This first (mildly) transhuman Novamente will communicate with us in
comprehensible and fluent but not quite human-like English. It will be a
hell of a math and CS whiz, able to predict financial and economic trends
better than us, able to read scientific articles easily, and able to read
human literary products but not always intuitively "getting" them. It will
be very interested in intense interactions with human scientists on topics
of its expertise and interest, and in collaboratively working with them to
improve its own intelligence and solve their problems. It will be
qualitatively smarter than us, in the same sense that you or I are
qualitatively smarter than the average human -- but not so much smarter as to
have no use for us (yet)....

How long this phase will last, before mild transhumanity gives rise to
full-on Singularity, I am certainly not sure.

> I claim: There is no important sense in which a cleanly causal,
> Friendliness-topped goal system is inferior to any alternate system of
> goals.
> I claim: The CFAI goal architecture is directly and immediately
> superior to
> the various widely differing formalisms that were described to me by
> different parties, including Ben Goertzel, as being "Webmind's goal
> architecture".
> I claim: That for any important Novamente behavior, I will be able to
> describe how that behavior can be implemented under the CFAI architecture,
> without significant loss of elegance or significant additional computing
> power.

These are indeed claims, but as far as I can tell they are not backed up by
anything except your intuition.

I am certainly not one to discount the value of intuition. The claim that
Novamente will suffice for a seed AI is largely based on the intuition of
myself and my collaborators.

However, my intuition happens to differ from yours, as regards the ultimate
superiority of your CFAI goal architecture.

I am not at all sure there is *any* goal architecture that is "ultimate and
superior" in the sense that you are claiming for yours.

And I say this with what I think is a fairly decent understanding of the
CFAI goal architecture. I've read what you've written about it, talked to
you about it, and thought about it a bit. I've also read, talked about, and
thought about your views on closely related issues such as causality.

Sometimes, when the data (mathematical or empirical) is limited, there is
just no way to resolve a disagreement of intuition. One simply has to
gather more data, via experimentation (in this case computational) or
mathematical proof.

I don't think I have a big emotional stake in this issue, Eliezer. I never
have minded integrating my own thinking with that of others. In fact I did
a bit too much of this during the Webmind period, as we've discussed. I'm
willing to be convinced, and I consider it possible that the CFAI goal
architecture could be hybridized with Novamente if a fair amount of work
were put into this. But I'm not convinced at the moment that this would be
a worthwhile pursuit.

> To be precise: A correctly built Friendly AI is argued to have at least
> that chance of remaining well-disposed toward humanity as would
> be possible
> for any transhuman, upload, social system of uploads, et cetera.

This statement is conceptually (though not rigorously, perhaps) a corollary
of your claim that the CFAI goal system is intrinsically superior to all
others. As I don't accept the former claim, I don't accept this one either.

> > This will cause it to seek Friendliness maximization avidly,
> and will also
> > cause it to build an approximation of Yudkowsky’s posited
> hierarchical goal
> > system, by making the system continually seek to represent
> other goals as
> > subgoals (goals inheriting from) MaximizeFriendliness.
> No, this is what we humans call "rationalization". An AI that seeks to
> rationalize all goals as being Friendly is not an AI that tries to invent
> Friendly goals and avoid unFriendly ones.

Unfortunately you have misunderstood my suggestion. Perhaps this was
inevitable given the brevity and out-of-context nature of the snippet I

As I understand it, "rationalization" is when a system has decided what it
wants to do, and then makes up "false reasons" to justify its desired

In the most common case: a system really is pursuing goal G1, and chooses
action A because it judges A will lead to satisfaction of G1. But it thinks
it *should* be pursuing goal G2 instead. So it makes up reasons why A will
lead to satisfaction of G2. Usually the term "rationalization" is used when
these reasons are fairly specious.

What I am talking about is quite different from this. I am talking about:

--> Taking a system that in principle has the potential for a goal
architecture (a graph of connections between GoalNodes) with arbitrary

--> Encouraging this system to create a goal architecture that has a
hierarchical graph structure with Friendliness at the top

I don't see how this is the same as "rationalization." Perhaps you mean
something very different than I do by "rationalization", however. Please

What I am saying is that Novamente's flexible goal architecture can be
*nudged* into the hierarchical goal architecture that you propose, but
without making a rigid requirement that the hierarchical goal structure be
the only possible one.
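
To make the distinction concrete, here is a minimal Python sketch. It is
purely illustrative and is not Novamente code: the GoalGraph class, its
method names, and the hierarchy_score measure are all hypothetical names
invented for this example. Goals may be linked in arbitrary ways; the score
only reports what fraction of goals currently trace back to the top goal, so
a system could prefer higher scores without being forbidden from holding
other structures.

```python
from collections import defaultdict


class GoalGraph:
    """Hypothetical goal graph; an edge means 'subgoal serves parent'."""

    def __init__(self, top):
        self.top = top                  # e.g. "MaximizeFriendliness"
        self.goals = {top}
        self.serves = defaultdict(set)  # subgoal -> set of parent goals

    def add_link(self, subgoal, parent):
        # No structural restriction: any goal may serve any other goal.
        self.goals.update((subgoal, parent))
        self.serves[subgoal].add(parent)

    def reaches_top(self, goal):
        # Depth-first walk upward; the 'seen' set makes cycles harmless.
        seen, stack = set(), [goal]
        while stack:
            g = stack.pop()
            if g == self.top:
                return True
            if g in seen:
                continue
            seen.add(g)
            stack.extend(self.serves.get(g, ()))
        return False

    def hierarchy_score(self):
        # Fraction of non-top goals that are (indirect) subgoals of the top.
        others = [g for g in self.goals if g != self.top]
        if not others:
            return 1.0
        return sum(self.reaches_top(g) for g in others) / len(others)


graph = GoalGraph("MaximizeFriendliness")
graph.add_link("LearnMath", "MaximizeFriendliness")
graph.add_link("ReadPapers", "LearnMath")
graph.add_link("Curiosity", "Novelty")  # not yet tied to the top goal
score_before = graph.hierarchy_score()

graph.add_link("Novelty", "MaximizeFriendliness")  # the "nudge"
score_after = graph.hierarchy_score()
```

A rigid version would instead reject any add_link call that left a goal
unconnected to the top; the soft score is one way to picture encouragement
without a hard requirement.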

I believe that if the system builds the hierarchical goal structure itself,
then this hierarchical goal structure will coevolve with the rest of the
mind, and will be cognitively natural and highly functional. I don't think
that imposing a fixed hierarchical goal structure and rigidly forcing the
rest of the mind to adapt to it (the essence of the CFAI proposal, though
you would word it differently), will have equally successful consequences.

> I claim: That for any type of error you can describe, I will be able to
> describe why a CFAI-architecture AI will be able to perceive this as an
> "error".
> It's not an intuition. It's a system design that was crafted to
> accomplish
> exactly that end.

But your argument in favor of your claim is basically intuition, Eliezer.

You may have designed your system to achieve this end -- but I do not
believe your system will in fact achieve this end.

> > However, this would be very, very difficult to ensure, because
> every concept
> > in the mind is defined implicitly in terms of all the other
> concepts in the
> > mind. The pragmatic significance of a Friendliness FeelingNode
> is defined
> > in terms of a huge number of other nodes and links, and when a Novamente
> > significantly self-modifies it will change many of its nodes and links.
> > Even if the Friendliness FeelingNode always looks the same, its meaning
> > consists in its relations to other things in the mind, and these other
> > things may change.
> That's why a probabilistic supergoal is anchored to external referents in
> terms of information provided by the programmers, rather than
> being anchored
> entirely to internal nodes etc.

Yes, we agree on this, although we use slightly different language to
describe the same thing.

This point is valid whether one has a rigidly enforced hierarchical goal
structure as you advocate, or a more flexible goal structure as in the
current Novamente design.

> If Novamente were programmed simply with a static set of supergoal content
> having only an intensional definition, then yes, it might drift very far
> after a few rounds of self-modification. This is why you need the full
> Friendliness architecture.

No: this is why you need the system to interact with humans as it grows and
To me, this implies nothing about the need for a rigid hierarchical goal

> The rest of the above statement, as far as I can tell, represents two
> understandable but anthropomorphic intuitions:
> (a) Maintaining Friendliness requires rewarding Friendliness. In humans,
> socially moral behavior is often reinforced by rewarding individually
> selfish goals that themselves require no reinforcement. An AI, however,
> should work the other way around.

I did not say this and do not agree with this. My statement was rather that
maintaining a concept of Friendliness close to the human concept of
Friendliness *may* require continual intense interaction with humans. This
says nothing about reward or punishment, which are very simplistic and
limited modes of interaction anyway.

> (b) Novamente will be "socialized" by interaction with other humans.
> However, the ability of humans to be socialized is the result of
> millions of
> years of evolution resulting in a set of adaptations which enable
> socialization. Without these adaptations present, there is no reason to
> expect socialization to have the same effects.

I did not say and do not believe that socialization will have the same
importance or effects for Novamentes (or other AIs) as for humans.

I think it will serve a related but different purpose for Novamentes than
for humans -- but still an important purpose. I can give more details on
this later; I don't have time tonight!

I find that sometimes you use the term "anthropomorphic" as a kind of
generic dismissal of arguments with which you disagree. There are going to
be some similarities and some differences between human minds and AI minds.
Of two positions about AI, the one that emphasizes similarity with humans
more greatly is not always going to be wrong. "More anthropomorphic" does
not intrinsically mean "less true." I realize that you do not explicitly
say or believe anything as simplistic as "more anthropomorphic means less
true", but you often give the impression that you feel/think this way.

Excessive anthropomorphism is a common error in thinking about AI's, of
which we are both quite aware.

However, it is also a common error to make overly strong assumptions about
the *rationality* and "logical" nature of AI's. I am afraid that your
intuitions underlying your claims about the CFAI goal architecture sometimes
fall into this trap. I feel your intuition doesn't always adequately
appreciate the self-organizing and unpredictable nature of mind.

You may say that this is an anthropomorphism, that digital minds won't be
"self-organizing and unpredictable" to the extent that human minds are. You
could be right, but here our intuitions just differ. My intuition is
somewhat influenced by a broad study of natural complex systems, and I
recognize that an AI is not a natural complex system, it's an engineered
complex system. But I have a feeling that intelligence requires
self-organizing complexity, which implies unpredictability, which makes the
kind of rigid goal structure that you propose unworkable, and makes the
intuition underlying your claims for the CFAI goal architecture seriously


Overall, I think the problem with this long-running argument between us is:

1) You don't really know how the Novamente goal system works because you
don't know the whole Novamente design

2) I don't really know how your CFAI system would work in the context of a
complete AI design, because I don't know any AI design that incorporates
CFAI (and my understanding is, you don't have one yet, but you're working on
it)

I can solve problem 1 by giving you detailed information about Novamente
(privately, off list), though it will take you many many days of reading
and asking questions to really get it (it's just a lot of information).

Problem 2 however will only be solved by you completing your current AI
design task!!

I don't mean to say that I'll only accept your claims about the CFAI goal
architecture based on mathematical or empirical proof. I am willing to be
convinced intuitively by verbal, conceptual arguments that make sense to me.
But so far, your published (online) arguments don't, although I find them
stimulating and interesting.

-- Ben G
