RE: [JOIN] Stephen Reed

From: ben goertzel (ben@goertzel.org)
Date: Mon May 20 2002 - 09:57:21 MDT


***
Cycorp is fairly well known but as an employee I can tell our story from
the perspective of the members of this mail list. We are working hard to
create a real AI. Along the way we have created a solid deductive
inference engine, the world's largest commonsense knowledge base,
incorporated a planner, and with a staff of computational linguists, made
a good start towards a hookup of Natural Language processing and the Cyc
KB.
***

Welcome to the list! I am very pleased to see someone from the Cyc project
join up!

I have studied Cyc somewhat closely -- I taught CycL in my AI class this
semester, and talked to Lenat as well as the Cycorp CEO (I forget his name
at the moment!) about a year ago, when I was considering the possibility of
going to work there.

But I decided the "philosophical mismatch" was too great... and I think
they decided so as well... we both kind of let the conversation drop...

I think the Cyc database will be a useful resource for an AGI someday, but
I personally don't have much optimism that the AI approach being taken at
Cycorp is going to lead to an AGI of any robustness and real generality,
flexibility and creativity....

My main objection is not to the CycL language -- I would use a different
knowledge representation formalism, but the CycL language is largely
convertible into the formalisms I prefer.
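
To give a flavor of what I mean by "largely convertible": here is a toy
Python sketch mapping a flat ground CycL assertion into a generic
node-and-link form. The Link structure is invented for this e-mail (it is
not Novamente's actual representation), and nested formulas and
quantifiers would of course require real work:

from dataclasses import dataclass

@dataclass
class Link:
    relation: str          # e.g. "isa"
    arguments: tuple       # e.g. ("Dog", "Mammal")
    strength: float = 1.0  # placeholder truth value

def cycl_to_link(assertion: str) -> Link:
    """Convert a flat ground CycL assertion, e.g. '(#$isa #$Dog #$Mammal)'."""
    tokens = assertion.strip("() \n").split()
    # Strip the '#$' prefix CycL uses to mark its constants.
    relation, *args = (t.removeprefix("#$") for t in tokens)
    return Link(relation=relation, arguments=tuple(args))

print(cycl_to_link("(#$isa #$Dog #$Mammal)"))
# -> Link(relation='isa', arguments=('Dog', 'Mammal'), strength=1.0)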

I'd frame my main objections as follows:

1) The knowledge being entered into the Cyc DB is of a fairly restricted
nature, by and large. Crisp, clean propositions about named concepts are
useful but may not get you that far in terms of representing all the
knowledge needed by a mind. Compared to the knowledge in Cyc, I think that
most knowledge in a real mind is either

a) a lot more closely grounded in perception and action, OR
b) a lot more abstract and nebulous, involving complex concepts that are
difficult to name in single words or brief phrases.

These types of knowledge are hard for human encoders to capture in explicit
formulas, even though *in principle* they can be expressed in logical form.
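
To make the contrast concrete, here's a toy Python sketch of the three
kinds of knowledge; every structure in it is invented for illustration,
and none of it is drawn from Cyc or any other real system:

# Crisp proposition about named concepts -- the kind Cyc encodes well:
crisp = ("isa", "Dog", "Mammal")

# (a) Perception/action-grounded knowledge: tied to raw sensor features
# and motor routines rather than to named concepts.
grounded = {
    "percept": "visual-region-42",         # hypothetical sensor region ID
    "features": [0.12, 0.87, 0.33, 0.91],  # e.g. color/texture channels
    "action": "reach-and-grasp",
}

# (b) Abstract, nebulous knowledge: a concept with no one-word name,
# characterized only by a fuzzy cluster of exemplars with graded
# membership degrees.
nebulous = {
    "exemplars": {"bittersweet-victory": 0.9, "ironic-eulogy": 0.7},
}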

2) I don't like the way you represent uncertainty in CycL. Compared to
point 1, this is a very minor conceptual point. But I think that imposing a
Bayes net on each small, localized subset of knowledge just doesn't come
close to capturing the way uncertainty is used in the human mind, or must
be used in any robust AGI. For one thing, I think that a single-number
truth value won't do it; for another, I think that the DAG structure of a
Bayes net is too restrictive for a lot of knowledge domains. I don't think
that one pdf can be imposed on the whole knowledge base of a mind, nor do I
think that the mind can be segmented into little sections with a pdf
imposed on each one. I think that richer information about uncertainty
must be stored, and this must be used to construct pdfs on the fly, in a
goal-directed way, in the course of reasoning.
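
As a concrete -- and loudly hypothetical -- illustration of "richer
information about uncertainty": suppose each relation stores not just a
strength but also an evidence count, from which an interval (or a full
pdf) can be constructed on demand. The scheme below is invented for this
e-mail; it is neither CycL's actual machinery nor Novamente's:

from dataclasses import dataclass

@dataclass
class TruthValue:
    strength: float  # observed frequency with which the relation holds
    count: float     # amount of evidence behind that frequency

    def interval(self, k: float = 10.0) -> tuple:
        """Build an interval on the fly: wide when evidence is scarce
        (count << k), collapsing toward the point strength as count grows."""
        width = k / (self.count + k)
        lo = max(0.0, self.strength - width / 2)
        hi = min(1.0, self.strength + width / 2)
        return (round(lo, 3), round(hi, 3))

print(TruthValue(strength=0.8, count=3.0).interval())    # (0.415, 1.0): wide
print(TruthValue(strength=0.8, count=300.0).interval())  # (0.784, 0.816): narrow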

3) I worry that the predicate-logic-based reasoning engines you guys have
are far too "brittle" and not adequate for carrying out creative, complex
trains of thought that lead significantly beyond what's explicitly there in
the axioms. This is the area I know least about, because there is not as
much published on Cycorp's reasoning engines as on CycL. I think that any
approach that takes logical reasoning as the center of mental activity, and
views other things as ancillary to logical reasoning, is bound to fail. I
think that a reasoning engine has to be designed based on its behavior
*integrated with other mental subsystems* such as imaginative concept
creation, context-appropriate control of inference and concept creation,
communication, noninferential association formation, etc. Of course I
understand that Lenat and others on the Cyc team have *thought* about these
things, but my impression is that the vast majority of work has gone into
logical inference, and I think this is very unlikely to lead to an
inference engine that will work properly in the context of a whole mind. (I
could go on about this point in a huge amount of detail, of course -- it
leads to a lot of very interesting technical issues -- but this is intended
to be a brief e-mail.)
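
Here is a caricature of that design point in a few lines of Python --
every function and weight below is a made-up stub, not a description of
any real system. The idea is that inference is just one process among
several, scheduled by a controller that adapts to how useful each process
has been in the current context, rather than sitting at the center with
everything else as a helper:

import random

# Stub cognitive processes; each would inspect shared memory and return
# a crude utility score for the progress it made (hard-coded here).
def logical_inference(memory):      return 0.6
def concept_creation(memory):       return 0.4
def association_formation(memory):  return 0.5

weights = {logical_inference: 1.0,
           concept_creation: 1.0,
           association_formation: 1.0}

def cognitive_step(memory):
    procs = list(weights)
    proc = random.choices(procs, weights=[weights[p] for p in procs])[0]
    utility = proc(memory)
    # Processes that prove useful get scheduled more often in this context.
    weights[proc] = 0.9 * weights[proc] + 0.1 * utility
    return proc.__name__

memory = {}
for _ in range(5):
    print(cognitive_step(memory))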

4) Regarding NLP, it is good that Cyc is finally taking an interest in
this. When my friend Karin Verspoor, a computational linguist, interviewed
there about 5 years ago, there seemed to be no significant work in this
direction going on at Cycorp (hence she took a different job, at M$
Research in fact). I can see how Cyc is well positioned to exceed current
computational NLP approaches, by hybridizing (now-)standard corpus
linguistics approaches with lookups into the Cyc DB. This can be very
helpful for such things as semantic disambiguation, reference resolution of
various sorts, etc. However, I think that getting the syntax/semantics
interface right (the hard part of NLP) is not possible in a system that
focuses so narrowly on *logical inference from crisp or (Bayes net)
probabilistic propositions about named concepts*. I think that a lot of
the syntax/semantics interface has to be done by *a combination of logical
inference with other cognitive processes*, acting on *crisp, concise
propositions about named concepts, together with knowledge more closely
grounded in perception/action, together with more abstract and nebulous
intuitive knowledge*.
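
Here's the kind of hybrid I have in mind, as a deliberately tiny Python
sketch. The sense inventory, the "relatedTo" mini-KB, and the scoring
rule are all invented for illustration; the KB dictionary is a stand-in
for what would really be lookups into the Cyc DB:

# Hybrid word-sense disambiguation: a corpus-derived sense prior,
# re-weighted by KB consistency with the rest of the sentence.
CORPUS_PRIOR = {   # P(sense | word), e.g. from sense-tagged corpora
    ("bank", "financial-institution"): 0.7,
    ("bank", "river-edge"): 0.3,
}

KB = {             # toy stand-in for Cyc: (sense, relation, concept)
    ("financial-institution", "relatedTo", "money"),
    ("river-edge", "relatedTo", "water"),
}

def disambiguate(word, context_concepts):
    senses = [s for (w, s) in CORPUS_PRIOR if w == word]
    def score(sense):
        prior = CORPUS_PRIOR[(word, sense)]
        # Boost senses the KB connects to concepts already in the sentence.
        kb_hits = sum((sense, "relatedTo", c) in KB for c in context_concepts)
        return prior * (1 + 2 * kb_hits)
    return max(senses, key=score)

print(disambiguate("bank", {"water", "fish"}))  # -> 'river-edge'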

Of course, one reaction you might have to this is "Sure, I kind of agree
with your points, but our current system is not supposed to be a true AGI,
just a partway-there system.... We're approaching the problem from the
direction of 'crisp, concise propositions about named concepts' and
'logical inference with predicate logic & Bayesian probabilities'. We
know we're far from a true AGI, but so is everybody else, and we think what
we're doing now may be a good platform for future work in directions
similar to the ones you suggest."

If this is your reaction, then what I would like to see is a paper or book
addressing the "grand Cyc vision" -- i.e. what do you (or Lenat etc.)
envision Cyc (the database and the AI system) looking like in, say, 5-20
years, when you have overcome limitations like the ones I've cited above
and built a more fully fleshed-out AI system? Are there internal Cycorp
documents addressing this sort of thing?

In terms of my own work, I think the Cyc DB could potentially be
interesting to feed into Novamente (my own AI system, which is more
integrative in nature and less strictly focused on logic, although it does
have a probabilistic inference component; see www.realai.net for some
nontechnical info). I suspect I could also learn something from the way
you've specialized Cyc's inference engines to deal with particular types of
information, even though Novamente's inference engine does not use a
predicate logic foundation.

-- Ben Goertzel


