**From:** Ben Goertzel (*ben@goertzel.org*)

**Date:** Mon Feb 20 2006 - 18:27:17 MST

**Next message:** BillK: "Re: Friendliness not an Add-on"
**Previous message:** Marcello Mathias Herreshoff: "Re: Friendliness not an Add-on"
**In reply to:** Marcello Mathias Herreshoff: "Re: Friendliness not an Add-on"
**Next in thread:** BillK: "Re: Friendliness not an Add-on"

Marcello,

> Here's the problem. If you didn't know what algorithm the AI proper was
> using, and you had no log file, you would run up against Rice's theorem here.
> However, we do know what algorithm the AI is using and we might have a log
> file. These are the only things preventing us from running into Rice's theorem.

Agreed.
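To make the point concrete, here is a toy sketch in Python (the log format, axioms, and inference rule are invented purely for illustration, not anything from an actual AI system): a checker that replays a deductive log against a set of axioms. No Rice-style obstacle arises, because the checker never has to decide a semantic property of an arbitrary black-box program -- it only verifies, step by step, the justification that the known algorithm logged.

```python
def check_log(axioms, log, conclusion):
    """Replay a deductive log. Each entry is either ('axiom', p), citing a
    known axiom, or ('mp', p, q), applying modus ponens: from p and
    ('->', p, q) derive q. Returns True iff every step is licensed and the
    conclusion was actually derived."""
    derived = set()
    for entry in log:
        if entry[0] == "axiom":
            fact = entry[1]
            if fact not in axioms:
                return False          # cited something that is not an axiom
            derived.add(fact)
        elif entry[0] == "mp":
            p, q = entry[1], entry[2]
            if p in derived and ("->", p, q) in derived:
                derived.add(q)
            else:
                return False          # modus ponens step not licensed yet
        else:
            return False              # unknown rule
    return conclusion in derived

# A three-step derivation of "Q" from toy axioms:
axioms = {"A", ("->", "A", "B"), ("->", "B", "Q")}
log = [("axiom", "A"), ("axiom", ("->", "A", "B")), ("mp", "A", "B"),
       ("axiom", ("->", "B", "Q")), ("mp", "B", "Q")]
print(check_log(axioms, log, "Q"))
```

Checking such a log is cheap and purely syntactic; the hard-to-analyze part of the system only has to *produce* it.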

> Therefore, this means there must exist some reasonably efficient translation
> algorithm for our AI's algorithm which will take in a conclusion and
> optionally a log file and output a deductive justification, which can then
> be checked. But, this is precisely what I meant by verifiable, so you can't
> do that for a non-verifiable architecture.

I don't find the language or reasoning in the above paragraph sufficiently clear. Let me try one more time to spell out the specific sort of situation I have been trying to discuss.

Suppose we have a system with a goal G, and suppose it arrives at a theorem Q of the form

Q = "Executing subprogram S starting at a time point in interval T will achieve goal G with probability at least p, based on knowledge-base K"

Suppose it proves this theorem Q using a proved-to-be-correct theorem-proving subsystem.

Suppose that the goal G embodies within itself an agreeable notion of Friendliness.

Now, suppose that the theorem Q was found via a hard-to-predict algorithm such as some variant of evolutionary programming, or some sort of heuristic, abductive inference train involving metaphorical leaps, etc.
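The essential structure here is a separation between an unpredictable generator and a trusted verifier. A deliberately tiny sketch (only the names G, S, p, and K follow the setup above; the classification task, the modulus classifiers, and the exhaustive check are all invented for illustration): a hard-to-predict process proposes candidate subprograms S, and a simple checker -- the only component whose correctness would need to be proved -- decides whether S achieves goal G on knowledge-base K with probability at least p.

```python
import random

random.seed(0)

# Toy knowledge-base K: cases x paired with the true label "x divisible by 3".
K = [(x, x % 3 == 0) for x in range(30)]

def achieves_goal(S, case):
    """Goal G for one case: the subprogram classifies the case correctly."""
    x, label = case
    return S(x) == label

def verify(S, p):
    """Trusted checker: accept S iff it meets goal G on K with rate >= p.
    This is the only component whose correctness matters for safety."""
    return sum(achieves_goal(S, c) for c in K) / len(K) >= p

def propose():
    """Hard-to-predict generator: a randomly chosen modulus classifier.
    Nothing about safety depends on how this search behaves."""
    m = random.randint(1, 10)
    return lambda x, m=m: x % m == 0

# Only subprograms certified by the verifier are ever accepted for execution.
S = propose()
while not verify(S, p=1.0):
    S = propose()
print(verify(S, p=1.0))
```

The point of the sketch: trust rests entirely on the small checker, so the generator can be as opaque as evolutionary search without weakening the guarantee attached to S.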

What is your claim about this kind of AI architecture? Are you claiming that it cannot be proved Friendly, even if the algorithmic information of the whole thing is less than that of the agent doing the proof?

I don't see why. It seems to me that one might be able to prove that this kind of system is reasonably able to achieve the goal G, as compared to other sorts of AI systems operating with similar amounts of computational resources.

In fact, my conjecture is that given a certain finite amount of computational resources, the best AI systems involving hard-to-predict hypothesis-generation subsystems will be *better* than the best AI systems that don't involve such subsystems. This suggests that the Friendliest AIs (those best able to optimize the Friendliness goal G) constructible given a certain amount of computational resources may be the ones involving hard-to-predict hypothesis-generation subsystems.

> Remember that if you do end up building a powerful enough AI system, the
> burden of proof regarding its safety lies with you. If you don't use a nicely
> formalized architecture, this step looks way harder.

I agree with this statement; however, it is not the case that using a hard-to-predict hypothesis-generation subsystem implies using a non-formalized or poorly formalized architecture. Evolutionary programming and estimation-of-distribution algorithms, for example, are quite thoroughly formalized.

-- Ben G


*This archive was generated by hypermail 2.1.5: Wed Jul 17 2013 - 04:00:55 MDT*