Chapter 2



Artificial Intelligence

UNIT I

|CO.1 |Explain the concept behind problem representation paradigms & its characteristics, production system and defining problem as a state space |

| |representation. |

1.1 WHAT IS AI? HISTORY & APPLICATIONS

[pic]

[pic]

1.2 ARTIFICIAL INTELLIGENCE AS REPRESENTATION & SEARCH

AI Problems

AI is the study to make computers do things which at the moment, people do better fails to include some areas of potentially very large impact, namely problems that cannot now be solved well by either computers or people .

AI Problems

Samuel – Wrote a checkers playing program that not only played games with opponents but also used its experience at those games to improve its later performance.

Computers could perform well at those tasks simply by being fast at exploring a large number of solution paths & then selecting the best one. it was thought that this process required very little knowledge & could therefore be programmed easily later this assumption turned out to be false since no computer is fast enough to overcome the combinational explosions generated by most problems .

• Common sense reasoning: - reasoning of objects, Sequence of actions and consequences. Ex if you let go of something, it will fall to the floor and may be break.

GPS- General Problem Solver applied to several common sense task Newell, Shaw & Simon.

• Perception – Vision, Speech

• Natural Language – Understanding, Generation testing

AI Flourishing most as a Practical discipline as opposed to a purely research are primarily the domains that require only specialized expertise without the assistance commonsense knowledge.

Before embarking on a study of special AI problems & solutions techniques. It is to discuss if not to answer following :

1) What are our underlying assumption @

2) What kinds of techniques will be useful for solving AI problems?

3) At what level of details if at all we trying to model human intelligence?

4) How will we know when we have succeed in building an intelligent program?

Task Domain of AI

|Mundane Tasks |Formal Task |Expert Task |

|Perception |Games | Engineering |

|Vision |Chess |Design |

|Speech |Backgammon |Fault finding |

|Natural language |Chekers |Manufacturing planning |

|Problem of understanding |Go | |

|Spoken language is a perceptual problems |Mathematic |Scientific Analysis |

|               – Generation |Geometry |Medical Diagnosis |

|Translation |Logic |Financial Analysis |

|Commonsense reasoning |Integral calculus | |

|Robot Control |Providing ppts | |

| | | |

| | | |

| | | |

The Underlying Assumption:-

Physical Symbol system hypothesis:

- Newell & Simon – describe the physical symbol system as Consist of a set of entities called a symbol – expressions/patterns another type of entity- symbol structure – composed of no. of instances/ token related besides these structures, the system also contain a collection of process that operate on expressions to produce other expressions. Process of creation, modification, reproduction & destructions it is a machine that produces a through time an evolving collection of symbol structures.

The physical symbol system hypothesis:-

A physical symbol system has the necessary & sufficient means for general intelligent action by experimentation hypothesis is only a hypothesis. No way of providing or disprove it on logical grounds we may find that it is false – bulk of evidence – true but only way to determine its truth is by experimentation.

Computer- provides the perfect medium for this experimentation since intelligence – programs to perform selected task such as people.

As it has become increasingly easy to build computing machines so it has become increasingly possible to conduct empirical investigations of the physical symbol system hypothesis.

Attempt to reduce a particularly human activity the understanding of jokes, to a process of symbol manipulation is provided in the book mathematics and humor.

Physical symbol system will prove able to model some aspects of human intelligence & nor others.

Physical symbol system by hypothesis is two fold.

What is an AI technique?

AI problems spans a very broad spec are there any techniques that are appropriate for the solution of a variety of these problem besides the fact that they manipulate symbol.

How-technique is useful in solving AI tasks.

Intelligence requires knowledge, knowledge possess some properties

• It is voluminous.

• It is hard to characterize accurately.

• It is constantly changing.

• It differs from data by being organized in a way that corresponds to the ways it will be used. For ad to conclude that AI technique is a method that exploits knowledge that should be represented in such a way that.

• Knowledge captures generalizations.

• Situation that share important properties are grouped together else more & updating will be required who must provide it.

• It can easily be modified to correct errors & to reflect changes in the world & in our world view.

• It can be used in a great many situation even if it is not totally accurate or complete.

• It can be used to help overcome its own sheer _ _ _ _ _ _ _ _ _ _ _ __ _ _ __ .

• The range of possibilities that must usually be considered.

• There is some degree of independence between problems & problems solving techniques.

• Problems & series of approaches for solving each of them.

Ex: Tic tac – toe

Their complexity

Their use of generalizations.

The clarity of their knowledge.

The extensibility of their approach.

Representation of what is called AI techniques.

|1 |2 |3 |

| 4 |5 |6 |

|7 |8 |9 |

O – blank – 2

1 – x – 3

2 – 0 – 5

Question Answering

Three important AI techniques.

Search : Frame work in which any direct techniques can be embedded

Use of knowledge.

Abstraction :- Provides away of separating important features & variations from the many unimportant ones that would otherwise overwhelm any process.

It is not possible give a precise definition of an AI techniques.

The level of the model

What we are trying to do our goal to make intelligent things

Modeling human intelligence

Of easiest way to do things

Syllabus dictionary phrases computers can do non AI problems.

2nd Class of problems.

Model human intelligence commuters learn newspaper & answer the Q’s.

2) To enable computers to understand human reasoning.

3) To enable people to understanding computer reasoning.

4) To exploit what knowledge we can learn from people. Clues how to proceed.

5) To test psychological theories of human performance behavior of a paranoid person on a system perminal.

6) Level of individual newrons.

Human cognitive theories.

Goal of simulating human performance.

Goal of building an intelligent program.

In either way we need a good model of the processes involved in intelligent reasoning.

Field of cognitive science in which psychologists linguistics & computer scientists work together.

Criteria for success.

How well we knowledge have succeeded.

What is intelligence?

Alan Turing proposed the followed method of determining whether machine can think.

Turing Test.

2 – people & the machine to be evaluated.

Ask Question by typing

Interrogator ----------------------------- person/computer.

Separate Room knows only A & B.

Aims to determine which is the person & which is the machine.

Goal of machine is to fool the interrogator into believing that it is the person. If machine succeed them conclude machine thinks.

Chess rating in a same way as human

DENDRAL program to analyzes organize compounds to determine their structure programs that meets some performance standard for a particular to criteria for success.

1.3 PRODUCTION SYSTEM, BASICS OF PROBLEM SOLVING

*Production system:-

Structure the AI Problems in a way that facilitates describing & performing the search process. Production system provides such structures a set of rules each consisting of left side – that determines the applicability of the rule & a right side – that describes the operation to be performed Knowledge/ database – information must be structured in any appropriate way.

Control strategy: - to match to database & resolve conflict when several a rule applier

Family of general production system interpreters. Basic production system language, such as ops5 * Act. More complex hybrid system expert system shells knowledge based expert system.

General problem solving architectures like SOAR, a system based on a specific set of cognitively motivated hypothesis about the nature of problem solving the process of solving the problem can useful is modeled as a production system.

Control strategies

• The first requirement of a good control strategy is that it causes motion.

Ex Water jug problem ( indefinitely filling the 4 gallon jug.

• The second requirement of a good control strategy is that it be systematic.

Strategy causes motion but it is likely to arrive at the same state several times during the process – exploring a useless sequence of operators several times before we finally find a solution.

To build a system to solve a particular problem – 4 things.

1) Define the problem precisely. Initial situations, Final situations constitute acceptable solution.

2) Analyze the problem.

Important features

3) Isolate & represent the task knowledge.

4) Choose best problem solving technique & apply it them.

Problem classification: -

Generic ctrl strategy i.e. appropriate for solving a problem.

Production system characteristics:-

Diff classes of problems production system are a good way to describe the operations that can be performed in a search for a solution to a problem.

Two Q’s

1. can production system like problems, be described by the set of characteristics that shed some light on how they can easily be implemented

2. if so what relationship are there between problems types & the types of production systems best suited to solving the problems?

Monotonic Production system A.P.S in which the application of a rule never prevents the later application of another rule that could also have been applied at the time of the list rule was selected.

A non monotonic production system is one in which this is not true.

A partially commutative p.s is a p.s with the property that if the application of a particular sequence of rules transforms state X into state Y then any permutation of these rules i.e. allowable also transforms state x into state y

A communicative production system is a p.s that is both monotonic & partially communicative kinds of problems & kind of production system all problems can solved by all kinds of system.

Monotonic Non Monotonic

|Partially Commutative |Theorem providing |Robot Navigation |

|Non Partially commutative |Chemical synthesis |Bridge. |

Pcom- monot ( Ignorable pbms

Implemented without the ability to backtrack.

Non Monot – Partially commute – useful for problems in which changes occur but can be reversed & in which order of operation is not critical for (North), south, east, west.

To execute the path is not important

8 puzzles, & block problems are partially commutative.

Non partially commutative – useful in which irreversible changes occur chemical reaction – order is important in irreversible process it is particularly the first time .

1.4 EXAMPLE-WATER JUG PROBLEM

[pic]

[pic]

1.5 PROBLEM REPRESENTATION PARADIGMS

Play chess specify

• Starting position of the chess board.

• The rulers that define the legal moves.

• Board position that represent win implicit goal of not only playing but winning goal state king is under attack.

• 8x* array.

• Moves can be described as a set of rules two parts.

• Left side pattern to be matched against the current board position.

• Right side that describe the change to be made to the board position to reflect the move separate rule roughly 10120 possible board position.

• So many rules practical difficulties

• No person could ever supply a complete set of such rules. It would take too long & could certainly not be done without mistake.

• No program could easily handle those rules more storing problems.

To minimize the such problems

Use some convenient notations for describing pattern & substitutions

EX:

white pawn at

Square (file e, rank 2)

And

Square (file e, rank 3)

Is empty move pawn from square (file e, rank 2) to square (file e, rank 4)

And

Square (file e, rank 4)

Is empty

Problems of playing chess as a problem of moving around in a state space where each state corresponds to a legal position of the board. The state space representations form the basic of most of the AI methods. Its structure corresponds to the structure of problem solving in two important ways.

• It allows for a formal definition of a problem as the need to convert some given situation into some desired situation using on using a set of permissible operations.

• Permit to define a process for solving a problem with a combination of techniques search reach rule to find some path.

• Search is a very important process in the solutions of hard problems for which no more direct technique are available.

State space representation ex 4 gallon – 3 gallon no marker pumps how u can get exactly 2 gallons of water into the 4 gallon jug.

State space can be described as the set of ordered pairs of integers(x,y) such that x=0,1,2,3,4 & y=0,1,2,3 start state (0,0), goal(2,n).

Operators to be used to solve the problem left sides matched against the current state right side is the change after applying control structure that loops certain rules several ways of making selection If the 4 gallon jug is not full fill it (Stupid if the jug is full)

Ex Production rules for the water jug problem.

1) (x,y) --> (4,y) Fill the 4 gal jug

2) If x 9 (to generate a carry)

& M = 1

S + 1 + C3 > 9 & C3 is atmost 1

S + C3 > 8

O= 0 S + 1 + C3 ( 9

So N + R > 8

E < > 9 N + R cannot be greater than 18 even with a carry in i.e. G =1.

So E cannot be 9

Assume

Now no more constraints are generated then to make progress from here we have to guess.

Let E be assigned suppose E = 2 since it occurs three times.

Next cycle begins

N = E + 1 = 2 + 1 = 3

R = 8 or 9 N(3) + R + C1 (1,0) = 2 or but since N is already 3 the sum of these non –ve numbers cannot be less than 3.

R + 3 (0+1) = 12 and R = 8 or 9.

y.2 + D = Y OR 2 + D = 10 + Y e1 is generated.

Again no further constraints is generated a guess is required.

When C1 = 1 --- eventually reach deal end the system backtracks & try C1 = 0

2 + D = 10+y

D = 8 + y D = 8 or 9 already E =2 & y = 0 or i conflict

S; R, D cannot have same values i.e. 8 or 9 .

So C1 cannot be 1 it is realized initially some scarch could have been avoided.

Constraint propagation were not so sophisticated. Depends on reasoning.

Sophisticate constraint – in which the specific cause of the inconsistency is identifified & only constraints that depend on that culprit are undone. Others left out. This approach is called Dependency directed backtrack.

C1 = 0 2 + D = y

N + R = 10 + E R = 12 – 3 = 9 S = 8

D = 5 y = 6 ; D = 4 y = 7

2.8 Means End Analysis

Means – End Analysis:-

A collection of search strategies that can reason either forward or backward, but for a given problem, one direction or the other must be chosen. Often however a mixture of the two directions is appropriate.

Such a mixed strategy would make it possible to solve the major parts of problems first & then go back and solve the small problems that arise in “gling” the big process together. A technique known as Means – ends. Analysis is allows us to do that.

• Differences between the current state & the goal.

• Operator that can reduce the differences is formed.

• Operator cannot be applied to current state set up a sub problem of getting to a state to which it can be applied.

Ex. General Problem Solver.

Backward chaining in which operators are saluted & then sub goal are set up to establish the preconditions of the operators is called operator sub goaling.

Ex. Mathematical logic as the representation formalism.

Consider the English sentence.

Spot is a dog.

Represented in logic as dog (spot)

Logical representation of the fact that

All dogs have tails.

Vx: dog(x) has tail (x)

Using deductive mechanism of logic new representation object.

Has tail (spot)

Using backward mapping function generate English sentence spot has a tail.

Mapping functions are not one – to – one. In fact they are not even functions but rather many two sentences.

Ex. “All dogs has a at least one tail or the fact that each dog has several tails.

What facts the sentences represent & then convert those facts into the new representations.

UNIT III

|CO.3 |Explain the fundamentals of knowledge representation (logic-based, frame-based, semantic nets), inference and theorem proving ,Know how to build |

| |simple knowledge-based systems. |

3.1 KNOWLEDGE REPRESENTATION ISSUES

Knowledge Representation Issues

Ex best first knowledge.

It becomes clear that particular knowledge representation models allow for more specific more powerful problem solving mechanisms that operate on them.

Examine specific techniques that can be used for representing & manipulating knowledge within programs.

Representation & Mapping:-

Large amount of knowledge & mechanisms to manipulate that knowledge to create solutions to new problems.

-A variety of ways of representing knowledge (facts)

Two different kinds of entities.

• Facts :- truths in some relevant world

These are the things we want to represent.

• Representations of facts in some chosen formalism.

Things we are actually manipulating. Structuring these entities is as two levels.

• The knowledge level, at which facts concluding each agents behavior & current goals are described.

• The symbol level at which representations of objects at the knowledge level are defined in terms of symbols that can be manipulated by programs.

English understanding English generation

Manipulate I Rs. Of the facts it is given Reasoning programs. These manipulating results inner structures stored as R fig shows How three kinds of objects relate to each other ? Mapping between facts & Representations.

Focus on facts, Representations, and on the two mappings that must exist between them.

L as representations mappings.

Forward representation mapping. Maps from facts to representations

Backwards representation mapping goes the other way from representations to facts.

Representation – natural language particularly English sentences.

English Representation of those facts in order to facilitate getting information into and out of the system.

Mapping functions from English sentences to the representation. We are actually going to use & from it back to sentences.

Lines represent – attributes.

Boxed nodes – objects & values of attributes of objects.

Arrows – point from an object to its value along the corresponding attribute line slot & filler structure. Semantic network or a collection of frames.

viewing a Node as a Frame

Base ball player

Is a : Adult male.

Bats : (Equal Nanded)

Height: 6-1

Batting average: 252

Answer to the following queries.

• Team (Pee-Wee-Reese) = Brooklyn Dodgers.

• This attribute had a value stored explicitly in the knowledge base

• Batting – average (Three – finger – Brown ) = .106

- Instance attribute to Pitcher & extract the value stored there – best guesses.

In the face of a lack of more precise information in fact in 1906 Browns Bathing avg was .204 Height (Pee –wee – Reese) = Right.

• Bats (three-finger – Brown) = Right. Is a hierarchy – Baseball player?

Rule for computing a vale – (that for handed) as i/p person – Right handed.

❖ Inferential Knowledge:-

Property inheritance is a powerful form of inference.

Procedures some of which reason forward from given facts to conclusion.

Resolution which exploits a proof by contradiction strategy.

Logic provides a powerful structure in which to describe relationships among value it is often useful to combine this, or some other powerful description language, with is a hierarchy.

❖ Procedural knowledge:-

So far – knowledge have concentrated on relatively static declarative facts.

Another useful kind of knowledge is operational or procedural knowledge procedural knowledge can be represented in many ways. Simple code –

Code – more powerful since it makes exploits use of the name of the node whose value for handed is to be formed.

Use of production rules: they are argument with information on how they are to be used.

Important difference is in how the knowledge is used by the procedures that manipulate it.

Procedural knowledge as Rules.

If: ninth inning, and

Score is close, and

Less than 2 outs, and

First base is vacant, and

Batter is better hitter than next batter

Then: walk the batter.

Desired real reasoning

Forward representation backward representation

Mapping Mapping

Operation of

program

• Abstract reasoning process that a program is intended to model.

• Programmers do concrete implications of abstract concepts.

Approaches to knowledge Representation.

A good system for the representation of knowledge in a particular domain should possess the following four properties:-

• Representational adequacy the ability to represent all of the kinds of knowledge that are needed in that domain.

• Inferential Adequacy: - the ability to manipulate the representation structures in such a way as to derive new structures corresponding to new knowledge inferred from ol.

• Inferential Efficiency: - the ability to incorporate into the knowledge structure additional information that can be used to focus the attention of the inference mechanism in the most promising directions.

• Acquisitioned Efficiency: - the ability to acquire new information easily. The simplest case involves direct insertion by a person of new knowledge into the database.

No single system that optimizes all of the capabilities for all kinds of knowledge has yet been found. As a result multiple techniques for knowledge representations exist.

• Simple Relational Knowledge

|Player |Height |Weight |Bats - Throws |

| | | | |

| | | | |

| | | | |

| | | | |

• Very weak inferential capabilities not possible to answer.

Who is the heaviest player but if procedure is provided then these facts will enable the procedure to compute an answer.

• Providing support

❖ Inheritable Knowledge

is a

height

is a height

equal to handed s height

bats

is a is a

Batting average

Instance instance -- class membership

Useful form of inference is property inheritance, in which elements of specific classes inherit attributes & values from more general classes in which they are included.

Object are organized into classes --- must be arranged in a generalization hierarchy.

Issues in knowledge Representation.

• Are any attributes of objects so basic that they occur in almost every problem domain? If there are, we need to make sure that they are handled appropriately in each of the mechanisms we propose. If such attributes exist, what are they?

• Are there any important relationships that exist among attributes of objects

• At what level should knowledge be represented? Is there a good set of primitives into which all knowledge can be broken down? Is it helpful to use such primitives?

• How should sets of objects be represented?

• Given a large amount of knowledge stored in a database how can relevant parts be accessed when they are needed?

Important Attributes: - two attributes.

Instance & is A – they support properly inheritance they represent class membership & class inclusion or by predicate logic.

Relationship among Attributes.

Attributes themselves are the entities what properties do they have independent of the specific knowledge they encode for properties.

• Inverses: - represents both relationships in a single representation that ignores focus. Team (Pee-wee-Reese, Brooklyn- Dodgers)

How it is used depends on assertions.

• 2nd approach is to sue attributes that focus on a single entity but to sue them in pairs. One the inverse of the other.

Team = Brooklyn –Dodgers

Team members = Pee – Wee – Reese

❖ An is a Hierarchy of attributes – specializations of a attributes height specialization physical size --- physical attribute generalization.

❖ Techniques for Reasoning about values

• Information @ the type of the value

Height – measured in a unit of length.

• Constraints on the value often stated in terms of related entities. Age of a person – cannot be greater then parent age.

• Rules for computing the value when it is needed for bats attribute – backwards Rules also called as if needed rules.

• Rules that describe actions that should be taken if a value ever becomes known called – forward rules -- or if added rules

❖ Single valued attributes :-

Baseball player at any time can have only a single height & be a member of only one team introduces an explicit notation for temporal interval. If how different values are ever asserted for the same temporal interval, signal a contradiction automatically.

• Assume that the only temporal interval that is of interest is now. So if a new value is asserted replace the old value.

The Frame Problem:-

How to represent efficiently sequences of problem states that arise from a search process.

Consider the world of household robot facts like on (Plant 12, Table 34) under (Table 34, window 13) and in (Table 34, Room 15)

Too big no back Chair no back, too wide

too high, no back

Stool

Table

sideboard

drawn Desk no knee room

Fig. 4.11 A similarity Net

Thus whole problem of representing the facts that change as well as those that do not is known as the frame problem.

Frame axioms—in robot world – a table with a plant on it under the window, suppose we move the table to the centre of the room we must also infer that the plant is now in the center of the room too but that the window is not.

3.2 FIRST ORDER LOGIC, PREDICATE LOGIC

Using Predicate logic

Representing facts – the language of logic

Logic symbols “–” (material implication) “–” (not) “v” (OR) “Λ” (and) “” (for all) and “” (there exists)

5.1 Representing simple facts in logic:-

Explore the use of propositional logic as a way of representing the sort of world knowledge that an AI system might need.

We can easily represent real world facts as logical propositions written as well formed formulas (wff) in proper logic as shown in fig.

It is raining RAINING

It is sunny SUNNY

It is windy WINDY

It is raining then it is not sunny

Represent the fact stated by the classical sentence

Socrates is a Man SOCRATESMAN

Plato is a man PLATOMAN

The two are totally separate assertion & we would similarities between Socrates and Plato. It would be much better to represent these facts as:

MAN (SOCRATES) & MAN (PLATO)

Since now the structure of the representation reflects the structure of the knowledge.

But to do that we need to be able to use predicates applied to arguments.

All men are mortal MORTAL MAN

Fails to capture the relationship between any individual being a man & that individual being a mortal needs variables and quantification tow write know separeat st about the mortality of ever prosperity wondered aimlessly before found a companion in you.

Representation as a set of wff’s in predicate logic

1. Marcus was a man.

Man (Marcus)

If fails to capture some of the information in the English sentrence, notion of past tense .

2. Marcus was Pompeian.

Pompeian (Marcus)

3. All Pompeian are Romans

Ex : Pompeian (x) Roman (x)

4. Caesar was a ruler

Ruler (Caesar)

5. All Roman were either loyal to caesar or hated him.

[pic] (x) : Roman (x) loyal to (x, Caesar) [pic] hate (x, Caesar)

Exclusive OR.

[pic] (x) : Roman (x)

6. Everyone is loyal to someone

[pic] (x) : Ǝy : loyal to (x,y)

Scope of quantifiers or Ǝy : [pic]x : loyal to (x,y)?

7. People only try to assassinate rulers they are not loyal to

[pic] x: [pic]y : person (x) Ʌ ruler (y) Ʌ

Try assassinate (x,y) ( loyal to (x,y)

8. Marcus tried to assassinate Caesar.

Try assassinate (Marus, Caesar)

Was Marcus loyal to Caesar ?

Loyal to (Marcus, Caesar)

Fig. An attempt to prove ( loyal to (Marcus, Caesar)

Loyal to (Marcus, Caesar)

↑ (7, substitution)

Person (Marcus) Ʌ ruler (Caesar) Ʌ

Try assassinate (Marcus, Caesar)

↑ (4)

Person (Marcus)

Try assassinate (Marcus, Caesar)

↑ (8)

Person (Marcus)

9. All men are people

[pic] x : man (x) ( person (x)

Three important issues in converting English sentences into logical statements & then using those statement to deduce new ones.

• Many English statements are ambiguous choosing the correct interpretation may be difficult.

• Choice of representation simple representation.

• Even in very simple situations, a set of sentences is unlikely to contain all the information necessary to reason about the topic at hand.

Representing Instance & Is a Relationships.

Captured the relationships they are used to express namely class membership & class inclusion.

Ist part already represented in the representations class membership is represented with unary predicates (such as Roman) each of which corresponds to a class. Asserting that P(x) is true is equivalent to asserting that x is an instance (or element) of P.

2nd Class use the instance is a binary one and the 1st argument is an object & 2nd is ______ to which the object belongs do not use an explicit is a predicate instead sublass relationship – (3)

3rd Contains representation that use both the instance & is a predicate explicitly is a simplifies the use of (3) but it uses an extra axioms (6)

This additional axiom describes ho an instance relation & is a relation can be combined to derive a new instance relation.

Fig. Three way of Representing class membership.

1) Man (Marcus) 1) instance (Marcus, man)

2) Pompeian (Marcus) 2) instance (Marcus, Pomp)

3) [pic] x : Pomp (x) ( Roman (x) 3) [pic] x : instance (x, Pomp)( inst (x, Roman)

4) Ruler (Caesar) 4) instance (Caesar, ruler)

5)    [pic] x : Roman (x) ( loyal to (x, Caesar) 5) [pic] x : instance (x, Roman) (

˅ hate (x, Caesar) loyal to (x, Caesar) ˅ hate (x, Caesar)

1. Inst (Marcus, man)

2. inst (Mar, Pampion)

3. Is a (Pompeian, Roman)

4. Instance (Caesar, ruler)

5. [pic] x : instance (x, Roman) ( loyal to (x, Caesar) ˅ hate (x, Caesar)

6. [pic] x : [pic] y : [pic] z : instance (x,y) Ʌ is (y,2) ( instance (x, 2)

Forward Reason :

- Branching factor is great

- Use some heuristic rules for deciding which answer is more likely & then try to prove that one first

- If it fails efforts is loosed.

- Simpley try both answers simultaneously & stop one effort is successful.

3.3 Unification

UNIFICATION ALGORITHM

In propsoitional logic it is easy to determine that two literals can not both be true at the same time. Simply look for L and ~L . In predicate logic, this matching process is more complicated, since bindings of variables must be considered.

For example man (john) and man(john) is a contradiction while man (john) and man(Himalayas) is not. Thus in order to determine contradictions we need a matching procedure that compares two literals and discovers whether there exist a set of substitutions that makes them identical . There is a recursive procedure that does this matching . It is called Unification algorithm.

 

In Unification algorithm each literal is represented as a list, where first element is the name of a predicate and the remaining elements are arguments. The argument may be a single element (atom) or may be another list. For example we can have literals as

( tryassassinate Marcus Caesar)

( tryassassinate Marcus (ruler of Rome))

To unify two literals , first check if their first elements re same. If so proceed. Otherwise they can not be unified. For example the literals

( try assassinate Marcus Caesar)

( hate Marcus Caesar)

Can not be Unfied. The unification algorithm recursively matches pairs of elements, one pair at a time. The matching rules are :

i) Different constants , functions or predicates can not match, whereas identical ones can.

ii) A variable can match another variable , any constant or a function or predicate expression, subject to the condition that the function or [predicate expression must not contain any instance of the variable being matched (otherwise it will lead to infinite recursion).

iii) The substitution must be consistent. Substituting y for x now and then z for x later is inconsistent. (a substitution y for x written as y/x)

The Unification algorithm is listed below as a procedure UNIFY (L1, L2). It returns a list representing the composition of the substitutions that were performed during the match. An empty list NIL indicates that a match was found without any substitutions. If the list contains a single value F, it indicates that the unification procedure failed.

UNIFY (L1, L2)

1. if L1 or L2 is an atom part of same thing do

(a) if L1 or L2 are identical then return NIL

(b) else if L1 is a variable then do

(i) if L1 occurs in L2 then return F else return (L2/L1)

© else if L2 is a variable then do

(i) if L2 occurs in L1 then return F else return (L1/L2)

else return F.

2. If length (L!) is not equal to length (L2) then return F.

3. Set SUBST to NIL

( at the end of this procedure , SUBST will contain all the substitutions used to unify L1 and L2).

4. For I = 1 to number of elements in L1 do

i) call UNIFY with the i th element of L1 and I’th element of L2, putting the result in S

ii) if S = F then return F

iii) if S is not equal to NIL then do

(A) apply S to the remainder of both L1 and L2

(B) SUBST := APPEND (S, SUBST) return SUBST.

3.4 STRUCTURED KNOWLEDGE REPRESENTATION

Strong Slot & Filler Structures: -

No hard & fast rules @ what kinds of objects & links are good in general for knowledge representation. CD, scripts & cyc embody specific notions of what types of objects & relations are permitted.

10.1 Conceptual Dependency : - is a theory of how to represent the kind of knowledge about events that is usually contained in natural language sentences.

The goal is to represent knowledge that

• Facilitates drawing inferences from the sentences

• Is independent of the language in which the sentences were originally stated.

Conceptual primitives that can be combined to form the meanings of words in any particular language.

• Structure & a specific set of primitives.

• I gave the man a book.

I ATRANS book

• Arrows indicate direction of dependency

• Double arrow indicates two way link between actor & action.

• P indicates past tense.

• ATRANS is one of the primitive acts used by the theory it indicates transfer of possession

• O indicates the object case relation

• R indicates the recipient case relation.

A set of primitive acts. Actions are built.

ATRANS Transfer of an abstract relationship (e.g. Give)

PTRANS Transfer of the physical location of an object (e.g. go)

PROPEL Application of physical force to an object (e.g. push)

MOVE Movement of a body part by its owner (e.g. kick)

GRASP Grasping of an object by an actor (e.g. clutch)

INGEST ingestion of an object by an animal (e.g. eat)

EXPEL Expulsion of something from the body of an animal. (e.g. cry)

MTRANS Transfer of mental information (e.g. tell)

MBUILD Building new information out of odd (e.g. decide)

SPEAK Production of sounds (e.g. say)

ATTEND Focusing of a sense organ toward a stimulus (e.g. listen)

2nd Set – Set of allowable dependencies among the conceptualizations described in a sentence.

ACTS - Actions

PPs - objects (picture procedure)

AAs - modifiers of actions (action aiders)

PAs - modifiers of PPs (Picture aiders)

The set of conceptual tenses

P past ? interrogative

F future / negative

t transistion nil present

ts start transmission delta timeless

tf finished transition C conditional

K continuing

1) John ran John PTRANS

2) John is tall John height (>average)

3) John is a doctor John doctor

4) A nice boy boy

nice

5) John’s dog boy

nice

6) John pushed the car John PROPEL car

John

7) John took the book from Mary John ATRANS

Mary

book

* I gave the man a book

8) John ate ice cream with a spoon John INGEST

do

ice cream

spoon

9) John fertilized the field John PTRANS

Fertilized

size > x

10) The plants grew plants

size = x

11) Bill shot Bob Bill PROPEL bullet

Bob

12) John ran yesterday John PTRANS

13) While going home I saw a frog I PTRANS t

I MTRANS frog

woods

14) I heard a frog in the woods MTRANS frog

Ex :

1) Ram ate the hot cake Ram INGEST

Hot cake

2) Rajiv will go to Delhi Rajiv PTRANS

3) Arjun saw laxmi on the Arjun MTRANS

Hill with a telescope

4) Shyam qualified the exam Shyam PTRANS

In the first class

3.5 BACKWARD CHAINING ,RESOLUTION

[pic]

[pic]

3.6 RESOLUTION

Resolution :

Precisely one of winter & the winter will be true at any point if winter is true then cold must be true to guaranteed the truth of the 2nd clause.

A proof procedure that carried out in a single operation the variety of processes involved in reasoning with statements in predicate logic.

It produces proof by refutation given any two clauses A and B if there is a literal P1 in A which has a complementary literal P2 in B delete P1 & P2. From A & B and construct a disjunction the remaining clauses.

The clauses so constructed called the resolvent of A and B.

1) A : P ˅ Q ˅ R

B : - P ˅ Q ˅ R

C : - Q ˅ R

2) A : P ˅ Q ˅ R

B : - P ˅ R

C : - Q

D : - R

Example : Winter ˅ summer

Winter ˅ cold

Summer ˅ cold ( new clause

* Theorem Proving inferred using resolution from old.

Two methods

i) Start with the given axioms use the rules of inference & prove

ii) Prove that the negation of the result is not true.

Given that

a) Physician (Bhaskar) ….. (2)

b) [pic] x : Physician (x) ( knows surgery (x) ……(3)

Method 2

Knows _surgery (Bhaskar) ……(1)

Equation (3)

Physician (x) ( knows surgery (x) ……(4)

It gives substitute x = Bhaskar

- Physician (x) ( knows surgery (x) …..(6)

It contradicts assumption that was made.

3.7 SEMANTIC NETS

Semantic Networks.

Semantic Network is a structure for representing knowledge as a pattern of interconnected nodes and arcs. It is also defined as a graphical representation of knowledge.

The objects under consideration serves as nodes & the relationships with another node give the arcs.

Nodes represent

Entities, Attributes, States or Events Arcs in the network give the relationship between the nodes & Labels on the arc specify what type of relationship actually exists.

Weak slot and filler structures.

Knowledge is structural as a set of entities and their attributes.

“Knowledge poor” structures “weak”

Is a & instance relations.

A semantic Network

Question : What is the connection between the Brooklyn Dodgers & Blue.

This kind of reasonin exploits one of the important advantages that slot & filler structures have over purely logical representation because of entity based organization.

* John gave the book to Marry

John is taller than Bill

Partitioned Semantic Nets

The dog bit the mail carrier nodes d,b, & m represent a particulare dog a particular biting & a particular mail carrier. A single net with no partitioning.

Every dog has bitten a mail carrier …. (b)

x : Dog (x) ( Ǝ y : Mail carrier (y) Ʌ Bite (x,y)

Node g ( stands for the assertion given above

g ( is an instance of the special class gs of general statements about the world (i.e. those with universal quantifiers)

Every element of gs has too attributes a form which states the relation being asserted & one or more [pic] connections.

For every dog d there exist ability event b & a mail carrier m.

Every dog in town has bitten the constable.

C( lies outside existential quantifier. Thus it is not viewed as an existentially quantified variable whose value may depend on the value of d.

Every dog has bitten every mail carrier

Space S1 is included in SA

3.8 FRAMES

FRAMES :- means of representing common sense knowledge. Knowledge is organized into small packets called “Frames”. All frames of a given situation constitute the system.

A frame can be defined as a d at a structure that has slots for various objects & a collection of frames consist of expectation for a given situation.

Frame are used to represent two types of knowledge viz. declarative/factual and procedural, declarative & procedural Frames: -

A frame that merely contains description about objects is call a declarative type/factual situational frame.

| |

|Name : Computer Centre |

| |

|A/c |

|Stationary cupboard |

| |

|Computer |

|Dumb terminals |

| |

| |

| |

| |

|Printer |

| |

Name of the frame

Slots in the frame

Frames which have procedural knowledge embedded in it are called action procedure frames. The action frame has the following slots.

• Actor slot which holds information @ who is performing the activity.

• Source Slot hold information from where the action has to begin.

• Destination slot holds information about the place where action has to end.

• Task slot This generates the necessary sub frames required to perform the operation.

| |

|Name : Cleaning the ict of carburetor |

| |

|Actor |

| |

| |

|Expert |

| |

| |

| |

|Object |

| |

|  Source Destination |

|Scooter |Scooter |

| Task 1 Task 2 Task 3 |

|Remove Carburetor |Clean Nozzle |Fix Carburetor |

| |

|Name : Remove Carburetor |

| |

|Actor Object |

| |

| |

|Source Destination |

| Task 1 Task 2 Task 3 |

|Remove Carburetor |Clean Nozzle |Fix Carburetor |

Linking of procedural sub frames

3.9 SCRIPTS, ONTOLOGY.

Scripts : -

A mechanisms for representing knowledge about common sequences of events.

A script is a structure that describes a stereotyped sequence of events in a particular content consist of slots ( contains values/default values.

Components of a script

• Entry conditions – conditions before the events described in the script can occur.

• Result – conditions that will in general be true after the events described in the script have occurred.

• Props - slots representing objects that are involved in the event described in the script.

• Roles – Slots representing people who are envolved in the events described in the script.

• Track – The specific variation on a more general pattern that is represented by this particular script.

• Scenes – The actual sequences of events that occur.

Pseudo form of a restaurant script

|Script : Going to a restaurant |Scene1 : Entering the restaurant. |

| |Enters the restaurant. |

|Props : Food |scans the tables chooses the best one. |

|Tables |decides to sit there. |

|Menu |goes there. |

|Money |occupies the seat. |

|Roles : Owner | |

|Customer | |

|Waiter | |

|Cashier | |

|Entry conditions |Scene 2: Ordering the food. |

|Customer is hungry |Customer asks for menu. |

|Customer has money |Waiter brings it. |

|Owner has food. |Customer glances it. |

| |Chooses what to eat. |

| |Orders that item. |

|Results : |Scene 3 : |

|Customer is hungry. |Eating the food. |

|Owner has more money. |Waiter brings the food. |

|Customer has less money. |Customer eats it. |

|Owner has less food. | |

| | |

UNIT IV

|CO.4 |Demonstrate working knowledge of reasoning in the presence of incomplete and/or uncertain information by applying Bayesian Networks and Fuzzy |

| |Logic. |

4.1 HANDING UNCERTAIN KNOWLEDGE

Uncertainty

Definition: Uncertainty means that many of the simplifications that are possible with deductive inference are no longer valid.

Why does uncertainty arise?

✓ Agents almost never have access to the whole truth about their environment.

✓ Agents cannot find a categorical answer.

✓ Uncertainty can also arise because of incompleteness, incorrectness in agents understanding of properties of environment.

• To act rationally under uncertainty we must be able to evaluate how likely certain things are.

• With FOL a fact F is only useful if it is known to be true or false.

• But we need to be able to evaluate how likely it is that F is true.

• By weighing likelihoods of events (probabilities) we can develop mechanisms for acting rationally under uncertainty.

4.2 RATIONAL DECISIONS, BASICS OF PROBABILITY

Probabilistic reasoning

Using logic to represent and reason we canrepresent knowledge about the world with facts and rules, like the following ones:

bird(tweety).

fly(X) :- bird(X).

We can also use a theorem-prover to reason about the world and deduct new facts about the world, for e.g.,

?- fly(tweety).

Yes

However, this often does not work outside of toy domains - non-tautologous certain rules are hard to find.

A way to handle knowledge representation in real problems is to extend logic by using certainty factors.

In other words, replace

IF condition THEN fact

with

IF condition with certainty x THEN fact with certainty f(x)

Unfortunately cannot really adapt logical inference to probabilistic inference, since the latter is not context-free.

Replacing rules with conditional probabilities makes inferencing simpler.

Replace

smoking -> lung cancer

or

lotsofconditions, smoking -> lung cancer

with

P(lung cancer | smoking) = 0.6

Uncertainty is represented explicitly and quantitatively within probability theory, a formalism that has been developed over centuries.

A probabilistic model describes the world in terms of a set S of possible states - the sample space. We don’t know the true state ofthe world, so we (somehow) come up with a probability distribution over S which gives the probability of any state being the true one. The world usually described by a set of variables or attributes.

Default Reasoning & the closed world assumption.

-- uncertainty as a result of incomplete knowledge.

-- plausible default assumptions. Pat @ 20 yr old normally assume that

-- Learned that Pat has suffered from blackouts.

-- U will be forced to revise your beliefs.

Expressed as a (x) : Mb1 (x) ……. Mb2 (x)

c (x)

a(x) ( precondition wff for the conclusion wff c(x)

M is the consistency operator & the bi (x) are conditions each of which must be separately consistent with the KB for the conclusion c (x) to hold.

Symbolic Reasoning under uncertainty.

Introduction to Non monotonic Reasoning.

ABC Murder story.

Abbott, Babbit & Cabot be suspects in a murder case

- Uncertain fuzzy & often changing knowledge

- Non monotonic Reasoning :- axioms and/or the rules of inference are extended to make it possible to reason with incomplete information.

These systems preserve however the property that any given moment a statement is

either believed to be tree, to be false or not believed to be either.

- Statistical Reasoning :- Representation is allowed to have numeric measure of certainty.

• Abbot has an alibi, in the register of a respected hotel.

• Babbit has a alibi, for his brother in law testified that babbit was visiting him in Brooklyn at the time.

• Cabot pleads alibi too, claiming to have been watching a ski meet in the cat skills.

Belief

1) The Abbott did not commit the crime.

2) The Babbitt did not.

3) That Abbott or Babbitt or Cabot did.

Goodluck. Cabot documents his alibi caught by television in the sidelines at the skimees.

A new belief thrust upon us.

4) That cabot did not.

- Technique for maintaining several parallel belief spaces.

- Conventional Reasoning systems 1st order logic are designed to work with information that has three important properties.

i) It is complete with respect to the domain of interest.

ii) It is consistent.

iii) New facts can be added as they become available – monotonicity.

Nonmonotonic reasoning systems on the other hand are designed to be able to solve problems in which all of these ppts are missing.

No reason to suspect the crime then assume he didn’t

Make clear the distinction between .

• It is known that >P

• It is not known whether P.

Predicate logic 1st instance 2nd as well a system.

Any inference that depends on the lock of some piece of knowledge a non monotonic inference.

- Inferences based on lack of knowledge.

- A new assertions are made.

- Defeasible nonmonotic inference may be defeated (rendered invalid)

N.M.R. doesn’t share this ppt T ( entails trust then T combined with N also entails W.

- How can a knowledge base be updated properly.

Valid sets of justification.

Abott is in town this week & so is available to testify but if we wait until next week he may be out of town.

- Techniques for maintaining valid sets of justifications.

- How can knowledge be used to help resolve conflicts when there are several inconsistent non-monotonic inferences that could be drawn.

- Contradiction to resolve.

- Locally consistent ( globally inconsistent no options to believe all of them at once.

Statistical Reasoning

- Problem in which genuine randomness in the world – playing card.

- Likelihood of various outcomes exploit it.

- 2nd class of Problem - No randomness behaves normally – unless somekind of exception – common sense tasks : for which statistical function ( as summaries of the world.

- Numerical summary that tells us how often an exceptions of some sort can be expected to occur.

* Probability & Baye’s Theorem.

- Collect evidence &

- Modify its behavior on the basis of the evidence.

Bayesian statistics is a statistical theory of a evidence

Conditional pbt P (H/E)

Pbt of hypothesis - H given that E ( evidence observed.

For which we require

Prior pbt of H & the extent to which E provides evidence of H.

Define universe.

Then let

P (Hi/E) – pbt that hypothesis Hi is true given E

P (E /Hi) - pbt that we will observe evidence E given that hypothesis i is true.

P (Hi) - a prior pbt that hypothesis i is a true in the absence of any specific evidence.

K - no. of possible hypothesis.

Bayes theorem then states that

P (Hi/E) = (E /Hi) . P (Hi)

Σkn=1 P (E/Hn).)(Hn)

* Examining the geological evidence at a particular location to determine whether that would be a good place to dig to find a desired mineral

Minerals ( Copper, Uranium.

* Medical diagnosis problem.

S : Patient has spots.

M : Patient has measles.

F : Patient has fever.

- Conditional pbts that arises from their conjuction.

- Given a prior body of evidence e & some new observations E we need to compute

P (H/E, e) = P (H/E). P(e/E,H) joints pbts.

P(e/E)

Bayes theorem intractable for several reasons.

• Knowledge acquisition problems in surmountable.

- Too many plots – substantial empirical evidence – people are poor pbt estimation.

• Space to store pbts too large.

• Time required to compute pbts too large.

Despite these problems. B.T. attractive basis for an uncertain reasoning system.

* Certain factos & Rule based system

MYCIN ( Attemps to recommend appropriate therapies with bacterial infections.

Certain by fact ( in the rules consequent.

Rule

If

i) Stain of organism is gram positive.

ii) Morphology ( Coccus &

iii) Growth conformation ( clumps.

Action

Then identify of the organism is staphylo coccus

Q. Certainty factor – measure between 0 to 1

If 0 evidence fails to support the hypothesis measure.

Cf = evidence either supports or denies a hypothesis.

- Strong independence assumptions that make it relatively easy to use.

- Assumptions create dangers if rules are not written carefully.

S : Sprinkler was on last night

W : Grass is wet

R : it rained last night

Rules:-

If the sprinkler was on last night then there is suggestive evidence (O.S) that the grass will be wet this morning.

C.F. – 0.8 sprinkler suggest wet.

- 0.72 wet suggest rain.

Believe that it rained because we believe the sprinkler was on.

Danger whenever justification of a belief are important to determining its consequences.

Need to know – why we believe the grass is wet.

Bayesian Networks :-

CF ( as a mechanism for reducing the complexity of a Bayesian reasoning system.

B-N /ws ( preserve the formalism & rely instead on the modularity of the world.

( constraints networks

( ways of representing knowledge as sets of constraints.

2 ways propositions can influence the likelihood of each other

i) Causes influence the likelihood of their symptoms.

ii) Observing a symptom affects the likelihood of all of its possible causes.

Bayesian Network ( make a clear distinction @ two influences.

Representing causality uniformly

A graph contains an additional node corresponding to the propositional variable that tell us whether it is currently a raining season.

- More information is needed to use as probabilistic reasoning.

- Pbt tablers are provided.

- The pbt of rain on given night is 0.9

> rain is 0.1

- Need a mechanism for computing the influence of any arbitrary node on any other.

Fig conditional pbts for a Bayesian Network

Attribute pbt

P(wet/sprinkler, Rain) 0.95

P(wet/7sprinkler, Rain) 0.9

(Rain) 0.8

P(wet/7sprinkler, Rain)

(Rain) 0.1

4.3 AXIOMS OF PROBABILITY

Review of probability

Given a set U (universe), a probability function is a function defined over the subsets of U that maps each subset to the real numbers and that satisfies the Axioms of Probability

1. Pr(U) = 1

2. Pr(A) ∈[0,1]

3. Pr(A ∪B) = Pr(A) + Pr(B) –Pr(A ∩B)

Note if A ∩B = {} then Pr(A ∪B) = Pr(A) + Pr(B)

The primitives in probabilistic reasoning are random variables. Just like primitives in Propositional Logic are propositions. A random variable is not in fact a variable, but a function from a sample space S to another space, often the real numbers.

For example, let the random variable Sum (representing outcome of two die throws) be defined thus:

Sum(die1, die2) = die1 +die2

Each random variable has an associated probability distribution determined by the underlying distribution on the sample space

Continuing our example: P(Sum = 2) = 1/36, P(Sum = 3) = 2/36, . . . , P(Sum = 12) = 1/36

Consider the probabilistic model of the fictitious medical expert system mentioned before. The sample space is described by 8 binary valued variables.

Visit to Asia? A

Tuberculosis? T

Either tub. or lung cancer? E

Lung cancer? L

Smoking? S

Bronchitis? B

Dyspnoea? D

Positive X-ray? X

There are 28= 256 events in the sample space. Each event is determined by a joint instantiation of all of the variables.

S = {(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = f),

(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = t), . . .

(A = t, T = t,E = t,L = t, S = t,B = t,D = t,X = t)}

Since S is defined in terms of joint instantiations, any distribution defined on it is called a joint distribution. ll underlying distributions will be joint distributions in this module. The variables {A, T, E, L, S, B, D, X} are in fact random variables, which ‘project’ values.

L(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = f) = f

L(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = t) = f

L(A = t, T = t,E = t,L = t, S = t,B = t,D = t,X = t) = t

Each of the random variables { A, T, E, L, S, B, D, X } has its own distribution, determined by the underlying joint distribution. This is known as the margin distribution. For example, the distribution for L is denoted P(L), and this distribution is defined by the two probabilities P(L = f) and P(L = t). For example,

P(L = f) = P(A = f, T = f,E = f,L = f,S = f,B = f,D = f,X = f)

+ P(A = f, T = f,E = f,L = f,S = f,B = f,D = f,X = t)

+ P(A = f, T = f,E = f,L = f, S = f,B = f,D = t,X = f)

. . .

P (A = t, T = t,E = t,L = f, S = t,B = t,D = t,X = t)

P (L) is an example of a marginal distribution.

Here’s a joint distribution over two binary value variables A and B.

[pic]

We get the marginal distribution over B by simply adding up the different possible values of A for any value of B (and put the result in the “margin”).

[pic]

In general, given a joint distribution over a set of variables, we can get the marginal distribution over a subset by simply summing out those variables not in the subset.

In the medical expert system case, we can get the marginal distribution over, say, A, D by simply summing out the other variables:

[pic]

However, computing marginals is not an easy task always. For example,

P(A = t, D = f)

= P(A = t, T = f,E = f,L = f,S = f,B = f,D = f,X = f)

+ P(A = t, T = f,E = f,L = f,S = f,B = f,D = f,X = t)

+ P(A = t, T = f,E = f,L = f, S = f,B = t,D = f,X = f)

+ P(A = t, T = f,E = f,L = f, S = f,B = t,D = f,X = t)

. . .

P(A = t, T = t,E = t,L = t, S = t,B = t,D = f,X = t)

This has 64 summands! Each of whose value needs to be estimated from empirical data.

For the estimates to be of good quality, each of the instances that appear in the summands should appear sufficiently large number of times in the empirical data. Often such a large amount of data is not available.

However, computation can be simplified for certain special but common conditions. This is the condition of independence of variables.

Two random variables A and B are independent iff

P(A,B) = P(A)P(B)

i.e. can get the joint from the marginals

This is quite a strong statement: It means for any value x of A and any value y of B

P(A = x, B = y) = P(A = x)P(B = y)

Note that the independence of two random variables is a property of a the underlying

[pic]

probability distribution. We can have

Conditional probability is defined as:

[pic]

It means for any value x of A and any value y of B

[pic]

If A and B are independent then

[pic]

Conditional probabilities can represent causal relationships in both directions.

From cause to (probable) effects

[pic]

From effect to (probable) cause

[pic]

4.4 BAYE’S RULE AND CONDITIONAL INDEPENDENCE , BAYESIAN NETWORKS

Bayesian Networks

Representation and Syntax

Bayes nets (BN) (also referred to as Probabilistic Graphical Models and Bayesian Belief Networks) are directed acyclic graphs (DAGs) where each node represents a random variable. The intuitive meaning of an arrow from a parent to a child is that the parent directly influences the child. These influences are quantified by conditional probabilities.

BNs are graphical representations of joint distributions. The BN for the medical expert system mentioned previously represents a joint distribution over 8 binary random variables {A,T,E,L,S,B,D,X}.

[pic]

Fig.4.1 Bayesian Network for Medical Expert System

Conditional Probability Tables

Each node in a Bayesian net has an associated conditional probability table or CPT. (Assume all random variables have only a finite number of possible values). This gives the probability values for the random variable at the node conditional on values for its parents. Here is a part of one of the CPTs from the medical expert system network.

[pic]

If a node has no parents, then the CPT reduces to a table giving the marginal distribution on that random variable.

[pic]

Consider another example, in which all nodes are binary, i.e., have two possible values, which we will denote by T (true) and F (false).

[pic]

Fig.4.2: Conditional Probability Tables

We see that the event "grass is wet" (W=true) has two possible causes: either the water sprinkler is on (S=true) or it is raining (R=true). The strength of this relationship is shown in the table. For example, we see that Pr(W=true | S=true, R=false) = 0.9 (second row), and hence, Pr(W=false | S=true, R=false) = 1 - 0.9 = 0.1, since each row must sum to one. Since the C node has no parents, its CPT specifies the prior probability that it is cloudy (in this case, 0.5). (Think of C as representing the season: if it is a cloudy season, it is less likely that the sprinkler is on and more likely that the rain is on.)

Semantics of Bayesian Networks

The simplest conditional independence relationship encoded in a Bayesian network can be stated as follows: a node is independent of its ancestors given its parents, where the ancestor/parent relationship is with respect to some fixed topological ordering of the nodes. In the sprinkler example above, by the chain rule of probability, the joint probability of all the nodes in the graph above is,

P(C, S, R, W) = P(C) * P (S|C) * P(R|C,S) * P(W|C,S,R)

By using conditional independence relationships, we can rewrite this as

P(C, S, R, W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)

Where we were allowed to simplify the third term because R is independent of S given its parent C, and the last term because W is independent of C given its parents S and R. We can see that the conditional independence relationships allow us to represent the joint more compactly. Here the savings are minimal, but in general, if we had n binary nodes, the full joint would require O(2^n) space to represent, but the factored form would require O(n 2^k) space to represent, where k is the maximum fan-in of a node. And fewer parameters make learning easier.

The intuitive meaning of an arrow from a parent to a child is that the parent directly influences the child. The direction of this influence is often taken to represent casual influence. The conditional probabilities give the strength of causal influence. A 0 or 1 in a CPT represents a deterministic influence.

[pic]

Decomposing Joint distributions

A joint distribution can always be broken down into a product of conditional probabilities using repeated applications of the product rule.

[pic]

We can order the variables however we like:

[pic]

Conditional Independence in Bayes Net

A Bayes net represents the assumption that each node is conditionally independent of all its non-descendants given its parents.

So for example,

[pic]

Fig.4.3: Example for Conditional Bayes Net

Note that, a node is NOT independent of its descendants given its parents. Generally,

[pic]

Variable ordering in Bayes Net

The conditional independence assumptions expressed by a Bayes net allow a compact representation of the joint distribution. First note that the Bayes net imposes a partial order on nodes: X ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download