Knowledge Engineering
(Kỹ nghệ tri thức)
Khoat Than
Hanoi University of Science and Technology
IT4362, 2019
Contents
¡ Introduction to Knowledge Engineering
¡ Knowledge representation
¨ Production rules
¨ Frames
¡ Automatic induction and inference
Rule-based representation (1)
¡ Rules are the most prevalent type of knowledge
representation
¨ A rule provides some description of how to solve a problem.
¨ Rules are relatively easy to create and understand.
¡ A rule has the form IF A1 AND A2 AND ... AND An THEN B
¡ Ai:
¨ Are the conditions (or antecedents, premises)
¨ Match against facts which are stored in the working memory.
¡ B:
¨ The conclusion (consequence, action,…)
¨ To be added to the working memory.
Rule-based representation (2)
¡ The condition part of a rule
¨ Does not need disjunctions (OR).
¨ A rule with disjunctions in the condition part is converted to a
set of corresponding rules with no disjunctions.
¨ E.g., The rule (IF A1∨A2 THEN B) is converted to the two rules (IF
A1 THEN B) and (IF A2 THEN B).
¡ The conclusion part of a rule
¨ Does not need conjunctions (AND).
¨ A rule with conjunctions in the conclusion part is converted to a
set of corresponding rules with no conjunctions.
¨ E.g., The rule (IF … THEN B1∧B2) is converted to the two rules (IF
… THEN B1) and (IF … THEN B2).
¨ Must not contain disjunctions (OR).
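The two conversions above can be sketched in Python. This is only an illustration: the function name normalize and the list-of-alternatives encoding of a rule are assumptions, not part of the lecture.

```python
from itertools import product

def normalize(conditions, conclusions):
    """Split one rule into rules with no OR in the conditions and one conclusion each.

    conditions:  list of groups; the alternatives inside a group are OR-ed,
                 and the groups themselves are AND-ed (assumed encoding).
    conclusions: list of conclusions that are AND-ed together.
    """
    rules = []
    for choice in product(*conditions):   # pick one alternative from every group
        for conclusion in conclusions:    # emit one rule per conclusion
            rules.append((list(choice), conclusion))
    return rules

# IF A1 OR A2 THEN B  becomes  (IF A1 THEN B) and (IF A2 THEN B)
print(normalize([["A1", "A2"]], ["B"]))
# IF A THEN B1 AND B2  becomes  (IF A THEN B1) and (IF A THEN B2)
print(normalize([["A"]], ["B1", "B2"]))
```

Each returned pair is (list of AND-ed conditions, single conclusion).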
Types of rules
¡ Associative relation
¨ IF addressAt(x, Hospital) THEN healthIs(x, Bad)
¡ Causal relation
¨ IF diseaseType(x, Infection) THEN tempIs(x, High)
¡ Situation and action (or recommendation)
¨ IF diseaseType(x, Infection) THEN takeMedicine(x, Antibiotic)
¡ Logical relation
¨ IF tempGreater(x, 37) THEN isFever(x)
AND-OR graph (1)
¡ IF (Shape=long) AND (Color=(green OR yellow)) THEN (Fruit=banana)
¡ IF (Shape=(round OR oblong)) AND (Diam > 4) THEN (Fruitclass=vine)
¡ IF (Fruitclass=vine) AND (Color=green) THEN (Fruit=watermelon)
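[Figure: AND-OR graph of these rules. Shape=long AND (Color=green OR Color=yellow) leads to Fruit=banana; (Shape=round OR Shape=oblong) AND Diam > 4 leads to Fruitclass=vine; Fruitclass=vine AND Color=green leads to Fruit=watermelon.]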
AND-OR graph (2)
¡ Rule “IF (Shape=long) AND (Color=(green OR yellow)) THEN
(Fruit=banana)” is equivalent to the two rules:
¨ IF (Shape=long) AND (Color=green) THEN (Fruit=banana)
¨ IF (Shape=long) AND (Color=yellow) THEN (Fruit=banana)
¡ Rule “IF (Shape=(round OR oblong)) AND (Diam > 4) THEN
(Fruitclass=vine)” is equivalent to the two rules:
¨ IF (Shape=round) AND (Diam > 4) THEN (Fruitclass=vine)
¨ IF (Shape=oblong) AND (Diam > 4) THEN (Fruitclass=vine)
Problems with rules
¡ Infinite loops (circular rules).
¨ If A then A,
¨ {If A then B, if B then C, if C then A}
¡ Inconsistent rules (rules contain contradictions).
¨ {If A then B, if B then C, if A and D then ¬C}
¡ Unreachable conclusions.
¡ Difficult to modify/update the knowledge base.
¡ Expensive to update the knowledge base.
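Circular rule sets such as {If A then B, if B then C, if C then A} can be found by searching for a cycle in the graph that links each condition to the conclusion it supports. The sketch below is one possible check, not a method prescribed by the lecture; the name find_cycle and the (conditions, conclusion) encoding are assumptions. A reported cycle only signals a potential loop and still needs human review.

```python
def find_cycle(rules):
    """Return a list of facts forming a circular chain of rules, or None."""
    # Build a dependency graph: an edge from every condition to the conclusion.
    graph = {}
    for conditions, conclusion in rules:
        for cond in conditions:
            graph.setdefault(cond, set()).add(conclusion)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = colour.get(nxt, WHITE)
            if state == GREY:                      # back edge: a cycle is closed
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(graph):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start, [])
            if cycle:
                return cycle
    return None

print(find_cycle([(["A"], "B"), (["B"], "C"), (["C"], "A")]))  # ['A', 'B', 'C', 'A']
print(find_cycle([(["A"], "B"), (["B"], "C")]))                # None
```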
Inference from rules
¡ Pattern matching
¨ To check whether or not a rule is applicable.
¨ E.g., If the knowledge base contains the set of rules {IF A1 THEN
B1, IF A1 AND A2 THEN B2, IF A2 AND A3 THEN B3} and the facts
(stored in the working memory) consist of A1 and A2, then the
first two rules are applicable.
¡ Chaining
¨ To associate (couple) the rules.
¨ Given a set of rules and a set of facts (premises), which of them
should be used, and in which order, to derive (reason) some
conclusion?
¨ Two strategies of chaining: forward vs. backward
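A minimal sketch of pattern matching and of both chaining strategies, using the rule set from the example above. The function names and the (conditions, conclusion) encoding are assumptions made for illustration; facts are plain strings, without variables.

```python
def applicable(rules, facts):
    """Pattern matching: rules whose conditions are all in the working memory."""
    return [rule for rule in rules if set(rule[0]) <= facts]

def forward_chain(rules, facts):
    """Data-driven: keep firing applicable rules until no new fact is added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for _conditions, conclusion in applicable(rules, facts):
            if conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(rules, facts, goal, seen=frozenset()):
    """Goal-driven: try to prove the goal from rules that conclude it."""
    if goal in facts:
        return True
    if goal in seen:              # guard against circular rules
        return False
    return any(
        all(backward_chain(rules, facts, c, seen | {goal}) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )

rules = [(["A1"], "B1"), (["A1", "A2"], "B2"), (["A2", "A3"], "B3")]
facts = {"A1", "A2"}
print(applicable(rules, facts))             # the first two rules
print(sorted(forward_chain(rules, facts)))  # ['A1', 'A2', 'B1', 'B2']
print(backward_chain(rules, facts, "B2"))   # True
print(backward_chain(rules, facts, "B3"))   # False
```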
Conflict resolution
¡ A conflict occurs when more than one rule matches the facts
(contents) in the working memory.
¨ Note that a conflict is different from a contradiction
(inconsistency).
¡ A conflict resolution strategy (CRS) is needed to decide, in
case of a conflict, which rule is to fire (i.e., to be applied).
¡ An appropriate choice of CRS can make a significant
improvement to the system’s performance.
Conflict resolution strategy
¡ Select the first applicable rule (i.e., the one that comes first in
the rule base).
¡ Don’t select rules that duplicate existing results (i.e., rules
whose application results in existing facts).
¡ Select the most specific rule (i.e., the one with the most
conditions attached).
¡ Prefer rules that match the most recent facts.
¡ Don’t allow a rule to fire twice on the same facts.
¡ Select the rule with the highest certainty (in an uncertain
rule base).
¡ A combination of the above strategies.
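Some of these strategies can be sketched as small selection functions over the conflict set. Everything here is illustrative: the (conditions, conclusion, certainty) encoding, the function names, and the certainty values are assumptions, not taken from the lecture.

```python
def conflict_set(rules, facts):
    """Indices of rules that match the facts and would add a new fact."""
    return [
        index
        for index, (conditions, conclusion, _certainty) in enumerate(rules)
        if set(conditions) <= facts and conclusion not in facts
    ]

def select_first(rules, candidates):
    """The rule that comes first in the rule base."""
    return candidates[0]

def select_most_specific(rules, candidates):
    """The rule with the most conditions (ties go to the earlier rule)."""
    return max(candidates, key=lambda i: len(rules[i][0]))

def select_most_certain(rules, candidates):
    """The rule with the highest certainty (ties go to the earlier rule)."""
    return max(candidates, key=lambda i: rules[i][2])

rules = [
    (["A1"], "B1", 0.9),          # certainty values are made up
    (["A1", "A2"], "B2", 0.7),
    (["A2", "A3"], "B3", 0.8),
]
facts = {"A1", "A2"}
candidates = conflict_set(rules, facts)
print(candidates)                               # [0, 1]
print(select_first(rules, candidates))          # 0
print(select_most_specific(rules, candidates))  # 1
print(select_most_certain(rules, candidates))   # 0
```

The same conflict set yields different winners under different strategies, which is why the choice of strategy affects the system's behaviour.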
Rule-based systems (1)
¡ Architecture
[Figure: architecture of a rule-based system. Components: working memory, rule memory, interpreter. Input: observed data. Arrow labels: select, update, fire, output.]
(http://www.cwa.mdx.ac.uk/bis2040/johnlect.html)
Rule-based systems (2)
¡ Working memory
¨ Holds facts (data).
¨ Their presence/absence causes the interpreter to trigger certain rules.
¡ Rule memory
¨ The knowledge base which holds rules.
¡ Interpreter
¨ The system is started by putting a suitable data item into the
working memory.
¨ When data in the working memory matches the conditions of
one of the rules in the rule memory, the rule fires (i.e., is brought
into action).
Rule-based systems: advantages
¡ Notational convenience
¨ Very close to expressions in a natural language.
¨ It is easy to express suitable pieces of knowledge by rules.
¡ Easy to understand
¨ IF-THEN rules are very easy (probably the easiest) for humans to
understand.
¨ Easy for experts in the system's domain to criticize and
improve.
Rule-based systems: disadvantages
¡ Restricted representation
¨ In many practical problems, useful pieces of knowledge don’t
fit the (IF-THEN) pattern.
¡ The interaction and the order of rules in a rule base may
cause some unexpected effects
¨ In the design and maintenance of a rule base, each (new) rule
cannot be considered in isolation.
¨ It is very hard and costly to consider all possible rule
interactions.
Frame-based representation (1)
¡ How to represent the knowledge of “The bus is yellow”?
¨ Difficult to use rules.
¡ Solution 1: Yellow(bus)
¨ The question “What is yellow?” can be answered
¨ The question “What is the color of the bus?” cannot be
answered
¡ Solution 2: Color(bus, yellow)
¨ Can answer “What is yellow?” and “What is the color of the
bus?”
¨ But cannot answer “Which property of the bus has value
yellow?”
¡ Solution 3: Prop(bus, color, yellow)
¨ Can answer all three questions.
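A small sketch of why Solution 3 answers all three questions: with (object, property, value) triples, any of the three positions can be left open in a query. The query function and the extra taxi and wheels facts are invented for illustration.

```python
# Each fact is a Prop(object, property, value) triple.
facts = [
    ("bus", "color", "yellow"),
    ("bus", "wheels", 6),          # made-up extra fact
    ("taxi", "color", "yellow"),   # made-up extra fact
]

def query(facts, obj=None, prop=None, value=None):
    """Return all triples that agree with the positions that are given."""
    return [
        (o, p, v)
        for (o, p, v) in facts
        if (obj is None or o == obj)
        and (prop is None or p == prop)
        and (value is None or v == value)
    ]

# What is yellow?
print([o for o, _, _ in query(facts, prop="color", value="yellow")])  # ['bus', 'taxi']
# What is the color of the bus?
print([v for _, _, v in query(facts, obj="bus", prop="color")])       # ['yellow']
# Which property of the bus has value yellow?
print([p for _, p, _ in query(facts, obj="bus", value="yellow")])     # ['color']
```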
Frame-based representation (2)
¡ Representation of an object: Prop(Object, Property, Value)
¨ Known as object-property-value representation.
¡ If we merge many properties of the same object into one
structure, we get the object-centered representation.
Prop(Object, Property1, Value1)
Prop(Object, Property2, Value2)
…
Prop(Object, Propertyn, Valuen)
are merged into:
Object
  Property1
  Property2
  ...
  Propertyn
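The merge into an object-centered structure can be sketched as grouping triples by their object. The function name to_objects and the sample values are assumptions for illustration.

```python
from collections import defaultdict

def to_objects(triples):
    """Group Prop(object, property, value) triples into one record per object."""
    objects = defaultdict(dict)
    for obj, prop, value in triples:
        objects[obj][prop] = value
    return dict(objects)

triples = [
    ("bus", "color", "yellow"),
    ("bus", "wheels", 6),        # made-up value
    ("desk", "color", "brown"),  # made-up value
]
print(to_objects(triples))
# {'bus': {'color': 'yellow', 'wheels': 6}, 'desk': {'color': 'brown'}}
```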
Object-centered representation
¡ The object-property-value representation is a natural way to
represent knowledge about objects.
¡ Physical objects
¨ A desk has a surface-material, number of drawers, width,
length, height, color, etc.
¡ Situations:
¨ A class has a room number, participants, teacher, day, time,
etc.
¨ A trip has a departure, destination, transportation means,
accommodations, etc.
Frame
¡ Two types of frames: individual and generic
¡ Individual frames. To represent a single object like a person,
a trip, etc.
¡ Generic frames. To represent categories of objects, like
students, trips, etc.
¡ Example
¨ A generic frame: European_City
¨ An individual frame: City_Paris
Representation of a frame
¡ A frame is a named list of attributes called slots.
¡ What goes in the attribute is called a filler of the slot.
(frame-name
<slot-name1 filler1>
<slot-name2 filler2>
…
)
Individual frames
¡ An individual frame has a special slot called INSTANCE-OF
whose filler is the name of a generic frame.
¡ Example
(toronto % lowercase for individual frames
<:INSTANCE-OF CanadianCity>
<:Province ontario>
<:Population 4.5M>
…)
Generic Frame
¡ A generic frame may have an IS-A slot whose filler is the
name of another (more general) generic frame.
¡ Example
(CanadianCity % uppercase for generic frames
<:IS-A City>
<:Province CanadianProvince>
<:Country canada>
…)
Frame: inference control (1)
¡ Slots in generic frames can have associated procedures
that are executed and ‘control’ inference
¡ Two types of procedures: IF-NEEDED and IF-ADDED
¡ IF-NEEDED procedure
¨ Executes when no slot filler is given and the value is needed
¨ E.g.,
(Table
<:Clearance [IF-NEEDED computeClearance]>
…)
computeClearance is a procedure to calculate the clearance
of the table.
Frame: inference control (2)
¡ IF-ADDED procedure
¨ If a slot filler is given its effect may propagate to other frames
(e.g., to assure constraints)
¨ E.g.,
(Lecture
<:DayOfWeek WeekDay>
<:Date [IF-ADDED computeDayOfWeek]>
...)
The filler for the slot :DayOfWeek will be calculated when the slot
:Date is filled
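Both kinds of procedure can be sketched with a small Frame class. This is a simplified illustration: the class, its methods, and the clearance formula are assumptions, and the procedures are attached directly to a frame instead of being inherited from a generic frame.

```python
import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

class Frame:
    def __init__(self, name):
        self.name = name
        self.fillers = {}     # slot name -> filler
        self.if_needed = {}   # slot name -> procedure run when the filler is missing
        self.if_added = {}    # slot name -> procedure run after the slot is filled

    def get(self, slot):
        if slot in self.fillers:
            return self.fillers[slot]
        if slot in self.if_needed:            # no filler given: compute on demand
            return self.if_needed[slot](self)
        return None

    def put(self, slot, value):
        self.fillers[slot] = value
        if slot in self.if_added:             # propagate the effect of the new filler
            self.if_added[slot](self)

def compute_clearance(frame):
    # Hypothetical formula: height of the table minus thickness of its top.
    return frame.get(":Height") - frame.get(":TopThickness")

def compute_day_of_week(frame):
    date = frame.get(":Date")
    frame.fillers[":DayOfWeek"] = DAY_NAMES[date.weekday()]

table = Frame("table1")
table.if_needed[":Clearance"] = compute_clearance
table.put(":Height", 75)
table.put(":TopThickness", 3)
print(table.get(":Clearance"))      # 72

lecture = Frame("lecture1")
lecture.if_added[":Date"] = compute_day_of_week
lecture.put(":Date", datetime.date(2019, 9, 2))
print(lecture.get(":DayOfWeek"))    # Monday
```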
Frames: inheritance (1)
¡ Procedures and fillers of a more general frame are inherited
by a more specific frame through the inheritance
mechanism
¡ Example
(CoffeeTable
<:IS-A Table>
...)
(MahoganyCoffeeTable
<:IS-A CoffeeTable>
...)
Frames: inheritance (2)
¡ Example:
(Elephant
<:IS-A Mammal>
<:Colour gray>
...)
(RoyalElephant
<:IS-A Elephant>
<:Colour white>
…)
(clyde
<:INSTANCE-OF RoyalElephant>
…)
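In this example the more specific filler overrides the inherited one: clyde is white, not gray. A minimal sketch of such a lookup, assuming frames are stored as dictionaries and that the nearest frame along the INSTANCE-OF / IS-A chain wins (the lookup function is hypothetical):

```python
frames = {
    "Elephant": {":IS-A": "Mammal", ":Colour": "gray"},
    "RoyalElephant": {":IS-A": "Elephant", ":Colour": "white"},
    "clyde": {":INSTANCE-OF": "RoyalElephant"},
}

def lookup(frames, name, slot):
    """Walk up the INSTANCE-OF / IS-A chain until some frame fills the slot."""
    while name is not None:
        frame = frames.get(name, {})   # frames not defined here (e.g. Mammal) are empty
        if slot in frame:
            return frame[slot]
        name = frame.get(":INSTANCE-OF") or frame.get(":IS-A")
    return None

print(lookup(frames, "clyde", ":Colour"))     # white
print(lookup(frames, "Elephant", ":Colour"))  # gray
print(lookup(frames, "clyde", ":Weight"))     # None
```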
Frames: reasoning
¡ Basic reasoning goes like this
¨ The user instantiates a frame, i.e., declares that an object or
situation exists.
¨ Slot fillers are inherited where possible.
¨ Inherited IF-ADDED procedures are run, which causes more
frames to be instantiated and slots to be filled.
¡ If the user or any procedure requires the filler of a slot, then:
¨ If there is a filler, it is used.
¨ Otherwise, an inherited IF-NEEDED procedure is run, which
potentially causes additional actions.
Frame-based representation: Advantages
¡ Combine procedural and declarative knowledge using
one knowledge representation scheme.
¡ Frames can be structured hierarchically, which allows easy
classification of knowledge.
¡ Reduce the complexity of knowledge base construction by
allowing a hierarchy of frames to be built up.
¡ Allow constraints on slot values (e.g., values must lie
within a specific range).
¡ Allow default values to be stored (i.e., through the IS-A slot,
the information in a more generic frame is used to
automatically fill slots in a more specific frame).
Frame-based representation: disadvantages
¡ Require (much) attention in the design stage to ensure that
suitable taxonomies, i.e., agreed structures for the
terminology, are created for the system.
¡ Can lead to ‘procedural fever’, that is, the apparent
need to focus on writing appropriate procedures
rather than checking the overall structure and content of
the frames.
¡ Can be inefficient at runtime because frames do not
provide the most efficient method to store data in a
computer.