AI with Python for Beginners

Artificial Intelligence with Python

Topics:
• Search
• Knowledge
• Uncertainty
• Optimization
• Learning
• Neural Networks
• Language
Search
[The 15-puzzle: a 4×4 grid of sliding tiles numbered 1-15 with one blank.]
Search Problems
agent
entity that perceives its environment
and acts upon that environment
state
a configuration of the agent and
its environment
[Example states: various scrambled 15-puzzle boards.]
initial state
the state in which the agent begins
[e.g. the particular scrambled 15-puzzle board the agent starts from]
actions
choices that can be made in a state
actions
ACTIONS(s) returns the set of actions that
can be executed in state s
[e.g. in the 15-puzzle, any tile adjacent to the blank can slide into it, giving up to four possible actions]
transition model
a description of what state results from
performing any applicable action in any
state
transition model
RESULT(s, a) returns the state resulting from
performing action a in state s
[e.g. RESULT applied to a 15-puzzle board and a tile slide returns the board with that tile moved into the blank]
state space
the set of all states reachable from the
initial state by any sequence of actions
[Diagram: all boards reachable from the initial board, connected by single slides, form the state space.]
goal test
way to determine whether a given state
is a goal state
path cost
numerical cost associated with a given path
[Graphs of nodes A through M: one with varying edge costs (4, 2, 5, ...), one with every edge costing 1.]
Search Problems
• initial state
• actions
• transition model
• goal test
• path cost function
solution
a sequence of actions that leads from the
initial state to a goal state
optimal solution
a solution that has the lowest path cost
among all solutions
node
a data structure that keeps track of
- a state
- a parent (node that generated this node)
- an action (action applied to parent to get node)
- a path cost (from initial state to node)
Approach
• Start with a frontier that contains the initial state.
• Repeat:
• If the frontier is empty, then no solution.
• Remove a node from the frontier.
• If node contains goal state, return the solution.
• Expand node, add resulting nodes to the frontier.
Find a path from A to E.
[Walkthrough: the frontier starts as {A}; removing and expanding nodes grows it to B, then C and D, and so on, until E is removed and the solution returned.]

What could go wrong?
[Expanding B can put A back onto the frontier, since A and B connect to each other; without remembering where it has been, the search can shuttle between A and B forever.]
Revised Approach
• Start with a frontier that contains the initial state.
• Start with an empty explored set.
• Repeat:
• If the frontier is empty, then no solution.
• Remove a node from the frontier.
• If node contains goal state, return the solution.
• Add the node to the explored set.
• Expand node, add resulting nodes to the frontier if they
aren't already in the frontier or the explored set.
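In Python, this revised approach fits in one short function. A minimal sketch, assuming hashable states and a neighbors(state) helper that yields (action, next_state) pairs (both names are illustrative, not the course's distribution code):

def search(initial_state, goal_state, neighbors):
    """Return a list of actions from initial_state to goal_state, or None."""
    # Each frontier entry is (state, actions taken so far).
    frontier = [(initial_state, [])]
    explored = set()
    while frontier:
        # Removing from the end makes the frontier a stack (depth-first);
        # frontier.pop(0) would make it a queue (breadth-first).
        state, actions = frontier.pop()
        if state == goal_state:
            return actions
        explored.add(state)
        for action, next_state in neighbors(state):
            in_frontier = any(s == next_state for s, _ in frontier)
            if next_state not in explored and not in_frontier:
                frontier.append((next_state, actions + [action]))
    return None  # frontier empty: no solution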
stack
last-in first-out data type
Find a path from A to E.
[Walkthrough with a stack frontier and an explored set: the search dives A, B, D, F, backtracks to C, and finally reaches E; the explored set grows A, B, D, F, C.]
Depth-First Search
depth-first search
search algorithm that always expands the
deepest node in the frontier
Breadth-First Search
breadth-first search
search algorithm that always expands the
shallowest node in the frontier
queue
first-in first-out data type
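The only difference between the two searches is which node the frontier hands back. A sketch of frontier classes along the lines of the course's source code (the exact class layout here is an assumption):

class Node:
    def __init__(self, state, parent, action):
        self.state = state
        self.parent = parent    # node that generated this node
        self.action = action    # action applied to parent to get node

class StackFrontier:
    def __init__(self):
        self.frontier = []

    def add(self, node):
        self.frontier.append(node)

    def empty(self):
        return len(self.frontier) == 0

    def remove(self):
        # Last in, first out: yields depth-first search.
        return self.frontier.pop()

class QueueFrontier(StackFrontier):
    def remove(self):
        # First in, first out: yields breadth-first search.
        return self.frontier.pop(0)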
Find a path from A to E.
[Walkthrough with a queue frontier: nodes are explored in the order A, B, C, D; expanding C and D puts E and F on the frontier, and E is then removed as the goal.]
Depth-First Search
[Maze animation: DFS commits to one corridor at a time, backtracking only at dead ends, and may settle on a long path.]
Breadth-First Search
[Maze animation: BFS fans out from the start in all directions at once, finding a shortest path but exploring many more states.]
uninformed search
search strategy that uses no problem-
specific knowledge
informed search
search strategy that uses problem-specific
knowledge to find solutions more efficiently
greedy best-first search
search algorithm that expands the node
that is closest to the goal, as estimated by a
heuristic function h(n)
Heuristic function? Manhattan distance: ignoring walls, count how many squares up or down plus left or right a cell is from the goal.
Greedy Best-First Search
[Maze walkthrough: every cell is labeled with its Manhattan distance h(n) to the goal B; the search repeatedly expands the frontier cell with the smallest h(n). In a second maze, following h(n) greedily leads down a path that turns out not to be optimal.]
A* search
search algorithm that expands node with
lowest value of g(n) + h(n), where g(n) is the cost to reach the node and h(n) is the estimated cost to the goal
A* Search
[Maze walkthrough: each expanded cell is labeled g(n) + h(n), e.g. 1+16, 2+15, ...; A* abandons a route once its combined cost exceeds another option's, and settles on the optimal path to B even where greedy best-first search did not.]
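A compact A* sketch for such a grid maze, assuming 4-directional movement, unit step costs, the Manhattan-distance heuristic, and a walls set of blocked cells (all names are illustrative):

import heapq

def manhattan(cell, goal):
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

def a_star(start, goal, walls, width, height):
    # Priority queue ordered by g(n) + h(n).
    frontier = [(manhattan(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if (0 <= nxt[0] < height and 0 <= nxt[1] < width
                    and nxt not in walls
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + manhattan(nxt, goal),
                                          g + 1, nxt, path + [nxt]))
    return None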
Minimax
Possible game values: -1 (O wins), 0 (draw), 1 (X wins).

• PLAYER(s): returns which player moves in state s, e.g. PLAYER(empty board) = X
• ACTIONS(s): returns the set of legal moves in state s
• RESULT(s, a): returns the state after action a is taken in state s
• TERMINAL(s): checks whether state s is a final state
• UTILITY(s): final numerical value of terminal state s, e.g. 1 where X has won, -1 where O has won
[Tic-tac-toe board examples omitted]
[Worked example: minimax values propagated up a tic-tac-toe game tree, each MAX node taking the largest of its children's values and each MIN node the smallest.]
[Abstract example: leaves 5, 3, 9 and 2, 8; the MAX nodes above them take values 9 and 8, and the MIN node at the root takes 8.]
Minimax
• Given a state s:
• MAX picks action a in ACTIONS(s) that produces
highest value of MIN-VALUE(RESULT(s, a))
• MIN picks action a in ACTIONS(s) that produces
smallest value of MAX-VALUE(RESULT(s, a))
Minimax
function MAX-VALUE(state):
if TERMINAL(state):
return UTILITY(state)
v = -∞
for action in ACTIONS(state):
v = MAX(v, MIN-VALUE(RESULT(state, action)))
return v
Minimax
function MIN-VALUE(state):
if TERMINAL(state):
return UTILITY(state)
v = ∞
for action in ACTIONS(state):
v = MIN(v, MAX-VALUE(RESULT(state, action)))
return v
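The pseudocode translates almost directly into Python; terminal, utility, actions, and result are the game-specific helpers assumed here:

import math

def max_value(state):
    if terminal(state):
        return utility(state)
    v = -math.inf
    for action in actions(state):
        v = max(v, min_value(result(state, action)))
    return v

def min_value(state):
    if terminal(state):
        return utility(state)
    v = math.inf
    for action in actions(state):
        v = min(v, max_value(result(state, action)))
    return v

def minimax(state):
    # The MAX player picks the action leading to the highest MIN-VALUE.
    return max(actions(state), key=lambda a: min_value(result(state, a)))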
Optimizations
[Example game tree: a MAX root over MIN nodes with leaves 4 8 5, 9 3 7, 2 4 6, giving MIN values 4, 3, 2 and root value 4. With alpha-beta pruning, once a MIN node's value is known to be ≤3 or ≤2, it cannot beat the 4 already found, so its remaining leaves need not be explored.]
Alpha-Beta Pruning
255,168
total possible Tic-Tac-Toe games
288,000,000,000
total possible chess games
after four moves each
10^29000
total possible chess games
(lower bound)
Depth-Limited Minimax
evaluation function
function that estimates the expected utility
of the game from a given state
https://xkcd.com/832/
Search

Introduction to
Artificial Intelligence
with Python
Knowledge
knowledge-based agents
agents that reason by operating on
internal representations of knowledge
If it didn't rain, Harry visited Hagrid today.
It rained today.
Logic
sentence
an assertion about the world
in a knowledge representation language
Propositional Logic
Proposition Symbols
P Q R
Logical Connectives
¬ ∧ ∨
not and or
→ ↔
implication biconditional
Not (¬)
P     | ¬P
false | true
true  | false

And (∧)
P     Q     | P ∧ Q
false false | false
false true  | false
true  false | false
true  true  | true

Or (∨)
P     Q     | P ∨ Q
false false | false
false true  | true
true  false | true
true  true  | true

Implication (→)
P     Q     | P → Q
false false | true
false true  | true
true  false | false
true  true  | true

Biconditional (↔)
P     Q     | P ↔ Q
false false | true
false true  | false
true  false | false
true  true  | true
model
assignment of a truth value to every proposition, e.g. {P = true, Q = false}
knowledge base
a set of sentences known by a
knowledge-based agent
Entailment
α ⊨ β: in every model in which sentence α is true, sentence β is also true
inference
the process of deriving new sentences
from old ones
P: It is a Tuesday.
Q: It is raining.
R: Harry will go for a run.
KB: (P ∧ ¬Q) → R, P, ¬Q
Inference: R
Inference Algorithms
Does KB ⊨ α?

Model Checking
• To determine if KB ⊨ α:
• Enumerate all possible models.
• If in every model where KB is true, α is true, then
KB entails α.
• Otherwise, KB does not entail α.
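A sketch of this enumeration, representing each sentence as a Python function from a model (a dict of symbol to bool) to a truth value; a full logic library would build these from connective classes instead:

from itertools import product

def model_check(knowledge, query, symbols):
    """Return True if knowledge entails query."""
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        # The query must hold in every model where the KB holds.
        if knowledge(model) and not query(model):
            return False
    return True

# KB: (P and not Q) implies R, together with P and not Q.
kb = lambda m: (not (m["P"] and not m["Q"]) or m["R"]) and m["P"] and not m["Q"]
print(model_check(kb, lambda m: m["R"], ["P", "Q", "R"]))  # True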
P: It is a Tuesday. Q: It is raining. R: Harry will go for a run.
KB: (P ∧ ¬Q) → R, P, ¬Q
Query: R

P     Q     R     | KB
false false false | false
false false true  | false
false true  false | false
false true  true  | false
true  false false | false
true  false true  | true
true  true  false | false
true  true  true  | false

In the only model where the KB is true (P = true, Q = false, R = true), R is also true, so KB ⊨ R.
Knowledge Engineering
Clue

People: Col. Mustard, Prof. Plum, ...
Rooms: Ballroom, ...
Weapons: Knife, ...

Propositional Symbols: one per person, room, and weapon

¬plum
¬mustard ∨ ¬library ∨ ¬revolver
Logic Puzzles
(MinervaRavenclaw → ¬GilderoyRavenclaw)
(GilderoyGryffindor ∨ GilderoyRavenclaw)
Mastermind
Inference Rules
Modus Ponens
If it is raining, then Harry is inside.
It is raining.
Harry is inside.
Modus Ponens
α → β
α
β
And Elimination
α ∧ β
α
Double Negation Elimination
¬(¬α)
α
Implication Elimination
α → β
¬α ∨ β
Biconditional Elimination
α ↔ β
(α → β) ∧ (β → α)
De Morgan's Law
¬(α ∧ β)
¬α ∨ ¬β
De Morgan's Law
¬(α ∨ β)
¬α ∧ ¬β
Distributive Property
(α ∧ (β ∨ γ))
(α ∧ β) ∨ (α ∧ γ)
Distributive Property
(α ∨ (β ∧ γ))
(α ∨ β) ∧ (α ∨ γ)
Search Problems
• initial state
• actions
• transition model
• goal test
• path cost function
Theorem Proving

Resolution:
P ∨ Q
¬P
∴ Q

e.g. from (Ron is in the Great Hall) ∨ (Hermione is in the library) and ¬(Ron is in the Great Hall), conclude (Hermione is in the library).

Generalized:
P ∨ Q1 ∨ Q2 ∨ ... ∨ Qn
¬P
∴ Q1 ∨ Q2 ∨ ... ∨ Qn

P ∨ Q
¬P ∨ R
∴ Q ∨ R

P ∨ Q1 ∨ Q2 ∨ ... ∨ Qn
¬P ∨ R1 ∨ R2 ∨ ... ∨ Rm
∴ Q1 ∨ Q2 ∨ ... ∨ Qn ∨ R1 ∨ R2 ∨ ... ∨ Rm
clause
a disjunction of literals
e.g. P ∨ Q ∨ R
conjunctive normal form
logical sentence that is a conjunction of
clauses
e.g. (A ∨ B ∨ C) ∧ (D ∨ ¬E) ∧ (F ∨ G)
Conversion to CNF
• Eliminate biconditionals
• turn (α ↔ β) into (α → β) ∧ (β → α)
• Eliminate implications
• turn (α → β) into ¬α ∨ β
• Move ¬ inwards using De Morgan's Laws
• e.g. turn ¬(α ∧ β) into ¬α ∨ ¬β
• Use distributive law to distribute ∨ wherever possible
Conversion to CNF
(P ∨ Q) → R
¬(P ∨ Q) ∨ R          eliminate implication
(¬P ∧ ¬Q) ∨ R         De Morgan's Law
(¬P ∨ R) ∧ (¬Q ∨ R)   distributive law

Resolving P ∨ Q ∨ S with ¬P ∨ R ∨ S:
(Q ∨ S ∨ R ∨ S), which simplifies to (Q ∨ R ∨ S) after removing the duplicate S

Resolving P with ¬P:
(), the empty clause, which is equivalent to False
Inference by Resolution
• To determine if KB ⊨ α:
• Check if (KB ∧ ¬α) is a contradiction.
• If so, then KB ⊨ α.
• Otherwise, no entailment.
Inference by Resolution
• To determine if KB ⊨ α:
• Convert (KB ∧ ¬α) to Conjunctive Normal Form.
• Keep checking to see if we can use resolution to
produce a new clause.
• If ever we produce the empty clause (equivalent
to False), we have a contradiction, and KB ⊨ α.
• Otherwise, if we can't add new clauses, no
entailment.
Inference by Resolution
Does (A ∨ B) ∧ (¬B ∨ C) ∧ (¬C) entail A?
Add ¬A; resolve ¬B ∨ C with ¬C to get ¬B; resolve A ∨ B with ¬B to get A; resolve A with ¬A to get the empty clause. That is a contradiction, so the KB entails A.
First-Order Logic
BelongsTo(Minerva, Gryffindor)
Minerva belongs to Gryffindor.
Universal Quantification

Uncertainty

P(ω): probability of a possible world ω
0 ≤ P(ω) ≤ 1
∑_{ω ∈ Ω} P(ω) = 1
Each face of a fair die has probability 1/6:
P(die shows a given face) = 1/6
Sums of two dice:
2 3 4 5 6 7
3 4 5 6 7 8
4 5 6 7 8 9
5 6 7 8 9 10
6 7 8 9 10 11
7 8 9 10 11 12

P(sum to 12) = 1/36
P(sum to 7) = 6/36 = 1/6
unconditional probability
degree of belief in a proposition
in the absence of any other evidence
conditional probability
degree of belief in a proposition
given some evidence that has already
been revealed
conditional probability
P(a | b)
P(rain today | rain yesterday)
P(route change | traffic conditions)
P(disease | test results)
P(a | b) = P(a ∧ b) / P(b)
P(sum 12 | first die is 6)
P(first die is 6) = 1/6
P(sum 12) = 1/36
P(sum 12 | first die is 6) = (1/36) / (1/6) = 1/6
P(a | b) = P(a ∧ b) / P(b)
P(a ∧ b) = P(b) P(a | b)
P(a ∧ b) = P(a) P(b | a)
random variable
a variable in probability theory with a
domain of possible values it can take on
random variable
Roll
{1, 2, 3, 4, 5, 6}
random variable
Weather
Bayes' Rule
P(b | a) = P(a | b) P(b) / P(a)
Example: given clouds in the morning (AM), what is the probability of rain in the afternoon (PM)?

P(rain | clouds) = P(clouds | rain) P(rain) / P(clouds) = (.8)(.1) / .4 = 0.2

Knowing P(cloudy morning | rainy afternoon), we can calculate P(rainy afternoon | cloudy morning). More generally, knowing P(visible effect | unknown cause), we can calculate P(unknown cause | visible effect).
Joint Probability
            R = rain   R = ¬rain
C = cloud     0.08       0.32
C = ¬cloud    0.02       0.58

P(C | rain) = P(C, rain) / P(rain) = α P(C, rain) = α ⟨0.08, 0.02⟩ = ⟨0.8, 0.2⟩
Probability Rules
Negation
P( ¬a) = 1 − P(a)
Inclusion-Exclusion
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

Marginalization
P(X = xi) = ∑j P(X = xi, Y = yj)
            R = rain   R = ¬rain
C = cloud     0.08       0.32
C = ¬cloud    0.02       0.58

P(C = cloud)
= P(C = cloud, R = rain) + P(C = cloud, R = ¬rain)
= 0.08 + 0.32 = 0.40
Conditioning
P(X = xi) = ∑j P(X = xi | Y = yj) P(Y = yj)
Bayesian Networks
Bayesian network
data structure that represents the
dependencies among random variables
Bayesian network
• directed graph
• each node represents a random variable
• arrow from X to Y means X is a parent of Y
• each node X has probability distribution
P(X | Parents(X))
Rain
{none, light, heavy}
Maintenance
{yes, no}
Train
{on time, delayed}
Appointment
{attend, miss}
Rain {none, light, heavy}:
none  light  heavy
0.7   0.2    0.1

Maintenance {yes, no}: distribution P(M | Rain)

Train {on time, delayed}: distribution P(T | Rain, Maintenance)
R      M    | on time  delayed
none   yes  | 0.8      0.2
none   no   | 0.9      0.1
light  yes  | 0.6      0.4
light  no   | 0.7      0.3
heavy  yes  | 0.4      0.6
heavy  no   | 0.5      0.5

Appointment {attend, miss}: distribution P(A | Train)
T        | attend  miss
on time  | 0.9     0.1
delayed  | 0.6     0.4
Computing Joint Probabilities
P(light)
P(light, no) = P(light) P(no | light)
P(light, no, delayed) = P(light) P(no | light) P(delayed | light, no)
P(light, no, delayed, miss) = P(light) P(no | light) P(delayed | light, no) P(miss | delayed)
[Sampling: draw R from P(Rain), then M from P(M | R), T from P(T | R, M), and A from P(A | T), yielding a sample such as (R = none, M = yes, T = on time, A = attend). Repeating produces many samples, e.g. (R = light, M = no, T = on time, A = miss); to estimate a conditional probability, keep only the samples consistent with the evidence.]
Uncertainty over Time
Xt: Weather at time t
Markov assumption
the assumption that the current state
depends on only a finite fixed number of
previous states
Markov Chain
Markov chain
a sequence of random variables where the
distribution of each variable follows the
Markov assumption
Transition Model
                      Tomorrow (Xt+1)
                      sunny   rainy
Today (Xt)   sunny     0.8     0.2
             rainy     0.3     0.7

X0 → X1 → X2 → X3 → X4
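Sampling a chain from this transition model takes only a few lines; a minimal sketch using the standard library:

import random

transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.3, "rainy": 0.7},
}

def sample_chain(start, length):
    state, chain = start, [start]
    for _ in range(length - 1):
        # Pick tomorrow's weather with probability given by today's row.
        state = random.choices(list(transitions[state]),
                               weights=list(transitions[state].values()))[0]
        chain.append(state)
    return chain

print(sample_chain("sunny", 10))  # e.g. ['sunny', 'sunny', 'rainy', ...]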
Sensor Models
Hidden State Observation
robot's position robot's sensor data
weather umbrella
Hidden Markov Models
Hidden Markov Model
a Markov model for a system with hidden
states that generate some observed event
Sensor Model
                       Observation (Et)
                       umbrella   no umbrella
State (Xt)   sunny       0.2         0.8
             rainy       0.9         0.1
sensor Markov assumption
the assumption that the evidence variable
depends only on the corresponding state
X0 X1 X2 X3 X4
E0 E1 E2 E3 E4
Task                      Definition
filtering                 given observations from start until now, calculate distribution for current state
prediction                given observations from start until now, calculate distribution for a future state
smoothing                 given observations from start until now, calculate distribution for a past state
most likely explanation   given observations from start until now, calculate most likely sequence of states
Uncertainty

Introduction to
Artificial Intelligence
with Python
Optimization
optimization
choosing the best option from a set of
options
local search
search algorithm that maintains a single node and searches by moving to a neighboring node
[Example: a configuration of objects with Cost: 17.]

state-space landscape
[Diagram: an objective function seeks the global maximum; a cost function seeks the global minimum; local search moves from the current state to one of its neighbors.]
Hill Climbing
function HILL-CLIMB(problem):
current = initial state of problem
repeat:
neighbor = highest valued neighbor of current
if neighbor not better than current:
return current
current = neighbor
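The pseudocode in Python, with neighbors(state) and value(state) left as problem-specific assumptions:

def hill_climb(state, neighbors, value):
    while True:
        best = max(neighbors(state), key=value)
        if value(best) <= value(state):
            # No neighbor improves: a local (not necessarily global) optimum.
            return state
        state = best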
[Walkthrough: hill climbing moves to better neighbors, lowering the cost 17 → 15 → 13 → 11 → 9.]
global maximum
local maxima
global minimum
local minima
flat local maximum
shoulder
Hill Climbing Variants
Variant             Definition
steepest-ascent     choose the highest-valued neighbor
stochastic          choose randomly from higher-valued neighbors
first-choice        choose the first higher-valued neighbor
random-restart      conduct hill climbing multiple times
local beam search   chooses the k highest-valued neighbors

Linear Programming
• Simplex
• Interior-Point
Constraint Satisfaction

Student   Taking classes   Exam slots: Monday, Tuesday, Wednesday
1         A, B, C
2         B, D, E
3         C, E, F
4         E, F, G
[Constraint graph: nodes A through G, with an edge between any two classes that share a student, built up one student at a time.]
Constraint Satisfaction Problem
Variables: {A, B, C, D, E, F, G}
Domains: {Monday, Tuesday, Wednesday} for each variable
Constraints: {A≠B, A≠C, B≠C, B≠D, B≠E, C≠E, C≠F, D≠E, E≠F, E≠G, F≠G}
hard constraints
constraints that must be satisfied in a
correct solution
soft constraints
constraints that express some notion of
which solutions are preferred over others
unary constraint
constraint involving only one variable
unary constraint
{A ≠ Monday}
binary constraint
constraint involving two variables
binary constraint
{A ≠ B}
node consistency
when all the values in a variable's domain
satisfy the variable's unary constraints
[Example: enforcing unary constraints prunes domains, e.g. to A = {Tue, Wed} and B = {Wed}; making A arc-consistent with B then leaves A = {Tue}, B = {Wed}.]

arc consistency
when all the values in a variable's domain
satisfy the variable's binary constraints
Search Problems
• initial state
• actions
• transition model
• goal test
• path cost function
CSPs as Search Problems
• initial state: empty assignment (no variables)
• actions: add a {variable = value} to assignment
• transition model: shows how adding an assignment
changes the assignment
• goal test: check if all variables assigned and
constraints all satisfied
• path cost function: all paths have same cost
Backtracking Search
function BACKTRACK(assignment, csp):
if assignment complete: return assignment
var = SELECT-UNASSIGNED-VAR(assignment, csp)
for value in DOMAIN-VALUES(var, assignment, csp):
if value consistent with assignment:
add {var = value} to assignment
result = BACKTRACK(assignment, csp)
if result ≠ failure: return result
remove {var = value} from assignment
return failure
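A sketch of BACKTRACK specialized to the exam-scheduling CSP above, with no inference or ordering heuristics yet:

VARIABLES = ["A", "B", "C", "D", "E", "F", "G"]
DOMAIN = ["Mon", "Tue", "Wed"]
CONSTRAINTS = [("A","B"),("A","C"),("B","C"),("B","D"),("B","E"),
               ("C","E"),("C","F"),("D","E"),("E","F"),("E","G"),("F","G")]

def consistent(var, value, assignment):
    # A value is consistent if no constrained neighbor already has it.
    return all(assignment.get(y if x == var else x) != value
               for x, y in CONSTRAINTS if var in (x, y))

def backtrack(assignment):
    if len(assignment) == len(VARIABLES):
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)
    for value in DOMAIN:
        if consistent(var, value, assignment):
            result = backtrack({**assignment, var: value})
            if result is not None:
                return result
    return None  # every value failed: backtrack

print(backtrack({}))  # e.g. {'A': 'Mon', 'B': 'Tue', 'C': 'Wed', ...}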
[Walkthrough: values are tried in order and withdrawn when inconsistent, backtracking as needed, eventually reaching the solution A = Mon, B = Tue, C = Wed, D = Wed, E = Mon, F = Tue, G = Wed.]
Inference
[Walkthrough: after assigning A = Mon and B = Tue, enforcing arc consistency infers the remaining values directly: C = {Wed}, D = {Wed}, E = {Mon}, F = {Tue}, G = {Wed}, with no further search needed.]
maintaining arc-consistency
algorithm for enforcing arc-consistency
every time we make a new assignment
maintaining arc-consistency
when we make a new assignment to X, call
AC-3, starting with a queue of all arcs (Y, X)
where Y is a neighbor of X
function BACKTRACK(assignment, csp):
if assignment complete: return assignment
var = SELECT-UNASSIGNED-VAR(assignment, csp)
for value in DOMAIN-VALUES(var, assignment, csp):
if value consistent with assignment:
add {var = value} to assignment
inferences = INFERENCE(assignment, csp)
if inferences ≠ failure: add inferences to assignment
result = BACKTRACK(assignment, csp)
if result ≠ failure: return result
remove {var = value} and inferences from assignment
return failure
SELECT-UNASSIGNED-VAR
[Heuristics: minimum remaining values (choose the variable with the smallest remaining domain) and degree (choose the variable with the most constraints on other variables).]
DOMAIN-VALUES
[Heuristic: least-constraining values, trying first the values that rule out the fewest choices for neighboring variables.]
Problem Formulation
Cost function: 50x1 + 80x2
Constraint: 5x1 + 2x2 ≤ 20
Constraint: (−10x1) + (−12x2) ≤ −90
Optimization

Introduction to
Artificial Intelligence
with Python

Learning

supervised learning
given a data set of input-output pairs, learn a function to map inputs to outputs

classification
supervised learning task of learning a function mapping an input point to a discrete category

[Scatter plot: pressure vs. humidity, with days labeled rain or no rain]
nearest-neighbor classification
algorithm that, given an input, chooses the
class of the nearest data point to that input
k-nearest-neighbor classification
algorithm that, given an input, chooses the
most common class out of the k nearest
data points to that input
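A minimal k-nearest-neighbor sketch for the humidity/pressure example; the data points below are made-up placeholders:

from collections import Counter
import math

def knn_classify(point, data, k=3):
    """data: list of ((humidity, pressure), label) pairs."""
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    labels = [label for _, label in by_distance[:k]]
    # Most common class among the k nearest points.
    return Counter(labels).most_common(1)[0][0]

data = [((0.9, 0.8), "rain"), ((0.85, 0.9), "rain"),
        ((0.2, 0.3), "no rain"), ((0.3, 0.2), "no rain")]
print(knn_classify((0.8, 0.85), data))  # rain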
x1 = Humidity, x2 = Pressure

h(x1, x2) = 1 if w0 + w1x1 + w2x2 ≥ 0, otherwise 0

Weight Vector w: (w0, w1, w2)
Input Vector x: (1, x1, x2)
w · x = w0 + w1x1 + w2x2

hw(x) = 1 if w · x ≥ 0, otherwise 0
perceptron learning rule
Given data point (x, y), update each weight
according to:
wi = wi + α(y - hw(x)) × xi
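A sketch of training with this rule, on data of the form (x, y) where x = (1, x1, x2) and y is 0 or 1 (the function name and defaults are illustrative):

def train_perceptron(data, alpha=0.1, epochs=100):
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            # Current prediction with a hard threshold.
            h = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
            # wi = wi + alpha * (y - hw(x)) * xi
            w = [wi + alpha * (y - h) * xi for wi, xi in zip(w, x)]
    return w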
hard threshold
output jumps straight from 0 to 1 at w · x = 0

soft threshold
output rises smoothly from 0 to 1 as w · x increases, and can be read as a probability
Support Vector Machines
maximum margin separator
boundary that maximizes the distance
between any of the data points
regression
supervised learning task of learning a
function mapping an input point to a
continuous value
f(advertising)
f(1200) = 5800
f(2800) = 13400
f(1800) = 8400
h(advertising)
[Plot: sales against advertising, with the fitted hypothesis line h]
Evaluating Hypotheses
loss function
function that expresses how poorly our
hypothesis performs
0-1 loss function
L(actual, predicted) = 0 if actual = predicted, 1 otherwise
[Scatter plots: each data point labeled with its 0-1 loss; correctly classified points contribute 0, misclassified points contribute 1.]
L1 loss function
L(actual, predicted) = | actual - predicted |
L2 loss function
L(actual, predicted) = (actual - predicted)^2
overfitting
a model that fits too closely to a particular
data set and therefore may fail to generalize
to future data
[Plots: an overfitted decision boundary and an overfitted regression curve fit every training point exactly, at the expense of generalization.]
regularization
penalizing hypotheses that are more complex
to favor simpler, more general hypotheses
cost(h) = loss(h) + λ complexity(h)
Reinforcement Learning
[Agent-environment loop: the agent takes an action; the environment returns a new state and a reward.]

Markov Decision Process
model for decision-making, representing
states, actions, and their rewards

[Diagram: a Markov chain X0 → X1 → ... extended so each transition depends on an action and yields a reward r]
Markov Decision Process
• Set of states S
• Set of actions ACTIONS(s)
• Transition model P(s' | s, a)
• Reward function R(s, a, s')
Q-learning
method for learning a function Q(s, a),
estimate of the value of performing action a
in state s
Q-learning Overview
• Start with Q(s, a) = 0 for all states s and actions a.
• When we take an action and receive a reward:
• Estimate the value of Q(s, a) based on the current reward and expected future rewards.
• Update Q(s, a) to take into account the old estimate as well as the new estimate.
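One common way to write that blend of old and new estimates as code; the dictionary-based Q table and the learning rate alpha here are assumptions:

Q = {}  # maps (state, action) pairs to value estimates

def update_q(state, action, reward, next_state, next_actions, alpha=0.5):
    old = Q.get((state, action), 0)
    # Best estimate of future rewards available from the next state.
    future = max((Q.get((next_state, a), 0) for a in next_actions), default=0)
    # New estimate blends the old value with (current reward + future value).
    Q[(state, action)] = old + alpha * ((reward + future) - old)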
Unsupervised Learning

clustering
organizing a set of objects into groups so that similar objects tend to be in the same group

Some clustering applications:
• Genetic research
• Image segmentation
• Market research
• Medical imaging
• Social network analysis
k-means clustering
algorithm for clustering data based on
repeatedly assigning points to clusters and
updating those clusters' centers
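A from-scratch k-means sketch on 2D points, using only the standard library:

import math, random

def kmeans(points, k, iterations=10):
    centers = random.sample(points, k)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centers[i] = (sum(x for x, _ in cluster) / len(cluster),
                              sum(y for _, y in cluster) / len(cluster))
    return centers, clusters

points = [(1, 1), (1, 2), (0, 1), (8, 8), (9, 8), (9, 9)]
centers, clusters = kmeans(points, k=2)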
Learning
• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning
Learning

Introduction to
Artificial Intelligence
with Python

Neural Networks
step function
g(x) = 1 if x ≥ 0, else 0

logistic sigmoid
g(x) = e^x / (e^x + 1)

rectified linear unit (ReLU)
g(x) = max(0, x)

[Each plotted as output against w · x]
h(x1, x2) = g(w0 + w1x1 + w2x2)

[Diagram: inputs x1 and x2, weighted by w1 and w2 plus a bias w0, feed a single output unit]
Or
x y | f(x, y)
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1

With w0 = -1, w1 = 1, w2 = 1: g(-1 + 1x1 + 1x2)
(0, 0) → g(-1) = 0; (1, 0) → g(0) = 1; (1, 1) → g(1) = 1
And
x y | f(x, y)
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

With w0 = -2, w1 = 1, w2 = 1: g(-2 + 1x1 + 1x2)
(1, 1) → g(0) = 1; (1, 0) → g(-1) = 0
[The same structure extends to other problems: humidity and pressure in, probability of rain out; advertising and month in, sales out.]
[A unit with five inputs x1..x5:]
g(∑_{i=1}^{5} xi wi + w0)

[A unit with n inputs x1..xn:]
g(∑_{i=1}^{n} xi wi + w0)
gradient descent
algorithm for minimizing loss when training a neural network

Gradient Descent
• Start with a random choice of weights.
• Repeat:
• Calculate the gradient based on all data points: the direction that will lead to decreasing loss.
• Update weights according to the gradient.
[Output layers: one unit per category (rainy, sunny, cloudy, snowy), with outputs read as probabilities, e.g. 0.1 rainy, 0.6 sunny, 0.2 cloudy, 0.1 snowy; likewise one output per available action.]
Perceptron
only capable of learning linearly separable decision boundaries
[An image is a grid of pixels, each a value from 0 to 255; every pixel can serve as one input unit to the network.]
image convolution
applying a filter that adds each pixel value
of an image to its neighbors, weighted
according to a kernel matrix
e.g. a sharpening kernel:
 0 -1  0
-1  5 -1
 0 -1  0
[Worked example: sliding the 3×3 kernel across a 4×4 grid of pixel values (10 to 50) produces a 2×2 feature map.]
e.g. an edge-detection kernel:
-1 -1 -1
-1  8 -1
-1 -1 -1

On a uniform region (all pixels 20):
(20)(-1) + (20)(-1) + (20)(-1) + (20)(-1) + (20)(8) + (20)(-1) + (20)(-1) + (20)(-1) + (20)(-1) = 0

On an edge (one row of 20s above rows of 50s):
(20)(-1) + (20)(-1) + (20)(-1) + (50)(-1) + (50)(8) + (50)(-1) + (50)(-1) + (50)(-1) + (50)(-1) = 90
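A plain-Python convolution sketch matching the arithmetic above (no padding, stride 1):

def convolve(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            # Multiply each pixel in the window by its kernel weight.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

edge_kernel = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
print(convolve([[20, 20, 20], [50, 50, 50], [50, 50, 50]], edge_kernel))  # [[90]]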
pooling
reducing the size of an input by sampling
from regions in the input
max-pooling
pooling by choosing the maximum value in
each region
[Example: 2×2 max-pooling over
30  40  80  90
20  50 100 110
 0  10  20  30
10  20  40  30
yields
50 110
20  40]
convolutional neural network
neural networks that use convolution,
usually for analyzing images
[Pipeline: convolution, then pooling, then flattening; deeper networks repeat convolution and pooling (a first and a second round) before the flattened features feed a traditional network.]
[Feed-forward network: input → network → output, in a single pass.]
[Recurrent network: the network's output is fed back in as it processes a sequence, e.g. translating 她在圖書館 ("She is in the library") word by word.]
Neural Networks

Introduction to
Artificial Intelligence
with Python

Language

Natural Language Processing
• automatic summarization
• information extraction
• language identification
• machine translation
• named entity recognition
• speech recognition
• text classification
• word sense disambiguation
• ...
Syntax
"Just before nine o'clock Sherlock
Holmes stepped briskly into the room."
"Just before Sherlock Holmes nine
o'clock stepped briskly the room."
"I saw the man on the mountain
with a telescope."
Semantics
"Just before nine o'clock Sherlock
Holmes stepped briskly into the room."
"Sherlock Holmes stepped briskly into
the room just before nine o'clock."
"A few minutes before nine, Sherlock
Holmes walked quickly into the room."
"Colorless green ideas sleep furiously."
Syntax
formal grammar
a system of rules for generating sentences
in a language
Context-Free Grammar
"she saw the city" → N V D N

N → she | city | ...
V → saw | walked | ...
D → the | a | an | ...
P → to | on | over | ...
NP → N | D N
VP → V | V NP
S → NP VP

[Parse trees: NP covers "she" (N) or "the city" (D N); VP covers "walked" (V) or a V followed by an NP; S is an NP followed by a VP.]
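One way to experiment with such a grammar is the nltk library; a sketch with an assumed rule set:

import nltk

grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> N | D N
    VP -> V | V NP
    D -> "the" | "a"
    N -> "she" | "city"
    V -> "saw" | "walked"
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("she saw the city".split()):
    tree.pretty_print()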
P(b | a) = P(a | b) P(b) / P(a)
P(Positive) = P(😀), P(Negative) = P(🙁)

"My grandson loved it!"

P(😀 | "my grandson loved it")
= P(😀 | "my", "grandson", "loved", "it")

which is naively proportional to
P(😀) P("my" | 😀) P("grandson" | 😀) P("loved" | 😀) P("it" | 😀)
P(😀) = number of positive samples / number of total samples
P("loved" | 😀) = number of positive samples with "loved" / number of positive samples
Compute P(😀) P("my" | 😀) P("grandson" | 😀) P("loved" | 😀) P("it" | 😀),
and likewise P(🙁) P("my" | 🙁) P("grandson" | 🙁) P("loved" | 🙁) P("it" | 🙁).

       😀     🙁
P      0.49   0.51

word   😀     🙁
my     0.30   0.20
it     0.30   0.40
...

[Each factor is looked up in the tables; whichever sentiment's product is larger wins.]
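Putting the computation together as a sketch; the probabilities for "grandson" and "loved" below are made-up stand-ins, since the slide values for them did not survive:

p_sentiment = {"positive": 0.49, "negative": 0.51}
p_word = {
    "my":       {"positive": 0.30, "negative": 0.20},
    "grandson": {"positive": 0.01, "negative": 0.02},  # assumed values
    "loved":    {"positive": 0.32, "negative": 0.08},  # assumed values
    "it":       {"positive": 0.30, "negative": 0.40},
}

def classify(words):
    scores = {}
    for label, prior in p_sentiment.items():
        score = prior
        for word in words:
            score *= p_word[word][label]
        scores[label] = score
    # Normalize so the two scores sum to 1.
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

print(classify(["my", "grandson", "loved", "it"]))  # positive wins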
inverse document frequency
idf(word) = log(TotalDocuments / NumDocumentsContaining(word))
tf-idf
ranking of what words are important in a
document by multiplying term frequency
(TF) by inverse document frequency (IDF)
Semantics
information extraction
the task of extracting knowledge from
documents
"When Facebook was founded in 2004, it began with a seemingly
innocuous mission: to connect friends. Some seven years and 800
million users later, the social network has taken over most aspects of
our personal and professional lives, and is fast becoming the dominant
communication platform of the future."
Harvard Business Review, 2011
one-hot representation
representation of meaning as a vector with
a single 1, and with other values as 0

"He wrote a book."
he    [1, 0, 0, 0]
wrote [0, 1, 0, 0]
a     [0, 0, 1, 0]
book  [0, 0, 0, 1]

With a whole vocabulary, the vectors grow long and sparse:
he    [1, 0, 0, 0, 0, 0, 0, 0, ...]
wrote [0, 1, 0, 0, 0, 0, 0, ...]

"He wrote a book."
"He authored a novel."
wrote    [0, 1, 0, 0, 0, 0, 0, 0, 0]
authored [0, 0, 0, 0, 1, 0, 0, 0, 0]
book     [0, 0, 0, 0, 0, 0, 1, 0, 0]
novel    [0, 0, 0, 0, 0, 0, 0, 0, 1]
[Similar words get entirely unrelated vectors.]
distributed representation
representation of meaning distributed
across multiple values
"He wrote a book."
lunch
dinner
novel
memoir
book
breakfast
novel
dinner lunch
king
man woman
king queen
man woman
Language

Artificial Intelligence with Python
• Search
• Knowledge
• Uncertainty
• Optimization
• Learning
• Neural Networks
• Language

Introduction to
Artificial Intelligence
with Python