1 parent c93f818 commit 7967097
mdp.py
@@ -12,7 +12,7 @@ class MDP:
  """A Markov Decision Process, defined by an initial state, transition model,
  and reward function. We also keep track of a gamma value, for use by
  algorithms. The transition model is represented somewhat differently from
- the text. Instead of T(s, a, s') being probability number for each
+ the text. Instead of T(s, a, s') being a probability number for each
  state/action/state triplet, we instead have T(s, a) return a list of (p, s')
  pairs. We also keep track of the possible states, terminal states, and
  actions for each state. [page 615]"""
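
The docstring above describes T(s, a) as returning a list of (p, s') pairs rather than a single probability per triplet. A minimal sketch of that representation follows. The states, actions, probabilities, the `transitions` table, and the `expected_utility` helper are all hypothetical and chosen for illustration; they are not taken from mdp.py.

```python
# Hypothetical two-state model; every name and number here is an assumption
# made for illustration and does not come from mdp.py.
transitions = {
    ("sunny", "wait"): [(0.8, "sunny"), (0.2, "rainy")],
    ("rainy", "wait"): [(0.4, "sunny"), (0.6, "rainy")],
}


def T(state, action):
    """Return a list of (probability, next_state) pairs for (state, action)."""
    # Unknown (state, action) combinations yield no successors.
    return transitions.get((state, action), [])


def expected_utility(state, action, U):
    """Sum p * U[s'] over the (p, s') pairs returned by T(state, action)."""
    return sum(p * U[next_state] for p, next_state in T(state, action))


# Example utilities for each state (also assumed).
U = {"sunny": 1.0, "rainy": -1.0}
print(T("sunny", "wait"))
print(expected_utility("sunny", "wait", U))
```

Returning the successor list directly means a caller can iterate over only the reachable states, instead of querying T(s, a, s') once for every possible s'.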