Reinforcement Learning Part 3: The Bellman Optimality Equation and Optimal Policies
Readers joining us directly at Part 3 should already be familiar with the Bellman Equation and how it can be used to compare two policies. So far we have defined optimal policies and optimal state values. We have also seen how Policy Evaluation calculates state values iteratively, which avoids a computationally heavy matrix inverse. Those interested in previous parts can find the links at the bottom.
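As a quick refresher, here is a minimal sketch of iterative Policy Evaluation for a fixed policy. The three-state transition matrix `P_pi`, the reward vector `r_pi`, the discount `gamma`, and the function name `policy_evaluation` are all illustrative assumptions, not values or names from the earlier parts.

```python
# Hypothetical 3-state example under a fixed policy (all numbers are illustrative).
# P_pi[s][t] is the probability of moving from state s to state t under the policy.
P_pi = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
r_pi = [0.0, 0.0, 1.0]  # expected immediate reward in each state under the policy
gamma = 0.9             # discount factor


def policy_evaluation(P, r, gamma, tol=1e-10, max_iters=10_000):
    """Repeat the update v <- r + gamma * P v until the values stop changing."""
    n = len(r)
    v = [0.0] * n  # any starting guess works when gamma < 1
    for _ in range(max_iters):
        v_next = [
            r[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
        # Stop once the largest change across states falls below the tolerance.
        if max(abs(a - b) for a, b in zip(v_next, v)) < tol:
            return v_next
        v = v_next
    return v


v_pi = policy_evaluation(P_pi, r_pi, gamma)
print(v_pi)
```

Each sweep costs one matrix-vector product, so no inverse of (I - gamma * P) is ever formed. The result can be checked by substituting it back into the Bellman Equation and confirming that both sides agree.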
The Bellman Optimality Equation
The Bellman Optimality Equation (BOE) builds on the Bellman Equation and tries to express the state values that an agent can achieve...
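For reference, one common way of writing the BOE is shown below. The notation is an assumption and may differ from the notation used elsewhere in this series: v*(s) is the optimal value of state s, p(s', r | s, a) is the environment's transition model, and gamma is the discount factor.

```latex
v_*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma \, v_*(s') \right]
```

The only change from the Bellman Equation for a fixed policy is that the average over the policy's action probabilities is replaced by a maximum over actions.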