Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 5

We need to be able to generalize from our experience to make “good decisions” in states we have not encountered before.

Value Function Approximation (VFA)

From now on, we will represent the value function over states $s$, or state-action pairs $(s,a)$, with a parameterized function: $\hat{V}(s; w) \approx V^\pi(s)$ or $\hat{Q}(s, a; w) \approx Q^\pi(s, a)$.

*(Figure: a state $s$ or state-action pair $(s,a)$ is fed into a parameterized function with weights $w$, which outputs a value estimate.)*

The input is a state (or a state-action pair), and the output is the corresponding estimated value.

The parameter $w$ is a vector of weights; in the simplest case the weights of a linear model, or more generally the parameters of a deep neural network (DNN).
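
For concreteness, here is a minimal sketch of such a parameterized value function with linear features; the feature map `features(s)` below is a made-up polynomial encoding, not from the lecture:

```python
import numpy as np

def features(s):
    """Hypothetical feature map x(s): encode a (scalar) state as a fixed-length vector."""
    return np.array([1.0, s, s ** 2])  # e.g. a simple polynomial encoding

def v_hat(s, w):
    """Parameterized value estimate: V_hat(s; w) = x(s)^T w."""
    return features(s) @ w

w = np.zeros(3)       # the parameter vector w: one weight per feature
print(v_hat(2.0, w))  # value estimate for state s = 2.0 (0.0 before any learning)
```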

Motivations

Benefits of Generalization

<aside> 💡 If the representation is small, it is easy (and fast) to learn to fit it; however, a small representation is likely to lack the capacity to represent the value function accurately. → Trade-off between compactness and representational capacity.

</aside>

Function Approximators

Out of the many possible approximators (neural networks, linear combinations of features, Fourier bases, etc.), we will focus on differentiable ones, since they can be trained with gradient descent.
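
Writing the standard recipe out explicitly: stochastic gradient descent (SGD) minimizes the mean-squared error between the true value $V^\pi(s)$ and the approximation,

$$ J(w) = \mathbb{E}_\pi\!\left[ \left( V^\pi(s) - \hat{V}(s; w) \right)^2 \right] $$

$$ \Delta w = -\frac{1}{2} \alpha \nabla_w J(w) = \alpha \, \mathbb{E}_\pi\!\left[ \left( V^\pi(s) - \hat{V}(s; w) \right) \nabla_w \hat{V}(s; w) \right] $$

where $\alpha$ is the step size; SGD samples this gradient on individual states rather than computing the full expectation.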

Today we will focus on linear feature representations.
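
As a preview, here is a minimal sketch of a semi-gradient TD(0) update with a linear value function; the one-hot feature map and the sample transition are assumptions for illustration:

```python
import numpy as np

def x(s):
    """Hypothetical feature map: one-hot encoding over a small discrete state space."""
    phi = np.zeros(5)
    phi[s] = 1.0
    return phi

def td0_update(w, s, r, s_next, done, alpha=0.1, gamma=0.99):
    """Semi-gradient TD(0) step for linear VFA: V_hat(s; w) = x(s)^T w."""
    v = x(s) @ w
    v_next = 0.0 if done else x(s_next) @ w
    td_error = r + gamma * v_next - v
    # For linear VFA, grad_w V_hat(s; w) = x(s), so the update is just:
    return w + alpha * td_error * x(s)

w = np.zeros(5)
w = td0_update(w, s=0, r=1.0, s_next=1, done=False)
print(w)  # the weight for state 0 moves toward the TD target
```

For the linear case, the gradient of $\hat{V}(s; w) = x(s)^\top w$ with respect to $w$ is simply the feature vector $x(s)$, which is what makes the update above so cheap.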