Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 5
We need to be able to generalize from our experience to make “good decisions”
From now on, we will represent the state value function or the state-action $(s,a)$ value function with a parameterized function.
The input is a state or a state-action pair, and the output is the corresponding value estimate (a state value or a state-action value).
The parameter $w$ is a vector, for example the weights of a DNN.
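In symbols, this setup can be sketched as follows. The hat notation for the approximation and the policy superscript $\pi$ are assumptions added here for illustration; they are not written out in these notes.

$$
\hat{V}(s; w) \approx V^{\pi}(s), \qquad \hat{Q}(s, a; w) \approx Q^{\pi}(s, a)
$$

The same vector $w$ is shared across all states, which is what lets an update made for one state change the estimate for states we have not visited.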
Benefits of Generalization
<aside> 💡 A small representation is easy to fit from data, but it is likely to lack the capacity to represent the true value function well. → Trade-off
Function Approximators
Of the many possible approximators (neural networks, linear combinations of features, Fourier bases, etc.), we will focus on differentiable ones.
Today we will focus on linear feature representations.
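To make the idea concrete, here is a minimal sketch of a linear value function approximator, assuming the value estimate is the dot product of a feature vector $x(s)$ and the weights $w$. The feature map `features`, the step size `alpha`, and the `target` value (for example a sampled return) are hypothetical choices made for illustration, not something these notes specify.

```python
import numpy as np

def features(state):
    """Hypothetical feature map x(s): a bias term, the raw state, and its square."""
    s = float(state)
    return np.array([1.0, s, s * s])

def v_hat(state, w):
    """Linear value estimate: V_hat(s; w) = x(s)^T w."""
    return float(features(state) @ w)

def grad_v_hat(state, w):
    """Gradient of V_hat with respect to w; for a linear model it is just x(s)."""
    return features(state)

def sgd_step(state, target, w, alpha=0.01):
    """One gradient step on the squared error 0.5 * (target - V_hat(s; w))^2."""
    error = target - v_hat(state, w)
    return w + alpha * error * grad_v_hat(state, w)

# Start from zero weights and move the estimate for state 2.0 toward a target of 1.0.
w = np.zeros(3)
w_new = sgd_step(state=2.0, target=1.0, w=w)
```

Because the model is differentiable in $w$, the update only needs the prediction error and the gradient, and for the linear case that gradient is the feature vector itself.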