Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 5

We need to be able to generalize from our experience to make “good decisions” in states we have not encountered before.

Value Function Approximation (VFA)

From now on, we will represent the value function over states $s$, or state-action pairs $(s,a)$, with a parameterized function: $\hat{V}(s; w) \approx V^\pi(s)$ or $\hat{Q}(s, a; w) \approx Q^\pi(s, a)$.

*(Figure: a state $s$ or state-action pair $(s,a)$ is fed into a parameterized function with weights $w$, which outputs a value estimate.)*

The input is a state (or a state-action pair), and the output is the corresponding estimated value.

The parameter $w$ is a vector of weights; in the simplest case the weights of a linear model, or more generally the parameters of a deep neural network (DNN).
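
For concreteness, here is a minimal sketch of such a parameterized value function with linear features; the feature map `features(s)` below is a made-up polynomial encoding, not from the lecture:

```python
import numpy as np

def features(s):
    """Hypothetical feature map x(s): encode a (scalar) state as a fixed-length vector."""
    return np.array([1.0, s, s ** 2])  # e.g. a simple polynomial encoding

def v_hat(s, w):
    """Parameterized value estimate: V_hat(s; w) = x(s)^T w."""
    return features(s) @ w

w = np.zeros(3)       # the parameter vector w: one weight per feature
print(v_hat(2.0, w))  # value estimate for state s = 2.0 (0.0 before any learning)
```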

Motivations

Benefits of Generalization

<aside> 💡 If the representation is small, it is easy (and fast) to learn to fit it; however, a small representation is likely to lack the capacity to represent the value function accurately. → Trade-off between compactness and representational capacity.

</aside>

Function Approximators

Out of the many possible approximators (neural networks, linear combinations of features, Fourier bases, etc.), we will focus on differentiable ones, since they can be trained with gradient descent.
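
Writing the standard recipe out explicitly: stochastic gradient descent (SGD) minimizes the mean-squared error between the true value $V^\pi(s)$ and the approximation,

$$ J(w) = \mathbb{E}_\pi\!\left[ \left( V^\pi(s) - \hat{V}(s; w) \right)^2 \right] $$

$$ \Delta w = -\frac{1}{2} \alpha \nabla_w J(w) = \alpha \, \mathbb{E}_\pi\!\left[ \left( V^\pi(s) - \hat{V}(s; w) \right) \nabla_w \hat{V}(s; w) \right] $$

where $\alpha$ is the step size; SGD samples this gradient on individual states rather than computing the full expectation.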

Today we will focus on linear feature representations.
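
As a preview, here is a minimal sketch of a semi-gradient TD(0) update with a linear value function; the one-hot feature map and the sample transition are assumptions for illustration:

```python
import numpy as np

def x(s):
    """Hypothetical feature map: one-hot encoding over a small discrete state space."""
    phi = np.zeros(5)
    phi[s] = 1.0
    return phi

def td0_update(w, s, r, s_next, done, alpha=0.1, gamma=0.99):
    """Semi-gradient TD(0) step for linear VFA: V_hat(s; w) = x(s)^T w."""
    v = x(s) @ w
    v_next = 0.0 if done else x(s_next) @ w
    td_error = r + gamma * v_next - v
    # For linear VFA, grad_w V_hat(s; w) = x(s), so the update is just:
    return w + alpha * td_error * x(s)

w = np.zeros(5)
w = td0_update(w, s=0, r=1.0, s_next=1, done=False)
print(w)  # the weight for state 0 moves toward the TD target
```

For the linear case, the gradient of $\hat{V}(s; w) = x(s)^\top w$ with respect to $w$ is simply the feature vector $x(s)$, which is what makes the update above so cheap.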