What are features for state-action pairs in RL?
01-11-2019
Question
I read this answer: "What are features in the context of reinforcement learning?"
But it only describes features of the state, in the context of CartPole, i.e. Cart Position, Cart Velocity, Pole Angle, and Pole Velocity At Tip.
On slide 18 here: http://www.cs.cmu.edu/~rsalakhu/10703/Lecture_VFA.pdf
It states:
But it does not give examples. I started reading from p. 198 of Sutton's book, on Value Function Approximation, but also did not see examples of "features of state-action pairs".
My best guess, for example in CartPole-v1 (discrete action space), would be to add one more entry to the tuple describing the state, giving a state-action pair such as (Cart Position, Cart Velocity, Pole Angle, Pole Velocity At Tip, push_right). In the case of CartPole, I guess each state-action pair could then be described by a feature vector of length 5, where the final entry of the tuple is the action: either "push_left" or "push_right" (CartPole-v1 has only these two actions, with no "do_nothing").
Would the immediate reward from taking one of the actions also be included in the tuples that form the state-action feature vector?
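To make the guess above concrete, here is a minimal sketch of two common ways to turn a CartPole state and a discrete action into one feature vector x(s, a). It is an illustration under stated assumptions, not something taken from the slides or from Sutton's book: the function names `append_one_hot`, `block_features`, and `q_hat` are hypothetical, the example state values are made up, and the action set is assumed to be the two CartPole-v1 actions (0 = push left, 1 = push right). The immediate reward is left out of both vectors; in the usual linear setup, where the estimate is the dot product of a weight vector with x(s, a), the reward enters through the update target rather than through the features.

```python
from typing import List, Sequence

N_ACTIONS = 2  # assumption: CartPole-v1 actions, 0 = push left, 1 = push right


def append_one_hot(state: Sequence[float], action: int,
                   n_actions: int = N_ACTIONS) -> List[float]:
    """x(s, a) = state features followed by a one-hot encoding of the action."""
    one_hot = [0.0] * n_actions
    one_hot[action] = 1.0
    return [float(s) for s in state] + one_hot


def block_features(state: Sequence[float], action: int,
                   n_actions: int = N_ACTIONS) -> List[float]:
    """x(s, a) = state features copied into the block for `action`, zeros elsewhere.

    With a linear estimator this gives each action its own set of weights.
    """
    n = len(state)
    x = [0.0] * (n * n_actions)
    x[action * n:(action + 1) * n] = [float(s) for s in state]
    return x


def q_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear action-value estimate: dot product of weights and x(s, a)."""
    return sum(w * x for w, x in zip(weights, features))


# Made-up observation: (cart position, cart velocity, pole angle, pole tip velocity)
state = [0.1, -0.2, 0.03, 0.4]
x_appended = append_one_hot(state, action=1)  # length 4 + 2 = 6
x_blocked = block_features(state, action=1)   # length 4 * 2 = 8
```

Appending a single action number, as in the 5-tuple above, also works as an input to a nonlinear approximator. With a linear estimator, though, it lets the action shift the estimate only by a constant offset, which is one reason the block layout is often preferred for small discrete action sets.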
No correct solution