Chapter 7 Temporal-Difference learning

RyanLee_ljx...

Temporal-difference (TD) learning refers to a broad family of algorithms.

A TD algorithm can solve the Bellman equation of a given policy $\pi$ without a model of the environment, using only sampled experience.
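As a rough illustration of what "without a model" means, here is a minimal sketch of the simplest TD algorithm, TD(0), which updates the state-value estimate after each sampled transition as $v(s_t) \leftarrow v(s_t) + \alpha \left[ r_{t+1} + \gamma v(s_{t+1}) - v(s_t) \right]$. The environment is an assumption made purely for illustration and is not taken from this chapter: a five-state random walk with a terminal state at each end, a reward of 1 only on reaching the right end, and a policy $\pi$ that moves left or right with equal probability. The function name `td0_random_walk` and all parameter values are likewise hypothetical.

```python
import random


def td0_random_walk(num_states=5, alpha=0.02, gamma=1.0, episodes=20000, seed=0):
    """Estimate v_pi for a random-walk chain with TD(0).

    Assumed setup (illustrative only): states 1..num_states are non-terminal,
    states 0 and num_states + 1 are terminal, and the policy picks left or
    right with probability 0.5 each. The agent never reads the transition
    probabilities; it only observes sampled (s, r, s_next) transitions.
    """
    rng = random.Random(seed)
    left_terminal, right_terminal = 0, num_states + 1

    # Initial guess of 0.5 for non-terminal states; terminal values stay 0.
    v = [0.5] * (num_states + 2)
    v[left_terminal] = 0.0
    v[right_terminal] = 0.0

    for _ in range(episodes):
        s = (num_states + 1) // 2  # start each episode in the middle state
        while s not in (left_terminal, right_terminal):
            s_next = s + rng.choice((-1, 1))  # sample one step under the policy
            r = 1.0 if s_next == right_terminal else 0.0
            # TD(0) update: move v(s) toward the TD target r + gamma * v(s_next).
            v[s] += alpha * (r + gamma * v[s_next] - v[s])
            s = s_next

    return v[1:-1]  # estimates for the non-terminal states only


if __name__ == "__main__":
    for state, value in enumerate(td0_random_walk(), start=1):
        print(f"v({state}) ~ {value:.3f}")
```

For this particular chain with $\gamma = 1$, the Bellman equation has the closed-form solution $v_\pi(i) = i/6$ for $i = 1, \dots, 5$, so the estimates can be checked against it. Because the step size $\alpha$ is held constant here, the estimates keep fluctuating around the true values rather than converging exactly; a decaying step size would be needed for that.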
