pathwise derivative method: Stochastic Value Gradient (SVG), (Deep) Deterministic Policy Gradient (DPG/DDPG)
