Revisiting Approximate Dynamic Programming and its Convergence
2016-08-23
Other
Value-iteration-based approximate/adaptive dynamic programming (ADP) is investigated as an approximate solution to infinite-horizon optimal control problems with deterministic dynamics and continuous state and action spaces. The learning iterations are decomposed into an outer loop and an inner loop. A relatively simple proof of the convergence of the outer-loop iterations to the optimal solution is provided, based on an analogy between the value function during the iterations and the value function of a fixed-final-time optimal control problem. The inner loop is used to avoid having to solve a set of nonlinear equations or a nonlinear optimization problem numerically at each ADP iteration for the policy update. Sufficient conditions are obtained for the uniqueness of the solution to the policy-update equation and for the convergence of the inner-loop iterations to that solution. Afterwards, the results are formed as a learning algorithm.
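To make the outer/inner decomposition concrete, the MATLAB sketch below applies the idea to a hypothetical scalar control-affine system x(k+1) = f(x(k)) + g(x(k))*u(k) with stage cost Q(x) + R*u^2 and a tabular value function on a grid. The system, grid, tolerances, and iteration counts are illustrative assumptions, not taken from the paper or the uploaded code. The outer loop performs the value-iteration backup V_{i+1}(x) = min_u [Q(x) + R*u^2 + V_i(f(x)+g(x)*u)], while the inner loop solves the implicit policy-update equation u = -g(x)*V_i'(x_next)/(2R) by fixed-point iteration instead of calling a numerical optimizer.

% Illustrative sketch only: value-iteration ADP with an outer loop for the
% value-function update and an inner fixed-point loop for the policy update.
% The dynamics, cost weights, grid, and iteration limits are assumptions.
f  = @(x) 0.9*x - 0.1*x.^3;      % drift term of x(k+1) = f(x) + g(x)*u
g  = @(x) 0.5 + 0.1*x.^2;        % input gain
Q  = @(x) x.^2;                  % state penalty
R  = 1;                          % control penalty
xg = linspace(-2, 2, 201);       % state grid for the tabular value function
V  = zeros(size(xg));            % V_0 = 0 initializes value iteration

for i = 1:100                                    % outer loop: V_i -> V_{i+1}
    dV = gradient(V, xg);                        % numerical gradient of V_i
    Vnext = zeros(size(xg));
    for k = 1:numel(xg)
        x = xg(k);
        u = 0;                                   % inner-loop initial guess
        for j = 1:50                             % inner loop: policy update
            xplus = f(x) + g(x)*u;               % successor state under u
            dVp = interp1(xg, dV, xplus, 'linear', 'extrap');
            unew = -0.5*(1/R)*g(x)*dVp;          % fixed-point map for u
            if abs(unew - u) < 1e-9, u = unew; break; end
            u = unew;
        end
        xplus = f(x) + g(x)*u;
        Vxp = interp1(xg, V, xplus, 'linear', 'extrap');
        Vnext(k) = Q(x) + R*u^2 + Vxp;           % Bellman backup at x
    end
    if max(abs(Vnext - V)) < 1e-8, V = Vnext; break; end
    V = Vnext;
end
plot(xg, V); xlabel('x'); ylabel('V(x)');        % converged value estimate

In a full ADP implementation a parametric approximator (e.g., a polynomial basis or a neural network) would typically replace the grid, turning the same two loops into the learning algorithm referred to in the abstract; the tabular form here only keeps the sketch short.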
matlab
revisiting
approximate dynamic programming
convergence
Related Source Codes
GMSK Linear Receiver
NSGA-II algorithm
NSGA-III multi-objective optimization algorithm
Compressed sensing example
CFAR detector example