TY - GEN
T1 - Solving multichain stochastic games with mean payoff by policy iteration
AU - Akian, Marianne
AU - Cochet-Terrasson, Jean
AU - Detournay, Sylvie
AU - Gaubert, Stéphane
PY - 2013/1/1
Y1 - 2013/1/1
AB - Zero-sum stochastic games with finite state and action spaces, perfect information, and mean payoff criteria arise, in particular, from the monotone discretization of mean-payoff pursuit-evasion deterministic differential games. In that case, no irreducibility assumption on the Markov chains associated with strategies is satisfied (multichain games). The value of such a game can be characterized by a system of nonlinear equations involving the mean payoff vector and an auxiliary vector (relative value or bias). Cochet-Terrasson and Gaubert proposed in (C. R. Math. Acad. Sci. Paris, 2006) a policy iteration algorithm relying on a notion of nonlinear spectral projection (Akian and Gaubert, Nonlinear Analysis TMA, 2003), which allows one to avoid cycling in degenerate iterations. We give here a complete presentation of the algorithm, with implementation details, in particular for the nonlinear projection. This has led to the software PIGAMES and allowed us to present numerical results on pursuit-evasion games.
UR - https://www.scopus.com/pages/publications/84902327052
U2 - 10.1109/CDC.2013.6760149
DO - 10.1109/CDC.2013.6760149
M3 - Conference contribution
AN - SCOPUS:84902327052
SN - 9781467357173
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1834
EP - 1841
BT - 2013 IEEE 52nd Annual Conference on Decision and Control, CDC 2013
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 52nd IEEE Conference on Decision and Control, CDC 2013
Y2 - 10 December 2013 through 13 December 2013
ER -