TY - GEN
T1 - Universal Complexity Bounds Based on Value Iteration and Application to Entropy Games
AU - Allamigeon, Xavier
AU - Gaubert, Stéphane
AU - Katz, Ricardo D.
AU - Skomra, Mateusz
N1 - Publisher Copyright: © Xavier Allamigeon, Stéphane Gaubert, Ricardo D. Katz, and Mateusz Skomra; licensed under Creative Commons License CC-BY 4.0
PY - 2022/7/1
Y1 - 2022/7/1
N2 - We develop value iteration-based algorithms to solve in a unified manner different classes of combinatorial zero-sum games with mean-payoff type rewards. These algorithms rely on an oracle, evaluating the dynamic programming operator up to a given precision. We show that the number of calls to the oracle needed to determine exact optimal (positional) strategies is, up to a factor polynomial in the dimension, of order R/sep, where the “separation” sep is defined as the minimal difference between distinct values arising from strategies, and R is a metric estimate, involving the norm of approximate sub- and super-eigenvectors of the dynamic programming operator. We illustrate this method by two applications. The first one is a new proof, leading to improved complexity estimates, of a theorem of Boros, Elbassioni, Gurvich and Makino, showing that turn-based mean-payoff games with a fixed number of random positions can be solved in pseudo-polynomial time. The second one concerns entropy games, a model introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin. The rank of an entropy game is defined as the maximal rank among all the ambiguity matrices determined by strategies of the two players. We show that entropy games with a fixed rank, in their original formulation, can be solved in polynomial time, and that an extension of entropy games incorporating weights can be solved in pseudo-polynomial time under the same fixed rank condition.
KW - Mean-payoff games
KW - Perron root
KW - entropy games
KW - parameterized complexity
KW - separation bounds
KW - value iteration
UR - https://www.scopus.com/pages/publications/85133470010
U2 - 10.4230/LIPIcs.ICALP.2022.110
DO - 10.4230/LIPIcs.ICALP.2022.110
M3 - Conference contribution
AN - SCOPUS:85133470010
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 49th EATCS International Conference on Automata, Languages, and Programming, ICALP 2022
A2 - Bojańczyk, Mikołaj
A2 - Merelli, Emanuela
A2 - Woodruff, David P.
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 49th EATCS International Conference on Automata, Languages, and Programming, ICALP 2022
Y2 - 4 July 2022 through 8 July 2022
ER -