Average Payoff Games
Download Average Payoff Games
Preview text
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average Payoff Games
Marcin Jurdzin´ski1 Ashutosh Trivedi1
1Department of Computer Science University of Warwick
22nd British Colloquium for Theoretical Computer Science BCTCS 2006
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
1 Introduction Average payoff games (discrete-time and continuous-time)
2 Average payoff games (discrete-time) Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
3 Average payoff games (continuous-time) Average payoff games (continuous-time) Average time games
4 Conclusion
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
time per transition (average time game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
payoff per transition (average price game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
payoff per unit time (average price game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Average payoff games (discrete-time)
Game arena A finite directed graph G = {V, E} A partition of vertices V = VMax ∪ VMin A reward function r : V → Z.
Optimization goals π = v1 −a→1 v2 −a→2 v3 . . . A(π) = limn→∞(1/n) · ∑ni=1 r(vi)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4
Max Min
Strategies History dependent ΣMax = {V∗ × VMax → V} Positional ΣMax = {VMax → V}
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Value of the game A(v, χ, µ) = limiting average of the game. upper-value A∗(v) = infµ∈ΣMin supχ∈ΣMax A(v, µ, χ), lower value A∗(v) = supχ∈ΣMax infµ∈ΣMin A(v, µ, χ). (minimax) value of the game A∗(v) = A∗(v) = A(v, χ∗, µ∗).
Algorithmic problem Find the optimal strategy for Max and Min. Pseudo-polynomial time algorithm [Zwick and Paterson’95].
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Optimality equations
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Local witness of global optimality gain(v) = value of the game. bias(v) = measure of initial fluctuations.
Optimality equations OE For every max vertex v ∈ VMax,
g(v) = max{g(u) : (v, u) ∈ E}. b(v) = max{r(v) − g(v) + b(u) : (u, v) ∈ E and g(v) = g(u)}.
Positional optimal strategy Solution to OE gives optimal value of the game. Solution to OE provides positional optimal strategy.
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Optimality equations
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Local witness of global optimality gain(v) = value of the game. bias(v) = measure of initial fluctuations.
Optimality equations OE For every min vertex v ∈ VMin,
g(v) = min{g(u) : (v, u) ∈ E}. b(v) = min{r(v) − g(v) + b(u) : (u, v) ∈ E and g(v) = g(u)}.
Positional optimal strategy Solution to OE gives optimal value of the game. Solution to OE provides positional optimal strategy.
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Optimality equations – Contd.
Optimality equations
If solution for OE exists, then optimal-cycle(v) = Cycle reachable from vertex v, when players play optimal strategies. gain(v) = average of the optimal-cycle(v). bias(v) = optimal distance (fluctuations) from v to an arbitrary vertex in optimal-cycle(v).
g1 b1
v1 r1
g2 b2
v2 r2
g3 b3
v3 r3
g4 b4
v4 r4
g1 = g2 = g3 = g4 b1 = r1 − g + b2
=g
b2 = r2 − g + b3 b3 = r3 − g + b4 b4 = r4 − g + b2
Ashutosh Trivedi
Average Payoff Games
Conclusion Appendix
Average Payoff Games
Marcin Jurdzin´ski1 Ashutosh Trivedi1
1Department of Computer Science University of Warwick
22nd British Colloquium for Theoretical Computer Science BCTCS 2006
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
1 Introduction Average payoff games (discrete-time and continuous-time)
2 Average payoff games (discrete-time) Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
3 Average payoff games (continuous-time) Average payoff games (continuous-time) Average time games
4 Conclusion
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
time per transition (average time game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
payoff per transition (average price game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time and continuous-time)
Discrete-time average payoff games
2-player perfect information, zero-sum games on weighted finite automata. run = v1 −a→1 v2 −a→2 v3 . . . players optimize
reward per transition (mean-payoff games).
Continuous-time average payoff games
2-player perfect information, zero-sum games
on priced timed automata
run = v1 −a→1 v2 −a→2 v3 . . .
t1
t2
players optimize :
payoff per unit time (average price game)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4 v2
Max Min
{x} 2
y<1
x>1 a
b {y} a
b
v1 1
b {y} 3 v3
a {x}
x>1 b
a y<1 4
v4
x
y
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Average payoff games (discrete-time)
Game arena A finite directed graph G = {V, E} A partition of vertices V = VMax ∪ VMin A reward function r : V → Z.
Optimization goals π = v1 −a→1 v2 −a→2 v3 . . . A(π) = limn→∞(1/n) · ∑ni=1 r(vi)
v2
2
a
b
v1
b
a
v3
1
3
a
b
b
a
4
v4
Max Min
Strategies History dependent ΣMax = {V∗ × VMax → V} Positional ΣMax = {VMax → V}
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Value of the game A(v, χ, µ) = limiting average of the game. upper-value A∗(v) = infµ∈ΣMin supχ∈ΣMax A(v, µ, χ), lower value A∗(v) = supχ∈ΣMax infµ∈ΣMin A(v, µ, χ). (minimax) value of the game A∗(v) = A∗(v) = A(v, χ∗, µ∗).
Algorithmic problem Find the optimal strategy for Max and Min. Pseudo-polynomial time algorithm [Zwick and Paterson’95].
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Optimality equations
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Local witness of global optimality gain(v) = value of the game. bias(v) = measure of initial fluctuations.
Optimality equations OE For every max vertex v ∈ VMax,
g(v) = max{g(u) : (v, u) ∈ E}. b(v) = max{r(v) − g(v) + b(u) : (u, v) ∈ E and g(v) = g(u)}.
Positional optimal strategy Solution to OE gives optimal value of the game. Solution to OE provides positional optimal strategy.
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Optimality equations
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Local witness of global optimality gain(v) = value of the game. bias(v) = measure of initial fluctuations.
Optimality equations OE For every min vertex v ∈ VMin,
g(v) = min{g(u) : (v, u) ∈ E}. b(v) = min{r(v) − g(v) + b(u) : (u, v) ∈ E and g(v) = g(u)}.
Positional optimal strategy Solution to OE gives optimal value of the game. Solution to OE provides positional optimal strategy.
Ashutosh Trivedi
Average Payoff Games
Introduction Average payoff games (discrete-time) Average payoff games (continuous-time)
Conclusion Appendix
Average payoff games (discrete-time) Optimality equations Strategy improvement algorithm
Optimality equations – Contd.
Optimality equations
If solution for OE exists, then optimal-cycle(v) = Cycle reachable from vertex v, when players play optimal strategies. gain(v) = average of the optimal-cycle(v). bias(v) = optimal distance (fluctuations) from v to an arbitrary vertex in optimal-cycle(v).
g1 b1
v1 r1
g2 b2
v2 r2
g3 b3
v3 r3
g4 b4
v4 r4
g1 = g2 = g3 = g4 b1 = r1 − g + b2
=g
b2 = r2 − g + b3 b3 = r3 − g + b4 b4 = r4 − g + b2
Ashutosh Trivedi
Average Payoff Games
Categories
You my also like
Naval Postgraduate School Monterey, California November, 2006
391.6 KB2.8K1KChapter 10: Extensive Games With Imperfect Information
204.4 KB18.9K3.8KGame Theory: The Mathematics of Competition
823 KB16.3K6.4KThe Case for Research in Game Engine Architecture
71 KB65.3K22.9KChemical Appendix To The Harmonized Tariff Schedule 1/
1.1 MB45.6K22.8KAppendix And Cecum
2 MB6.2K3KiSnake Multiplayer Intelligent Snake Game
1.2 MB7K1.9KReal Options Game Models: A Review
951.2 KB19K4.2KOnline MindReader Game Utilizing Weighted Hedging Trees
3.3 MB3.6K1.7K