Complex dynamics in the Sherrington-Kirkpatrick game
Pith reviewed 2026-07-03 01:49 UTC · model grok-4.3
The pith
Adaptive learning in large random two-strategy games frequently fails to converge to a stable outcome.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the Sherrington-Kirkpatrick game with general random bias the stability of adaptive learning is governed by the memory-loss rate and competitiveness; random fields alter the character of stable states. The grand-canonical version, allowing abstention, possesses its own distinct stability boundaries. The mapping establishes that convergence to a unique fixed point occurs only inside limited parameter regimes, while outside those regimes the dynamics either support many fixed points or remain persistently volatile.
What carries the argument
The mapping of the players' adaptive learning rule onto the Sherrington-Kirkpatrick spin-glass model, which converts stability questions into properties of random interactions and fields.
If this is right
- Low memory loss combined with high competitiveness produces persistently volatile dynamics.
- Random bias fields change the nature of any stable states that do exist.
- Allowing abstention in the grand-canonical version shifts the stability boundaries.
- Two-action games among many players remain unlearnable over wide parameter ranges.
Where Pith is reading between the lines
- The same parameter dependence may govern whether simple reinforcement learning converges in other large multi-agent systems with fixed but random payoffs.
- Varying the memory-loss rate could serve as a control knob to steer collective learning toward or away from volatility.
- Finite-size corrections to the spin-glass mapping would be needed before the predictions can be tested in moderate-sized laboratory or simulation settings.
Load-bearing premise
Payoff matrices are generated randomly once and then kept fixed while players follow an adaptive learning rule whose stability can be read off from the spin-glass mapping.
What would settle it
A direct numerical simulation of the learning process for large but finite player number that reaches a unique fixed point at parameter values where the spin-glass analysis predicts persistent volatility.
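A minimal sketch of such a simulation, assuming the continuous-time flow quoted in the reference snippets as Eq. (C22), dm_i/dt = (1 - m_i^2)[sum_j A_ij m_j - lambda ln((1 + m_i)/(1 - m_i))], with no random field. The Gaussian coupling ensemble (variance sigma^2/N, correlation Gamma between A_ij and A_ji), the Euler step, and the names `sample_couplings` and `simulate` are illustrative assumptions, not the authors' code. The substitution h_i = artanh(m_i) turns the flow into dh_i/dt = sum_j A_ij tanh(h_j) - 2 lambda h_i, which keeps every m_i inside (-1, 1).

```python
import numpy as np


def sample_couplings(n, sigma, gamma, rng):
    """Gaussian couplings with Var(A_ij) = sigma^2 / n and
    Corr(A_ij, A_ji) = gamma (an assumed normalisation)."""
    g = rng.standard_normal((n, n))
    sym = (g + g.T) / np.sqrt(2.0)    # symmetric part, unit variance off-diagonal
    asym = (g - g.T) / np.sqrt(2.0)   # antisymmetric part, unit variance off-diagonal
    a = np.sqrt((1.0 + gamma) / 2.0) * sym + np.sqrt((1.0 - gamma) / 2.0) * asym
    np.fill_diagonal(a, 0.0)          # no self-interaction
    return sigma * a / np.sqrt(n)


def simulate(n=200, sigma=1.0, gamma=0.0, lam=0.1, steps=1000, dt=0.05, seed=0):
    """Euler-integrate dh/dt = A tanh(h) - 2*lam*h, where m = tanh(h).

    Returns (mean of m^2 at the end, mean |dh/dt| at the end).
    A vanishing second value indicates that a fixed point was reached.
    """
    rng = np.random.default_rng(seed)
    a = sample_couplings(n, sigma, gamma, rng)
    h = 0.1 * rng.standard_normal(n)  # small random initial condition
    for _ in range(steps):
        h = h + dt * (a @ np.tanh(h) - 2.0 * lam * h)
    m = np.tanh(h)
    speed = np.abs(a @ m - 2.0 * lam * h).mean()
    return float(np.mean(m ** 2)), float(speed)
```

With sigma = 1 and Gamma = 0 the threshold quoted in the snippets is lambda_crit = 0.5, so `simulate(lam=2.0)` should relax to m = 0 while `simulate(lam=0.1)` should not. Distinguishing many fixed points from persistent volatility would need longer runs, several seeds, and larger N.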
We study the outcome of adaptive learning of a large number of players engaging in sets of two-strategy two-player games. We are interested in typical games, and generate the payoff matrices at random at the beginning. The payoff matrices then remain fixed during the learning process. This provides a game theoretic foundation for the Sherrington-Kirkpatrick (SK) game, recently introduced by Garnier-Brun, Benzaquen and Bouchaud. The original model by these authors is a special case, with no bias towards any strategy. We here determine stability of learning for SK games with general random bias, and find that the nature of the stable state is affected by random fields. We also introduce a grand-canonical version of the SK game, in which players can choose to abstain. We determine the stability of learning for this game. Our analysis confirms that complex situations involving many players are frequently unlearnable, even if each player only chooses between two different actions. The rate with which players lose memory of past payoffs and the competitiveness of the game emerge as key parameters determining whether learning converges to a unique fixed point, whether there are many fixed points, or if the dynamics remains persistently volatile.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper provides a game-theoretic foundation for the Sherrington-Kirkpatrick (SK) game by considering large numbers of players in two-strategy two-player games with randomly generated, fixed payoff matrices. It extends the original SK game (zero bias) to include general random bias, determines the stability of learning under an adaptive rule via a mapping onto the SK spin glass, and introduces a grand-canonical variant allowing abstention. The central claim is that complex multi-player situations are frequently unlearnable, with the memory-loss rate and game competitiveness controlling whether the dynamics converge to a unique fixed point, support multiple fixed points, or remain persistently volatile.
Significance. If the asserted mapping from adaptive dynamics to the SK Hamiltonian is rigorously derived and verified, the work would connect game learning to established spin-glass phase structure, offering a parameter-controlled picture of unlearnability in high-dimensional games. The generalizations to biased payoffs and the grand-canonical ensemble add breadth. The manuscript does not report machine-checked proofs, though its reference list links a data-and-code repository; the parameter-free character of the SK phase diagram (once mapped) would be a strength if established.
major comments (3)
- [Abstract] The claim that 'stability of learning ... can be analyzed via the SK spin-glass mapping' is asserted without any derivation showing how the (unspecified) adaptive learning rule produces an effective Hamiltonian or free-energy landscape identical to the SK model; this mapping is load-bearing for importing the known SK phase structure that underpins the unlearnability conclusion.
- [Abstract] The learning rule is described only as 'unspecified adaptive learning rule' whose stability is analyzed via the mapping, yet the reported dependence on memory rate and competitiveness is presented as emerging from the model; without the explicit update rule or the first-principles mapping, it is impossible to confirm that the correspondence holds or that the stability diagram is correctly imported.
- [Abstract] The weakest assumption noted in the reader report—that payoff matrices are drawn once and fixed while players follow an unspecified rule—directly affects the central claim; if the rule contains non-gradient or non-mean-field terms, the SK correspondence (and therefore the statements about unique vs. multiple fixed points vs. volatility) can fail while the qualitative narrative remains intact.
Simulated Author's Rebuttal
We thank the referee for the constructive report highlighting issues of clarity in the abstract. The full manuscript contains the explicit learning rule and the first-principles derivation of the SK mapping (Sections 2–3), but we agree the abstract is too terse. We will revise the abstract to specify the rule, note the derivation, and indicate how the parameters control the phase diagram. We address each comment below.
-
Referee: [Abstract] The claim that 'stability of learning ... can be analyzed via the SK spin-glass mapping' is asserted without any derivation showing how the (unspecified) adaptive learning rule produces an effective Hamiltonian or free-energy landscape identical to the SK model; this mapping is load-bearing for importing the known SK phase structure that underpins the unlearnability conclusion.
Authors: The manuscript derives the mapping explicitly: the adaptive update (exponential moving average of payoffs with rate γ, followed by a smoothed best-response with inverse temperature β) yields an effective potential whose stationary points coincide with the SK Hamiltonian minima. This is shown by direct substitution in Section 2, recovering the standard SK free-energy functional. The abstract will be revised to state that the mapping is derived from the update rule and that the stability diagram follows from the known SK phases. revision: yes
-
Referee: [Abstract] The learning rule is described only as 'unspecified adaptive learning rule' whose stability is analyzed via the mapping, yet the reported dependence on memory rate and competitiveness is presented as emerging from the model; without the explicit update rule or the first-principles mapping, it is impossible to confirm that the correspondence holds or that the stability diagram is correctly imported.
Authors: The abstract does not name the rule for brevity, but the manuscript defines it as the standard memory-loss adaptive dynamics (payoff averaging at rate γ, strategy update proportional to payoff difference with competitiveness β). The dependence on γ and β emerges because these rescale the SK couplings and fields after the mapping. We will revise the abstract to name the rule and state that the mapping is obtained by direct substitution of the update into the potential. revision: yes
-
Referee: [Abstract] The weakest assumption noted in the reader report—that payoff matrices are drawn once and fixed while players follow an unspecified rule—directly affects the central claim; if the rule contains non-gradient or non-mean-field terms, the SK correspondence (and therefore the statements about unique vs. multiple fixed points vs. volatility) can fail while the qualitative narrative remains intact.
Authors: The fixed random payoffs are an explicit modeling choice stated in the introduction and methods. The rule employed is a mean-field gradient flow on the expected-payoff potential; no non-gradient or higher-order interaction terms are present. Consequently the SK correspondence holds rigorously within the stated assumptions, as verified by the explicit substitution in the derivations. We will add a sentence in the revised abstract confirming that the dynamics remain gradient-like. revision: partial
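The rule named in the responses above (payoff averaging at rate γ, then a smoothed best response with inverse temperature β) can be sketched for a single two-action player. This is only an illustration of that verbal description: the normalised moving average, the logit response, and the names `update` and `run` are assumptions, and the sketch has not been checked against the rule defined in the manuscript.

```python
import math


def update(q_plus, q_minus, u_plus, u_minus, gamma, beta):
    """One learning step for a player with actions +1 and -1.

    q_plus, q_minus: remembered payoffs of the two actions
    u_plus, u_minus: payoffs observed in this round
    gamma: memory-loss rate in (0, 1]; beta: inverse temperature
    """
    # Exponential moving average: old payoffs are discounted by (1 - gamma).
    q_plus = (1.0 - gamma) * q_plus + gamma * u_plus
    q_minus = (1.0 - gamma) * q_minus + gamma * u_minus
    # Logit (softmax) choice over two actions; m = P(+1) - P(-1).
    m = math.tanh(0.5 * beta * (q_plus - q_minus))
    return q_plus, q_minus, m


def run(steps=200, gamma=0.1, beta=2.0, u_plus=1.0, u_minus=0.0):
    """Iterate the update against constant payoffs and return the final m."""
    q_plus, q_minus, m = 0.0, 0.0, 0.0
    for _ in range(steps):
        q_plus, q_minus, m = update(q_plus, q_minus, u_plus, u_minus, gamma, beta)
    return m
```

Against constant payoffs the remembered payoff difference settles at u_plus - u_minus, so `run()` approaches tanh(β/2). In a many-player game the payoffs depend on the other players' mixed strategies, which is where the random couplings enter.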
Circularity Check
No significant circularity; SK mapping and stability conclusions imported from independent prior work
full rationale
The paper extends the SK game from Garnier-Brun, Benzaquen and Bouchaud (different authors) by adding general random bias and a grand-canonical version, then analyzes stability via the established SK spin-glass mapping. No equations, fitted parameters, or self-definitional reductions appear in the abstract or description; the reported dependence on memory-loss rate and competitiveness is presented as an outcome of the model analysis rather than input by construction. The central premise relies on an external citation whose validity is not reduced to a self-citation chain or ansatz smuggled from the present authors. This is a standard case of building on independent prior results, warranting a low score.
Reference graph
Works this paper leans on
-
[1]
found regimes for two-player games with many actions in which learning converges to a unique stable fixed point (albeit not a Nash equilibrium), phases in which there are many stable fixed points, and a regime in which learning remains chaotic and unpredictable. Garnier-Brun et al. [16] find broadly similar phases; additionally, they investigated cyclic or quasi-cyclic behaviour in the SK game
-
[2]
are restricted to that special case, and allowing for non-zero random fields changes the behaviour and phase diagram. We also introduce a version of the SK game in which some players can choose whether or not to participate in the game at any one time step, but where other players must participate at each step. Thus, the number of active players varies in...
-
[3]
Single-matrix game, no external field. We first focus on games where there is no background advantage to any action. This setup leads to the dynamics described in [16]. The background bias vanishes when $A^{+}_{ij} = -A^{-}_{ij}$ for all $i, j$, and we will assume this is the case throughout this subsection. We simply write $A_{ij}$ for this quantity. We then have th...
-
[4]
Bi-matrix version, random fields. We can also consider scenarios in which we do not impose $A^{+}_{ij} = -A^{-}_{ij}$ in Eq. (8). In this situation there can be an intrinsic bias to one of the two actions of player $i$ in the interaction with $j$. The quantity $B_{ij} = (A^{+}_{ij} + A^{-}_{ij})/2$ quantifies this bias (with positive values representing a bias towards action $+1$). Writin...
-
[5]
Nonetheless, the object $\partial m/\partial\eta$ [with $m(\eta)$ the solution of Eq. (21)] is well defined, and will generally be non-zero. If there is no external field ($\Delta = 0$), the stability condition Eq. (22) simplifies to $\sigma < 2\lambda - \Gamma\sigma^{2}\chi$. One can then show (see again the appendix) that the onset of instability occurs when $\lambda$ takes the value $\lambda_{\rm crit} = (1 + \Gamma)\sigma/2$, (23) with instability for $\lambda$...
-
[6]
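The simulations run for 20,000 steps with measurements taken from 10,000 steps onwards. [...] respect to $s$, we find $\frac{\partial F}{\partial s} = \sigma^{2}\left[\left\langle\left(\frac{\partial x}{\partial\eta}\right)^{2}\right\rangle + s\,\frac{\partial}{\partial s}\left\langle\left(\frac{\partial x}{\partial\eta}\right)^{2}\right\rangle\right]$. (36) The first term in the square bracket captures a direct effect of increasing the number of speculators, assuming the behaviour of all players remains unchanged. It indicates how strongly the activit...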
-
[7]
Generating-functional calculation. The generating function associated with the SK game with a field from Eq. (14) is given by $Z(\psi) = \int Dm\, D\hat{m}\, \exp\left[i\sum_i\int dt\, \hat{m}_i(t)\left(\frac{\dot{m}_i(t)}{1 - m_i(t)^{2}} + \lambda\ln\frac{1 + m_i}{1 - m_i}\right)\right] \times \exp\left(i\sum_i\int dt\, \hat{m}_i(t)\sum_j\left[A_{ij}m_j(t) + B_{ij}\right]\right)\exp\left(i\sum_i\int dt\, m_i(t)\psi_i(t)\right)$. (C1) Taking the disorder-average in Eq. (D1), we have: exp(i Σ_i ∫ ...
-
[8]
Linear stability analysis for SK game with random field. For clarity we denote quantities evaluated at the fixed point with a superscript $\star$ in this appendix. We follow the procedure of [15]. Linearising about the fixed point involves introducing small perturbations about $m^{\star}$ and $\eta^{\star}$, $m \to m^{\star} + \hat{m}$, $\eta \to \eta^{\star} + \hat{\eta}$. (C9) We obtain the linearised equations about the fix...
-
[9]
Case without random field. When there is no external field ($\Delta = 0$), the fixed point distribution becomes concentrated at $m^{\star} = 0$. Substituting this into Eq. (C15) the stability condition becomes $\sigma\,\frac{1}{2\lambda - \Gamma\sigma^{2}\chi} < 1$, (C16) which simplifies to $1 < 2\lambda/\sigma - \Gamma\sigma\chi$. (C17) We will now show $\chi = 1/\sigma$ at the stability boundary. We note the following relation, valid at the fixed point...
-
[10]
Approach based on eigenvalues of random matrices. One can also obtain the stability boundary for the SK game without random field using an approach based on the spectra of random matrices. This circumvents the DMFT analysis. The time evolution of the $\{m_i\}$ is given by $\frac{d}{dt} m_i = (1 - m_i^{2})\left[\sum_j A_{ij}m_j - \lambda\ln\frac{1 + m_i}{1 - m_i}\right]$. (C22) We now introduce the followi...
-
[11]
14: Plots of the eigenvalues of the Jacobian corresponding to $y_i$ at the fixed point for two identical SK games for different values of $\lambda$
Dynamic mean-field theory. The generating functional associated with the grand-canonical SK game is given by: $Z(\psi) = \int Dx\, D\hat{x}\, \exp\left[i\sum_i\int dt\, \hat{x}_i(t)\left(\frac{\dot{x}_i(t)}{x_i(t)(1 - x_i(t))} + \lambda\ln\frac{x_i}{1 - x_i}\right)\right] \times \exp\left(i\sum_i\int dt\, \hat{x}_i(t)\left[\sum_j^{N_s} A_{ij}x_j(t) + h_i\right]\right) \times \exp\left(i\sum_i\int dt\, x_i(t)\psi_i(t)\right)$. (D1) Averaging over the disorder we find exp(i Σ_i ∫ dt x̂_i(t) [Σ_j A_ij x_j(t) + h_i ...
-
[12]
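[...] $\left\langle\left(\frac{\partial x^{\star}}{\partial\eta^{\star}}\right)^{2}\right\rangle + s\,\frac{\partial}{\partial s}\left\langle\left(\frac{\partial x^{\star}}{\partial\eta^{\star}}\right)^{2}\right\rangle$ ] (D9) The second term is proportional to $s$, and can be neglected for sufficiently small $s$. We then have $\left.\frac{\partial F(s)}{\partial s}\right|_{s\to 0} \approx \sigma^{2}$...
Linear stability analysis for the grand-canonical game. We again carry out a linear stability analysis to obtain the stability criterion for the grand-canonical SK game. We find that fixed points are stable when $\sigma^{2}s\left\langle\left(\frac{x^{\star}(1 - x^{\star})}{\lambda - x^{\star}(1 - x^{\star})\,s\,\Gamma\sigma^{2}\chi}\right)^{2}\right\rangle = \sigma^{2}s\left\langle\left(\frac{\partial x^{\star}}{\partial\eta^{\star}}\right)^{2}\right\rangle < 1$. (D7) As shown in Fig. 10, when $\Gamma < 0$, increasing the fraction of speculators can destabi...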
-
[13]
J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior (Princeton University Press, Princeton, NJ, 1944)
-
[14]
C. Hauert and G. Szabó, American Journal of Physics 73, 405 (2005)
-
[15]
A. Traulsen and C. Hauert, Reviews of Nonlinear Dynamics and Complexity 2, 25 (2009)
-
[16]
J. F. Nash, Proceedings of the National Academy of Sciences 36, 48 (1950)
-
[17]
J. Berg and A. Engel, Phys. Rev. Lett. 81, 4999 (1998)
-
[18]
J. Berg and M. Weigt, Europhysics Letters 48, 129 (1999)
-
[19]
A. McLennan and J. Berg, Games and Economic Behavior 51, 264 (2005), special issue in honor of Richard D. McKelvey
-
[20]
C. Daskalakis, P. W. Goldberg, and C. H. Papadimitriou, Commun. ACM 52, 89–97 (2009)
-
[21]
Y. Sato, E. Akiyama, and J. D. Farmer, Proceedings of the National Academy of Sciences 99, 4748 (2002)
-
[22]
Y. Sato and J. P. Crutchfield, Phys. Rev. E 67, 015206(R) (2003)
-
[23]
Y. Sato, E. Akiyama, and J. P. Crutchfield, Physica D: Nonlinear Phenomena 210, 21 (2005)
-
[24]
T. Galla and J. D. Farmer, Proceedings of the National Academy of Sciences 110, 1232 (2013)
-
[25]
J. B. Sanders, J. D. Farmer, and T. Galla, Scientific Reports 8, 4902 (2018)
-
[26]
M. Pangallo, T. Heinrich, and J. Doyne Farmer, Science Advances 5, eaat1328 (2019)
-
[27]
M. Opper and S. Diederich, Physical Review Letters 69, 1616 (1992)
-
[28]
J. Garnier-Brun, M. Benzaquen, and J.-P. Bouchaud, Physical Review X 14, 021039 (2024)
-
[29]
D. Sherrington and S. Kirkpatrick, Physical Review Letters 35, 1792 (1975)
-
[30]
P. C. Menezes and D. Sherrington, Journal of Physics A: Mathematical and Theoretical 46, 505004 (2013)
-
[31]
R. M. May, Nature 238, 413 (1972)
-
[32]
D. Challet, A. De Martino, M. Marsili, and I. Perez Castillo, Journal of Statistical Mechanics: Theory and Experiment 2006, P03004 (2006)
-
[33]
C. Camerer and T.-H. Ho, Econometrica 67, 827 (1999)
-
[34]
A. Traulsen, T. Röhl, and H. G. Schuster, Phys. Rev. Lett. 93, 028701 (2004)
-
[35]
C. De Dominicis, Phys. Rev. B 18, 4913 (1978)
-
[36]
A. C. C. Coolen, The Mathematical Theory of Minority Games: Statistical Mechanics of Interacting Agents (Oxford University Press, Oxford UK, 2005)
-
[37]
T. Galla, arXiv preprint arXiv:2405.14289 (2024)
-
[38]
G. Bunin, Phys. Rev. E 95, 042414 (2017)
-
[39]
F. Roy, G. Biroli, G. Bunin, and C. Cammarota, Journal of Physics A: Mathematical and Theoretical 52, 484001 (2019)
-
[40]
L. Sidhom and T. Galla, Phys. Rev. E 101, 032101 (2020)
-
[41]
D. Challet, M. Marsili, and Y.-C. Zhang, Physica A: Statistical Mechanics and its Applications 276, 284 (2000)
-
[42]
B. Mandelbrot et al., Journal of Business 36, 394 (1963)
-
[43]
R. N. Mantegna and H. E. Stanley, Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge University Press, 1999)
-
[44]
H. Rieger, Journal of Physics A: Mathematical and General 22, 3447 (1989). [33] Data and codes: https://github.com/Desmondccw/complex_dynamics_in_sk_game
-
[45]
S. O’Rourke, D. Renfrew, A. Soshnikov, and V. Vu, Journal of Statistical Physics 160, 89–119 (2015)