Rock-Paper-Scissors

Yuri Barzov
May 1, 2021 · 6 min read

A mathematical model of just one neuron, a plain logistic regression with only two parameters (built on the same logistic function that appears in Lotka-Volterra-type population models), can predict earthquake aftershocks with the same or higher accuracy than a deep neural network with six hidden layers of 50 neurons each, processing over 13 thousand parameters.
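
For intuition, here is a minimal sketch of such a two-parameter model; the input feature and the fitted values below are placeholders for illustration, not the actual inputs or weights from Mignan and Broccardo's paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def aftershock_probability(feature, w, b):
    """One 'neuron': logistic regression with exactly two parameters,
    a weight w and a bias b, applied to a single scalar input."""
    return sigmoid(w * feature + b)

# Hypothetical parameter values, for illustration only.
w, b = 1.7, -2.3
print(aftershock_probability(np.array([0.5, 1.0, 2.0]), w, b))
```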

The story caused quite a stir in specialist circles in 2018–19. Since then, I have been constantly asking myself: do modern artificial neural networks carry redundant functionality that obscures the basic principle of their operation?

It seems to me that the simple ancient game of rock-paper-scissors helps reveal exactly such a fundamental principle: learning from chaos.

A player in rock-paper-scissors can be represented as a heteroclinic network of three phase states: rock (R), scissors (S), and paper (P). The possible transitions between these states (the variants of moves) are the network's edges, indicated by arrows, each carrying its own probability.

Each new move in this game does not depend on the previous one, so the graph of the player's phase states is a classic Markov chain, or a Crutchfield epsilon machine.

If all moves (phase states) are equally probable, the player behaves completely randomly. An assignment of probabilities to the moves is called a strategy.
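
To make this concrete, here is a minimal sketch of a memoryless player whose strategy is just a probability vector over the three moves (the biased strategy below is a hypothetical example):

```python
import numpy as np

MOVES = ["R", "P", "S"]

def play(strategy, n_rounds, seed=0):
    """A memoryless player: every move is an independent draw from the
    same distribution, i.e. a Markov chain whose transition matrix has
    identical rows equal to `strategy`."""
    rng = np.random.default_rng(seed)
    return rng.choice(MOVES, size=n_rounds, p=strategy)

uniform = [1/3, 1/3, 1/3]      # the completely random strategy
biased = [0.5, 0.25, 0.25]     # a hypothetical biased strategy
print(play(uniform, 10))
```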

If the first player plays this way and a second player appears, then whatever strategy the second chooses, he will not be able to beat the first over a long series of attempts. Conversely, in order not to lose, the second player must adhere to the same strategy as the first.

Thus, the choice of the strategy in which all moves are equally probable leads to the achievement of a Nash equilibrium. Named after John Forbes Nash, who formulated it in his doctoral dissertation, the equilibrium is a set of strategies from which no player can gain by deviating unilaterally; in a noncooperative game, sticking to it allows you to avoid losing over a large number of attempts.
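
A quick numerical check of this, assuming the usual payoff convention (win = 1, loss = -1, draw = 0):

```python
import numpy as np

# Row player's payoff matrix, moves ordered rock, paper, scissors:
# win = 1, loss = -1, draw = 0.
A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]])

uniform = np.full(3, 1/3)

def expected_payoff(x, y):
    """Expected payoff of strategy x against strategy y: x^T A y."""
    return x @ A @ y

# Against the uniform strategy every reply earns exactly 0 on average,
# so no unilateral deviation helps: (uniform, uniform) is a Nash equilibrium.
for x in (uniform, np.array([1.0, 0.0, 0.0]), np.array([0.2, 0.3, 0.5])):
    print(expected_payoff(x, uniform))   # 0.0 every time (up to rounding)
```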

As long as we have only one player, he constantly changes his states; in other words, he fluctuates, or oscillates, in a completely random way.

The appearance of a second player changes nothing in these oscillations. Moreover, the second player begins to oscillate just as unpredictably as the first.

If a win pays 1, a loss -1, and a draw 0, then each player wins exactly as much as the other loses. This is a zero-sum game.

If a win pays 1 and a loss -1, but a draw can give one player the opportunity to win more than the other loses, then we have a non-zero-sum game. This option is possible if the rules state that the payoff players receive in case of a draw ranges from 0 to 0.5.
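
A sketch of how such payoff matrices look; the draw payoffs below are illustrative values, not the ones used in the papers. The game is zero-sum exactly when one player's matrix is the negative transpose of the other's:

```python
import numpy as np

def rps_payoff(eps):
    """Payoff matrix with win = 1, loss = -1, and draw payoff eps
    on the diagonal (eps = 0 recovers the ordinary game)."""
    return np.array([[eps, -1.0, 1.0],
                     [1.0, eps, -1.0],
                     [-1.0, 1.0, eps]])

A = rps_payoff(0.25)   # player X's draw payoff (hypothetical value)
B = rps_payoff(0.0)    # player Y's draw payoff

# The game is zero-sum only if A = -B^T, i.e. the draw payoffs cancel.
print(np.allclose(A, -B.T))                                # False: non-zero-sum
print(np.allclose(rps_payoff(0.3), -rps_payoff(-0.3).T))   # True: zero-sum
```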

I want to dwell on this point especially, because Yuzuru Sato and his colleagues focused precisely on the non-zero sum of the game in their work; besides pushing the result away from zero, such a step also adds a great deal of variability. Recall the work of Vanchurin and Katsnelson on the emergence of quantum dynamics in an artificial neural network at learning equilibrium: there, too, the authors added variability to the stochastic (random) dynamics of the hidden layer, on whose output the network's output layer was learning.

Now let’s move on to learning. If we base it on the simple principle that the payoff serves as reinforcement for adjusting each player's probabilities of choosing moves, then a system of coupled replicator equations turns out to be well suited to modelling such a learning process mathematically.

Here, too, emphasis is required. The replicator equations of game theory are mathematically equivalent to the Lotka-Volterra equations of evolutionary theory; indeed, the replicator equations are an adaptation of the Lotka-Volterra equations to game theory (an n-strategy replicator system can be transformed into an (n-1)-species Lotka-Volterra system).
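
For reference, a sketch of that correspondence, following Hofbauer and Sigmund (ref. 7):

```latex
% Replicator dynamics for strategy frequencies x_i under payoff matrix A:
\dot{x}_i = x_i\left[(Ax)_i - x^{\top} A x\right], \qquad i = 1,\dots,n

% Lotka-Volterra dynamics for population densities y_i:
\dot{y}_i = y_i\Big(r_i + \sum_{j=1}^{n-1} b_{ij}\, y_j\Big), \qquad i = 1,\dots,n-1

% Hofbauer's change of variables y_i = x_i / x_n maps the n-strategy
% replicator system onto an (n-1)-species Lotka-Volterra system.
```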

The emphasis is important because we are dealing here with the same processes of winnerless competition, which, once again, are simply called by a different name. It is precisely the coupled replicator equations that Sato and James Crutchfield used to describe the learning process of two players (two heteroclinic networks in phase space) playing rock-paper-scissors.
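
A minimal numerical sketch of these coupled replicator equations (the continuous-time form from ref. 4, omitting the learning-rate and memory-loss terms the paper also studies; the draw payoffs are illustrative):

```python
import numpy as np

def rps_payoff(eps):
    """RPS payoffs with win = 1, loss = -1 and draw payoff eps."""
    return np.array([[eps, -1.0, 1.0],
                     [1.0, eps, -1.0],
                     [-1.0, 1.0, eps]])

A = rps_payoff(0.1)    # player X's payoffs (illustrative epsilon)
B = rps_payoff(-0.05)  # player Y's payoffs; not the negative of X's => non-zero-sum

def vector_field(x, y):
    """Coupled replicator flow: a move's probability grows when it
    earns more than the player's current average payoff."""
    dx = x * (A @ y - x @ A @ y)
    dy = y * (B @ x - y @ B @ x)
    return dx, dy

# Crude Euler integration from a slightly asymmetric starting point.
x = np.array([0.40, 0.30, 0.30])
y = np.array([0.30, 0.40, 0.30])
dt = 0.01
for step in range(100_000):
    dx, dy = vector_field(x, y)
    x, y = x + dt * dx, y + dt * dy
    x, y = x / x.sum(), y / y.sum()  # guard against drift off the simplex
    if step % 20_000 == 0:
        print(step, x.round(3), y.round(3))
```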

The result turned out to be very interesting and painfully familiar. Applying the system of learning equations to a zero-sum game led to the players not achieving Nash equilibrium but creating genuine deterministic chaos, without attractors.

Although the averages over a zero-sum series of games did not differ from the Nash equilibrium (pure randomness), the deviations from those averages in some stretches of play became much stronger. Randomness became, so to speak, much more varied.

With a deviation from the zero sum, the dynamics of the game became even more interesting. Let’s give the floor to Sato: “When the zero-sum condition is violated we observe other complicated dynamical behaviors, such as heteroclinic orbits with chaotic transients.” Naturally, we are talking about orbits in phase space. This is winnerless competition dynamics.

Two neurons can learn from each other while having access to the same source of chaos at the action-perception level, without directly exchanging any information with each other at all. It turns out that chaos becomes a medium (a transmission channel, or a source of information) once you add variety to it.

To figure this out, we needed only two neurons, each of which is a simple network of three states in phase space, plus dynamics assigned to them by a simple system of nonlinear equations, the kind of equations that, according to Eugene Wigner, indicate the presence of life. From now on I will call them learning equations.

Curiously, dynamics similar to the wave function in quantum physics can, under certain circumstances, arise in an evolving Lotka-Volterra system when it reaches stability.

A paper with an example of such system behavior was published in 2016. Djordje Minic of Virginia Tech and Sinisa Pajevic of the US National Institutes of Health suggested that explaining how stability forms in nonlinear processes in living nature requires a theory of quantum-like processes.

After all, the emergence of stability in complex adaptive systems (such as Lotka-Volterra) does not correspond to the classical theories of stability at all, but fits perfectly into a Schrödinger equation with a “Planck constant” that takes a different value than in quantum physics and depends on the system.
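
Schematically, the claim is that the emergent stable state obeys an equation of the Schrödinger form, with an effective constant in place of Planck's (a sketch of the general form, not the specific Hamiltonian from Minic and Pajevic's paper):

```latex
i\,\hbar_{\mathrm{eff}}\,\frac{\partial \psi}{\partial t} = \hat{H}\,\psi,
\qquad \hbar_{\mathrm{eff}} \neq \hbar \quad \text{(emergent and system-dependent)}
```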

That is why the authors emphasize that the theory they propose describes a new type of emergent stability arising in biological systems; they do not claim that ordinary quantum physics applies at the macro level.

Here I cannot help seeing an analogy with the wave function that arises in Vanchurin and Katsnelson’s neural networks when a network reaches equilibrium in learning.

Eugene Wigner gave a very interesting definition of the wave function: the wave function of an object is a mathematical concept composed of a countable infinity of numbers containing all possible knowledge concerning that object. “If one knows these numbers, one can foresee the behavior of the object as far as it can be foreseen. More precisely, the wave function permits one to foretell with what probabilities the object will make one or another impression on us either directly, or indirectly.”
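
In standard notation, the “probabilities of impressions” Wigner speaks of are given by the Born rule: for an object in state ψ, the probability of an observation associated with state φ is

```latex
p(\varphi \mid \psi) = \bigl|\langle \varphi \mid \psi \rangle\bigr|^{2}
```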

Defined this way, the wave function conveys the distribution of meaning without a distinction between the quantum (observable) and ordinary (observing) worlds. The border between subjective perception and ordinary objective reality exists, but it can be moved almost arbitrarily, because the ordinary world is a virtual reality: the best attempt of our intelligence to make sense of our subjective impressions of observing the wave function.

References:

  1. Mignan, A. (2019). Forecasting aftershocks: Back to square one after a Deep Learning anticlimax. Temblor. https://doi.org/10.32858/temblor.053
  2. Mignan, A., & Broccardo, M. (2019). One neuron versus deep learning in aftershock prediction. Nature, 574, E1–E3. https://doi.org/10.1038/s41586-019-1582-8
  3. Sato, Y., Akiyama, E., & Farmer, J. D. (2002). Chaos in learning a simple two-person game. Proceedings of the National Academy of Sciences, 99(7), 4748–4751. https://doi.org/10.1073/pnas.032086299
  4. Sato, Y., & Crutchfield, J. P. (2003). Coupled replicator equations for the dynamics of learning in multiagent systems. Physical Review E, 67(1 Pt 2), 015206. https://doi.org/10.1103/PhysRevE.67.015206
  5. Nash, J. F. (1950). Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1), 48–49. https://doi.org/10.1073/pnas.36.1.48
  6. Katsnelson, M. I., & Vanchurin, V. (2020). Emergent quantumness in neural networks. arXiv:2012.05082 [quant-ph]
  7. Hofbauer, J., & Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge University Press. https://doi.org/10.1017/CBO9781139173179
  8. Minic, D., & Pajevic, S. (2016). Emergent “quantum” theory in complex adaptive systems. Modern Physics Letters B, 30(11), 1650201. https://doi.org/10.1142/S0217984916502018
  9. Wigner, E. P. (1995). Remarks on the mind-body question. In J. Mehra (Ed.), Philosophical Reflections and Syntheses. The Collected Works of Eugene Paul Wigner, vol. B/6. Springer, Berlin, Heidelberg.
