Mom, I Need to Learn from My Prediction Errors

Yuri Barzov
Jul 23, 2018

Machine learning was inspired by neuroscience. The current revolution in neuroscience is, in turn, inspired by machine learning: it uses machine learning metaphors and algorithms to advance our understanding of how the brain works.

A new wave of machine learning, rising in the area of artificial general intelligence (AGI), now draws inspiration from the very branch of neuroscience that was inspired by AI.

Reverse engineering the human brain on the basis of machine learning may seem the most obvious route to AGI. But is it really so? The human brain changes with experience. A better understanding of the pivotal role that brain development plays during adolescence may lead to breakthroughs in both AI-inspired neuroscience and neuroscience-inspired AI.

A Unified Theory of the Brain Is Gaining Traction

The Predictive Mind, a book by the philosopher of mind Jakob Hohwy first published in 2013, explains in plain language the candidate unified theory of the brain introduced by Karl Friston, one of the most cited neuroscientists in the world. The brain, he writes, is essentially a hypothesis-testing mechanism, one that attempts to minimise the error of its predictions about the sensory input it receives from the world. The brain, therefore, receives its highest rewards not when it makes the right prediction but, on the contrary, when it detects a prediction error, because each prediction error increases the brain’s knowledge and allows it to form better hypotheses about the environment. Prediction error is a synonym for unexpected uncertainty, on which we base our solution.
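To make the idea of learning from prediction errors concrete, here is a minimal sketch in Python. It uses the simple delta rule (estimate plus learning rate times error) as a stand-in; it is not Friston's model, and the function name, the learning rate and the toy signal are all assumptions made for illustration.

```python
def learn_from_prediction_errors(observations, learning_rate=0.2, initial_estimate=0.0):
    """Track a signal with the delta rule (an illustrative stand-in, not Friston's model).

    Each step: predict, measure the prediction error, nudge the estimate towards
    what was actually observed.
    """
    estimate = initial_estimate
    errors = []
    for observed in observations:
        error = observed - estimate          # prediction error
        estimate += learning_rate * error    # learn only from the error
        errors.append(error)
    return estimate, errors


# Hypothetical toy world: the signal sits at 1.0, then jumps to 3.0 without warning.
world = [1.0] * 30 + [3.0] * 30
final_estimate, errors = learn_from_prediction_errors(world)

first_error = abs(errors[0])          # large: the world is still unknown
settled_error = abs(errors[29])       # tiny: nothing left to learn
error_after_change = abs(errors[30])  # large again: the world has changed
```

In this toy run the error shrinks while the world stays the same and jumps back up the moment the world changes, which is the sense in which a prediction error marks something new to learn.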

It turns out that the brain is an instrument for seeking truth (God, I always believed it was!). Prediction errors signal the need to explore the environment in order to avoid them in the future. In exploration mode the brain produces knowledge about the environment. In exploitation mode it uses that knowledge. Exploitation cannot be a purpose, only a means, because on its own it leads to decay. Efficient exploitation can slow decay but can’t reverse it. Only emergence can, because it resists the drift towards higher entropy described by the second law of thermodynamics. Exploration leads to the emergence of knowledge (information, truth). It makes life resist decay. That’s why exploitation must serve exploration for life to prevail.

New Findings: Exploration Leads to Cooperation, Exploitation Leads to Competition

A group of scientists from Harvard recently demonstrated a link between the exploration/exploitation modes of brain activity and the cooperation/competition modes of human behavior. They combined the evolutionary dynamics of cooperation with stochastic games. It turned out that players began to cooperate only when the volume of resources in their exploitation games started to change probabilistically depending on the players’ actions: cooperation resulted in more resources, while competition led to fewer resources in future games.

The discovery was that the seemingly obvious logic of cooperating for more resources in the future worked only if the amount of available resources changed randomly, and hence unexpectedly. By introducing uncertainty, the researchers were switching the players’ brains to exploration mode. It turned out that exploration mode fosters cooperation and unites, while exploitation mode leads to competition and divides.

In exploration mode the brain receives dopamine rewards only for detecting prediction errors, from which it learns to predict the future better and to fit its environment more optimally. No reward is available in exploitation mode because the brain treats it as an automated support function (only small rewards are available while the brain learns how to exploit). Several research papers support this notion, including a paper by Harvard researchers on adolescent cognitive development.
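The mechanics of such a game can be sketched in a few lines of Python. This is a toy under stated assumptions, not the researchers' model: the payoff values, the 0.9 transition probability, the fixed strategies and every name in the code are invented for illustration, and there is no evolutionary dynamics here at all.

```python
import random


def always_cooperate(rich):
    return True


def always_defect(rich):
    return False


def play(strategy_a, strategy_b, rounds=1000, benefit_rich=2.0, benefit_poor=0.5,
         cost=1.0, transition_prob=0.9, seed=0):
    """Toy two-state game; all numbers are illustrative assumptions.

    A cooperating player pays `cost` so that the other player receives the
    benefit of the current state. After each round the state probably moves
    to "rich" if both cooperated and to "poor" otherwise.
    """
    rng = random.Random(seed)
    rich = True
    total_a = total_b = 0.0
    rich_rounds = 0
    for _ in range(rounds):
        benefit = benefit_rich if rich else benefit_poor
        rich_rounds += rich
        a_coop = strategy_a(rich)
        b_coop = strategy_b(rich)
        total_a += (benefit if b_coop else 0.0) - (cost if a_coop else 0.0)
        total_b += (benefit if a_coop else 0.0) - (cost if b_coop else 0.0)
        # The players' actions change the resources available next round.
        if rng.random() < transition_prob:
            rich = a_coop and b_coop
    return total_a / rounds, total_b / rounds, rich_rounds / rounds


coop_a, coop_b, coop_rich = play(always_cooperate, always_cooperate)
defect_a, defect_b, defect_rich = play(always_defect, always_defect)
exploiter, exploited, mixed_rich = play(always_defect, always_cooperate)
```

With these made-up numbers, two cooperators keep the game in the rich state, while a defector drags it into the poor state and ends up earning less per round than it would have earned by cooperating.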

The Role of Unexpected Uncertainty in Learning and Exploration Gets Clearer

Children explore the environment using a relatively simple response learning strategy. It enables them to detect sufficiently long linear dependencies between stimuli and responses. At the same time, the brains of even eight-month-old infants react with increased activity to prediction errors, for instance to unexpected syllables in a random stream of more familiar syllables.

In adolescence the response (linear cues) learning strategy is supplemented with the spatial (associative) learning strategy, which enables teenage brains to encode non-linear dependencies between stimuli, including goals and landmarks that are more distant in time and space. In adulthood the two strategies either compete or cooperate only sequentially: when one is engaged, the other remains idle. In the adolescent brain both strategies can be engaged simultaneously, in parallel. Only fMRI of a teenage brain can show the hippocampus (the center of spatial learning) and the caudate nucleus (the center of response learning) activated at the same time. This is why teenagers are such good learners. They pay for their enhanced learning abilities with greater exposure to risk.

Teenagers seek the unexpected uncertainty (prediction errors) that their brains badly need to stay in exploration mode, but the world around them, especially in developed countries, is designed by adults predominantly for exploitation. Both learning strategies are linked to exploration mode. They deal with probabilities and with stochastic inferences. Exploration (for knowledge, for truth) sees the world as a non-stationary stochastic process (or as chaos at a stage when it is indistinguishable from randomness).

Understanding of Stability as Decay Solidifies

Exploitation (of the results of exploration), on the contrary, can be almost entirely deterministic, because it occurs in territory that has already been liberated from uncertainty by the exploring brain (or the civilization-wide brain network). This “territory” has signs on each corner, traffic lights at every crossing, and labels and instructions attached to all objects. Threats and rewards are strictly determined in this predictable world. The brain that is deprived of unexpected uncertainty in such a world suffers from a deficit of dopamine and tries to extract micro doses of dopamine from whatever surrogates are possible, starting with exploitation activities and ending with substance abuse and extreme.

Initially, surrogate learning based on deterministic causalities hijacks the caudate nucleus by depriving it of uncertainty. Afterwards, it forms habits that enable the organism (and the brain) to function automatically in exploitation mode. With ageing, exploitation mode totally blocks the exploration mode that it was originally designed to supply with resources, as some brain researchers demonstrate.

Life is powered by exploration (the search for truth), which secures emergence and better environmental fitness. Life slowly leaves a brain and an organism hijacked by exploitation mode. Slow decay replaces it. The engine has been shut off, but the car keeps moving by inertia. It’s movement, all right, but it’s only downhill movement. Then comes a full stop.

Jeff Bezos is absolutely right when he says that stasis doesn’t exist. Only degradation does. Death arrives after.

The Tipping Point of Cognitive Development in Adolescence

There is mounting evidence that this vicious cycle begins in adolescence. The availability of unexpected uncertainty in adolescence shapes the curve of cognitive development and decline for the rest of our lives. Teenagers are crossing the tipping point in their cognitive transformation: they either develop a stable inclination towards exploration, or they don’t and begin to drift towards exploitation mode with no alternative. Autism, depression and loneliness may be nothing but signs of the resistance of the exploration-seeking adolescent brain to the totalitarian pressure of an adult world dominated by exploitation.

Researchers show that teenage brains receive most of their dopaminergic rewards from prediction errors, which drive their heightened reward-seeking behavior. At the very tipping point that lays the foundation for the rest of their lives, teenagers are desperately seeking the one thing that their environment can’t deliver: unexpected uncertainty. They take life-threatening risks and try substance abuse in pursuit of it. Now that we know the reason, a scientifically sound, compelling and safe way of satisfying their need should be developed.

The situation is unique because right now teenagers are exploring the new frontiers of connected life and eagerly experimenting with every online and virtual novelty in their never-ending quest for unexpected uncertainty. Mostly, what they find is just another ‘satisfaction guaranteed’ type of product or medium.

Now that we know a theoretical answer, the challenge that remains is to create an app, a game or any other environment that generates authentic unexpected uncertainty. It is impossible to fool the Bayesian ideal observer in our brains. Therefore the unexpected uncertainty in a virtual world has to be real.
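Why a scripted surprise stops working can be illustrated with a very small Bayesian observer in Python. This is only a sketch under assumptions: the observer is a first-order model with Laplace-smoothed transition counts (far simpler than an ideal observer), surprise is measured in bits as the negative log probability of each outcome, and a seeded pseudo-random stream merely stands in for a source the observer cannot model. All names are hypothetical.

```python
import math
import random


def surprise_trace(sequence):
    """Surprise (in bits) of each symbol for an observer that learns which
    binary symbol tends to follow which. Counts start at 1 (Laplace smoothing).
    """
    counts = {0: [1, 1], 1: [1, 1]}   # counts[previous][next]
    surprises = []
    for previous, current in zip(sequence, sequence[1:]):
        row = counts[previous]
        probability = row[current] / (row[0] + row[1])
        surprises.append(-math.log2(probability))
        row[current] += 1             # update beliefs after seeing the outcome
    return surprises


def mean(values):
    return sum(values) / len(values)


# A scripted "twist" that repeats: 0, 1, 0, 1, ...
scripted = [0, 1] * 100

# Stand-in for a source with no learnable script (seeded only for repeatability).
rng = random.Random(0)
unscripted = [rng.randint(0, 1) for _ in range(200)]

late_scripted = mean(surprise_trace(scripted)[-50:])
late_unscripted = mean(surprise_trace(unscripted)[-50:])
```

In this sketch the observer's surprise at the repeating pattern fades towards zero once the pattern is learned, while the stream it cannot model keeps producing surprise on every step.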

Artificial Intelligence May Bring a Solution to Retain Humanness and Cognitive Fitness for People

Recent advances in the field of artificial intelligence promise that it will soon become possible to create artificial agents that are able to produce a sufficient level of unexpected uncertainty.

The emergence of individuality in artificial agents appears to be feasible based on the same approach that neuroscientists use to understand human learning strategies.

Google DeepMind recently introduced the Generative Query Network (GQN), which generates predictive models of 3D virtual spaces and renders views of those spaces from different perspectives by minimizing prediction error. As Quanta Magazine reports, researchers from DeepMind were inspired by the unified brain theory of Karl Friston.

A group of researchers from the Montreal Institute for Learning Algorithms (MILA), led by deep learning guru Yoshua Bengio, presented another highly promising solution: the Bottleneck Simulator, an implementation of a model-based deep reinforcement learning approach.

Useful Links

The neuroscience of adolescent decision-making

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4671080/#!po=28.7500

Under the hood of statistical learning: A statistical MMN reflects the magnitude of transitional probabilities in auditory sequences

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4735647/#!po=5.95238

Cognitive flexibility in adolescence: Neural and behavioral mechanisms of reward prediction error processing in adaptive decision making during development

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4330550/#__ffn_sectitle

The Neural Representation of Unexpected Uncertainty During Value-Based Decision Making

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4885745/#!po=6.86275

Gray Matter Differences Correlate with Spontaneous Strategies in a Human Virtual Navigation Task

http://www.jneurosci.org/content/27/38/10078

Conditional Generative Adversarial Nets

https://arxiv.org/abs/1411.1784

Universal Darwinism As a Process of Bayesian Inference

https://www.frontiersin.org/articles/10.3389/fnsys.2016.00049/full

Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis

http://www.pnas.org/content/108/Supplement_3/15647

Social learning through prediction error in the brain

https://www.nature.com/articles/s41539-017-0009-2

A unique adolescent response to reward prediction errors

https://www.nature.com/articles/nn.2558
