Cockroaches, Deep Learning and the Fallacy of Veridicality

Breaking the curse of overexploitation

Oded Berger-Tal et al., 2015, The Exploration-Exploitation Dilemma: A Multidisciplinary Framework.

When we look at the current stage of Earth’s exploitation by humans, we intuitively realise that the above observation of Oded Berger-Tal et al. can be relevant to humanity as a whole. We can also estimate the current phase of our civilization’s lifespan from the balance of its exploration and exploitation activities. The belief in the veridicality of our model of the world is itself a clear indication of the knowledge exploitation phase. The world is constantly changing, and we can’t catch up with it without constant exploration. A learning investment of zero can be justified by the belief that we already know the truth. As soon as we start believing we know the truth, the truth escapes from us.

Given the recent advances in supervised deep learning and its broadening acceptance, it is easy to conclude that deep learning is bringing the end of our civilization even closer as it scales up the exploitation of old knowledge. It is less evident, but deep learning may in fact be our only hope, because deep networks were created by nature predominantly for exploration and the establishment of new knowledge.

In this essay I share the findings of several teams of researchers suggesting that deep neural networks (DNNs) or similar ‘lattices of receptors’ placed between stimulus and response may represent one of the most universal and important building blocks of life, one that supports the entire knowledge establishment process.

I’ve also put together some examples suggesting that the nonlinear dynamics which spontaneously emerge in deep networks (of either neurons or receptors) under certain conditions may represent the most fundamental mechanism of real-life learning in real time. Their extreme sensitivity to initial conditions provides for the most rapid and vast increase of information.

Finally, I’ve collected some examples of frameworks, mathematical models and engineering approaches that can help to elaborate a solution refocusing deep learning from exploitation to exploration. Such a solution may mark the emergence of an entirely new generation of AI systems capable of dynamic real-life learning in real time.

The exploration-exploitation dilemma

After giving credit to the reinforcement learning (RL) solutions “based on a Bayesian modeling approach where the agent’s decisions are the product of a weighted average of some prior knowledge regarding the environment and current sampling information, and the agent’s need to explore is directly based on its perception of the environment, growing whenever the environment changes… due to the fact that uncertainty should promote exploration in an attempt to reduce it, and indeed there is evidence that surprising events and changes to the environment promote animals to learn faster,” the authors conclude that RL solutions “are also very mechanistic in nature and are, in many cases, specifically tailored to solve certain tasks, such as passing through mazes, with no attention given to the general motivation and ecological background of the subject. In other words, the above mentioned models have concentrated on the how rather than on the why of the decision-making process.”

Their model “depicts a subject that can invest in energy acquisition (exploitation) or knowledge acquisition (exploration), according to a strategy that represents the proportion of time the subject invests in knowledge acquisition as a function of time along its lifetime.” While they “focus on the optimal exploration-exploitation strategies at different stages of a subject’s life-span,” we propose to explore whether a similar dynamical model can be applied to much shorter periods of time.
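To make the idea of a time-dependent strategy concrete, here is a minimal sketch. It is not the authors' model: the function name, the additive knowledge update and the payoff rule are all assumptions chosen for illustration. A strategy is simply a function that returns the fraction of each time step spent on exploration.

```python
def lifetime_energy(strategy, lifespan=100, learn_rate=0.1, initial_knowledge=1.0):
    """Total energy gathered over a lifetime in a toy model.

    Assumptions (not taken from the cited paper):
    - strategy(t) returns the fraction of step t spent exploring, in [0, 1];
    - exploring adds knowledge at a constant rate;
    - exploiting yields energy proportional to current knowledge.
    """
    knowledge = initial_knowledge
    energy = 0.0
    for t in range(lifespan):
        p = strategy(t)
        energy += (1.0 - p) * knowledge   # exploitation pays at the current knowledge level
        knowledge += p * learn_rate       # exploration raises future payoffs
    return energy


never_explore = lifetime_energy(lambda t: 0.0)
always_explore = lifetime_energy(lambda t: 1.0)
explore_early = lifetime_energy(lambda t: 1.0 if t < 30 else 0.0)

print(never_explore, always_explore, explore_early)
```

In this toy setting pure exploitation collects a flat income, pure exploration collects nothing, and a strategy that explores first and exploits later does better than both. The same function could be called with a short `lifespan` to play with the much shorter time scales proposed above.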

Let’s now see how bacteria, cockroaches and worms establish knowledge.

How do cockroaches explore?

“What is remarkable about the cockroach is not simply that it has survived so long but that it has done so with a singularly simple and seemingly suboptimal mechanism: It moves in the opposite direction of gusts of wind that might signal an approaching predator. This “risk management structure” is extremely coarse; it ignores a wide set of information about the environment — visual and olfactory cues, for example — which one would think an optimal risk-management system would take into account,” a Business Insider article states.

This example raises a simple question: what is optimal from the evolutionary point of view? The fittest survive, indeed, but do the fittest optimise their perceptions to have the most accurate model of the environment or do they optimise their policies to achieve the most desired outcomes?

First, single out a vital parameter

Cockroaches are not the only champions in sensing a particular vital cue. “Cyanobacteria in the oceans are among the world’s most important oxygen producers and carbon dioxide consumers. Synechocystis is a spherical single-celled cyanobacteria” that has been known for over a century for its ability to move towards light. But how a tiny bacterium can sense where to move remained unclear until Nils Schuergers et al. discovered that “Synechocystis cells do not respond to a spatiotemporal gradient in light intensity, but rather they directly and accurately sense the position of a light source.”

“We show,” they explain in their paper, “that directional light sensing is possible because Synechocystis cells act as spherical microlenses, allowing the cell to see a light source and move towards it. A high-resolution image of the light source is focused on the edge of the cell opposite to the source, triggering movement away from the focused spot. Spherical cyanobacteria are probably the world’s smallest and oldest example of a camera eye.”

The bacterium, of course, doesn’t recognise the high-resolution image on its edge. It only reacts to the spot of light, inferring the direction in which to move from the spot’s location. A very elegant solution, isn’t it?

Utilitarian model of the world instead of objective model

All we need in the phase of knowledge exploitation is, indeed, an interface. Using this interface, cockroaches should run straight in the direction opposite to the wind. They would all be dead by now if they did so, however. A single trajectory is too easy for a predator to predict. Cockroaches are still alive because they can behave unpredictably. They use a utilitarian model of the world optimised for policies instead of an objective model optimised for accuracy.

Hypersensitivity to vital parameters

A bacterium senses a very small change in the environment. It detects the gradient of the change very rapidly. It upscales a tiny chemical impulse into a “chain reaction” that results in a much stronger action: to move in the direction of gradient increase of a detected favorable stimulus or in the direction of gradient decrease of an unfavorable stimulus. Normally it moves around in very short strolls. When it detects the gradient, it accelerates for longer periods of time in the desired direction.
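The behaviour described above, short random strolls that become longer runs when things improve, can be sketched as a biased random walk. This is a one-dimensional toy under my own assumptions: the function name, the tumble probabilities and the rule that the attractant concentration simply rises with `x` are all hypothetical and not taken from the cited research.

```python
import random


def run_and_tumble(steps=1000, tumble_up=0.1, tumble_down=0.5, seed=0):
    """Final position of a 1-D toy 'bacterium' after a number of steps.

    Assumption: the favorable stimulus grows with x, so a step to the right
    is a step up the gradient. The cell tumbles (re-draws its heading at
    random) rarely while conditions improve and often while they worsen.
    """
    rng = random.Random(seed)
    position = 0
    direction = rng.choice((-1, 1))
    for _ in range(steps):
        position += direction
        improving = direction > 0
        p_tumble = tumble_up if improving else tumble_down
        if rng.random() < p_tumble:
            direction = rng.choice((-1, 1))  # a tumble picks a fresh random heading
    return position


biased = run_and_tumble()
unbiased = run_and_tumble(tumble_up=0.5, tumble_down=0.5)
print(biased, unbiased)
```

The cell never computes where the source is. It only compares "better" with "worse" and changes how long it keeps going, yet the biased walker drifts steadily up the gradient while the unbiased one wanders around its starting point.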

Scientists from Cornell University discovered long ago that bacteria create a lattice of receptors on their surface to increase sensitivity and to amplify the signal. The description of how the lattice works reminds me very much of the description of how nonlinear dynamics emerge in a deep neural network with randomly assigned initial connection weights.

Vast variety of responses to tiny changes in vital parameters

Might such deviations from the optimal policy have an evolutionary rationale as well?

In fact, they do. “In cockroaches, wind evokes strong terrestrial escape responses in Periplaneta americana and Blattella germanica, but only weak escape responses in Blaberus craniifer and no escape responses in Gromphadorhina portentosa,” Claire A. McGorry et al. state in their paper. Their research showed that all four cockroach species possess wind-sensitive interneurons which provide input to the premotor/motor neurons of the insects irrespective of their behavioral response to wind. Hence the reason for the different policies is not anatomical. Does it mean that different species have different response strategies to a threat, or do they classify wind gusts differently, as a big, moderate or non-existent threat?

Either way, the variety of cockroaches’ responses to a single stimulus is huge. What is this variety for? Let’s hypothesize.

According to Ashby’s law of requisite variety, only “variety can destroy variety.” It means that an organism can survive in an environment only if its repertoire of responses is at least as wide as the repertoire of environmental challenges. Complexity of challenges requires complexity of responses.
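A small counting experiment can illustrate the law. The sketch below is built on assumptions of my own: the function name and the outcome rule `(d + r) % n` are hypothetical, chosen only so that each response maps different disturbances to different outcomes. It tries every possible policy by brute force and reports the smallest number of distinct outcomes a regulator can achieve.

```python
from itertools import product


def min_outcome_variety(n_disturbances, n_responses):
    """Smallest number of distinct outcomes any policy can achieve.

    Toy world (an assumption): disturbance d met with response r produces
    outcome (d + r) % n_disturbances. A policy assigns one response to
    each disturbance; all policies are enumerated.
    """
    best = n_disturbances
    for policy in product(range(n_responses), repeat=n_disturbances):
        outcomes = {(d + r) % n_disturbances for d, r in enumerate(policy)}
        best = min(best, len(outcomes))
    return best


for k in (1, 2, 3, 6):
    print(k, "responses ->", min_outcome_variety(6, k), "outcomes at best")
```

With six disturbances, one response leaves six outcomes, two responses leave three, three leave two, and only six responses can hold the outcome to a single value. In this toy world the outcome variety cannot be pushed below the disturbance variety divided by the response variety.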

Motion picture, not a still-life

Chaos — learning emerges from sensitivity to initial conditions

“In a standard RNN-model, there is a constant link between neuron one and neuron two, defining how strongly the activity of neuron one influences the activity of neuron two”, says Ramin Hasani. “In our novel RNN architecture, this link is a nonlinear function of time.”
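The sensitivity to initial conditions named in this section's heading can be shown in a few lines. The logistic map used here is a textbook chaotic system that I chose for illustration; it is not the architecture described in the quote, and the function name and starting values are arbitrary.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-10)   # same start, nudged by one part in ten billion

gaps = [abs(x - y) for x, y in zip(a, b)]
print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
```

The two orbits start out indistinguishable and end up unrelated. Seen the other way round, the later states of the system reveal digits of the starting point that no measurement at the outset could have resolved, which is one way to read the claim that chaos rapidly increases information.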

Deep Temporal Models and Active Inference

“The deep temporal aspect of these models means that evidence is accumulated over nested time scales, enabling inferences about narratives (i.e., temporal scenes). We illustrate this behaviour with Bayesian belief updating — and neuronal process theories — to simulate the epistemic foraging seen in reading. These simulations reproduce perisaccadic delay period activity and local field potentials seen empirically.”

The Anatomy of Inference: Generative Models and Brain Structure

“Generative models that evolve continuous time or discrete time likely coexist in the brain, mirroring the processes generating sensory data. While, at the level of sensory receptors, data arrive in continuous time, they may be generated in a sequential, categorical manner at a deeper level of hierarchical structure. For example, a continuous model may be necessary for low level auditory processing, but language processing depends upon being able to infer discrete sequences of words (which may themselves make up discrete phrases or sentences).”

Active inference, communication and hermeneutics

“In our previous paper, we focused on the dynamical phenomena that emerge when two dynamical systems try to predict each other. Mathematically, this dynamical coupling is called generalised synchrony (aka synchronisation of chaos).”
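As a hedged illustration of synchronised chaos, here is a sketch with two coupled logistic maps. This is my own toy, not the model from the quoted paper, and it shows identical synchronisation, which is the simplest special case of generalised synchrony. The function name, the coupling value and the starting points are assumptions.

```python
def coupled_logistic(x, y, coupling=0.4, steps=200, r=4.0):
    """Two chaotic logistic maps, each partly driven by the other.

    Returns the list of |x - y| gaps, one per time step.
    """
    def f(v):
        return r * v * (1.0 - v)

    gaps = [abs(x - y)]
    for _ in range(steps):
        fx, fy = f(x), f(y)
        # each system blends its own next state with its partner's
        x, y = (1 - coupling) * fx + coupling * fy, (1 - coupling) * fy + coupling * fx
        gaps.append(abs(x - y))
    return gaps


gaps = coupled_logistic(0.3, 0.9)
print("gap at start:", gaps[0])
print("gap at end:  ", gaps[-1])
```

With this coupling the gap shrinks by at least a fifth on every step, so the two systems lock onto the same trajectory even though each, left alone, would amplify tiny differences as in any chaotic map.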

Discrete Sequential Information Coding: Heteroclinic Cognitive Dynamics

“The hierarchical sequential segmentation of information into discrete events — patterns — is a fundamental intrinsic feature of brain dynamics. This concept has been used to design top-down explanations for brain activity on the view that the brain infers causes of its sensory input (Kiebel et al., 2009; Friston et al., 2011). In this setting, hierarchical sequential dynamics in general — and stable heteroclinic channels in particular — have been used as the basis of generative models for the Bayesian brain. We discuss here an adequate mathematical approach that is applicable for the description and prediction of consciousness, emotion, and human behavioral activity.”

“We would like to end with a remark on the popular view that brain computational models need to be extremely high dimensional to be predictive. This view is based on the fallacy that computational dimension is related to the complexity of the brain itself as a “hardware” system with different interacting spatial scales from which cognition emerge. Such modeling is unfeasible yet, as the brain remains only partially observable. However, we may not need it to explain key aspects of cognitive processes because we are talking about mind dynamics with finite resources, i.e., specific kinds of brain activity such as attention, memory retrieval, decision making, etc. A top-down mathematical model of such processes can be built using the following dynamical principles that we discussed above: (i) clusterization the neural activity in space and time and formation of information patterns; (ii) discrete sequential information coding; (iii) robust sequential coordinated dynamics based on heteroclinic chains of metastable clusters; and (iv) sensitivity of such sequential dynamics to intrinsic and external informational signals. These principles open a new direction for the understanding of the observed brain dynamics and the creation of the basis of a mathematical theory of consciousness.”

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach

“…we propose a model-based RL method based on learning an approximate, factorized transition model. The approximate transition model involves discrete, abstract states acting as information bottlenecks, which mediate the transitions between successive full states. Once learned, the approximate transition model is then applied to learn the agent’s policy (for example, using Q-learning with rollout simulations). This method has several advantages. First, the factorized model has significantly fewer parameters compared to a non-factorized transition model, making it highly sample efficient. Second, by learning the abstract state representation with the specific goal of obtaining an optimal policy (as opposed to maximizing the transition model’s predictive accuracy), it may be possible to trade-off some of the transition model’s predictive power for an improvement in the policy’s performance. By grouping similar states together into the same discrete, abstract state, it may be possible to improve the performance of the policy learned with the approximate transition model.”
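The claim about parameter counts can be checked with simple arithmetic. The sketch assumes a tabular setting and one possible factorization, in which a full state and an action are first mapped to an abstract state and the abstract state then predicts the next full state. This layout and the function names are my assumptions and may differ from the exact model in the paper.

```python
def full_table_size(n_states, n_actions):
    """Entries in a non-factorized table P(s' | s, a)."""
    return n_states * n_actions * n_states


def bottleneck_table_size(n_states, n_actions, n_abstract):
    """Entries in a factorized model with an abstract bottleneck state z.

    Assumed layout: P(z | s, a) followed by P(s' | z).
    """
    encode = n_states * n_actions * n_abstract   # P(z | s, a)
    decode = n_abstract * n_states               # P(s' | z)
    return encode + decode


full = full_table_size(1000, 10)
small = bottleneck_table_size(1000, 10, 20)
print(full, small, full / small)
```

For 1,000 states, 10 actions and 20 abstract states, the full table has 10,000,000 entries and the factorized one 220,000, roughly 45 times fewer quantities to estimate from experience.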

Mind-to-mind heteroclinic coordination: model of sequential episodic memory initiation

“… we present and study a low-dimensional model of mind-to-mind episodic memory interaction. We emphasize from the beginning that we intend not to model the brain itself as a system but to create a dynamical model for the activity of this system. Our ultimate goal is to describe, understand and make predictions of mind dynamics, obtaining, in particular, dynamical models of specific classes of such activities as cognition, creativity, and autobiographic memory.”

Developing Concepts with Children Who Are Deaf-Blind

“Each deaf-blind child develops their own unique concepts based on their personal experiences. Here are some ideas that make sense from the perspective of the deaf-blind people who had them, but that might seem “odd” to someone with sight and hearing:

  • a boy thought “going home” meant the feel of a bumpy road and a series of turns in the car
  • a boy experiencing snow for the first time thought it was ice cream and asked for chocolate
  • a girl touched a wet leaf and signed “cry” (it felt like tears)
  • a girl thought food came from a mysterious place up high (it was always set down on the table from above)
  • a young man didn’t know, even after many years, that his family’s pet cat ate (he had never seen it or touched it as it ate, and no one had ever told him)”

“A deaf-blind child will have difficulty developing accurate ideas about the world unless she has at least one trusting, significant, meaningful relationship to serve as a center from which to explore the world in gradually widening circles. The process of developing concepts is a shared adventure between a child and the child’s communication partners. It involves the co-creation of meaning. The child does not make meaning by herself; she and her communication partners make meaning together (Nafstad & Rodbroe, 1999).”

“The development of concepts is a shared adventure, one in which you and a child who is deaf-blind can learn from each other and explore the world together. Concepts are dynamic and continually developing. This is true for everyone, regardless of whether or not we can see and hear. You may never have thought about the rope-like quality of an elephant’s tail, about the way that rain is like tears, about the unique texture of a wall and how it feels like a stone that is near the back porch, or about how the wind feels on your face. A deaf-blind child can show you new concepts like these and new ways of experiencing the world. You can help her understand that she can be a real participant in an enjoyable social world. You can show her that other people use their body language or sign language to communicate. You can tell her that you like cherries and the feel of the dog next door and playing a hand drum. You can show her that the toy elephant also has big floppy ears and a snaky trunk. It is through shared experiences that concepts grow. It is together that we learn more and more about each other and about the world around us.”

