Tomcat

This is a famous painting by the equally famous Belgian artist René Magritte. It shows a smoking pipe, and the French inscription reads "This is not a pipe."
The last decade has given us a number of theories that compress the interdisciplinary experience of generations of scientists into understandable constructs. Perception, cognitive distortions, adaptive strategies - all share a common principle.
Comparing the image and the text, you may have experienced, albeit in a slight, barely noticeable form, some mental conflict between the expected and the perceived. The American social psychologist Leon Festinger published a description of this inner "encounter with reality" in 1957, calling it the theory of cognitive dissonance. The musical origin of the term (from the Latin dissonantia - discrepancy, disagreement, inconsistency) intuitively suggests the main idea of the theory: a sharp, false, jarring note invades the smooth and harmonious process of perceiving reality and spoils everything.
Cognitive dissonance is not always just a psychological conflict that implies some kind of frustration, but rather a whole gradient of sensations of varying complexity and severity: from stupor and not understanding how to go on living, to slight bewilderment, such as that caused by the riddle with which the good soldier Schweik shook the perception of the forensic doctors:
"There is a four-storey building with eight windows on each floor, two dormer windows and two chimneys on the roof, and two tenants on each floor. Now tell me, gentlemen, in what year did the doorman's grandmother die?"
In the mass consciousness, cognitive dissonance is most often reduced to the psychological conflict itself; the second part of the phenomenon is lost sight of - the mechanism for resolving this conflict or, so to speak, for reconciling expectations with reality. The phenomenon described by Festinger in the theory of the same name includes not only the stress and discomfort of perceiving new information that contradicts existing expectations, but also the ways of reducing this dissonance.
Where do expectations come from?
Recall the dualistic scheme of vision: sensory stimulation activates processes in the brain. Imagine this complex chain of events, starting with a photon striking the light-sensitive cells (rods and cones) of the retina and ending with the "assembly", in the higher parts of the brain, of a complex visual image embedded in a certain context. Now scale this up to the full amount of visual stimuli available. And this is only sight - one of the "channels" of incoming information about the world. The mind cannot take in the monstrous barrage of sensory signals that falls on us in any unit of time. If the perception of living organisms worked the way older views assumed, life would simply not exist - it is impossible to keep up with the departing train of reality's fluctuations. And if it is impossible to keep up, only one thing remains - to anticipate.
Imagine a brain locked under the vault of the skull - it sees and hears nothing. It simply receives a stream of signals and, guided by them, must guess what is happening outside. In fact, not simply guess but predict - so that the body has time to prepare and react.
Predictive processing theory
The brain functions as a multilevel prediction machine in which a downward stream of predictions (what we expect from the world) is continually compared with and adjusted against an upward stream of sensory data (what our senses perceive). The downward stream is everything we know about the world: our best heuristics (quick, simplified inferences made for the sake of efficiency), our preliminary beliefs and expectations (priors), all our previous experience - from E = mc² to "London is the capital of Great Britain". The upward stream, in turn, consists of three parts - exteroception (what happens outside the body), interoception (what happens inside the body) and proprioception (the position and movement of the body) - which are combined into a multimodal model. Thus all our knowledge becomes the foundation for predicting what we should be sensing.

How does this happen
- The brain generates mental models (that is why they are called generative) that predict what the sensory apparatus should receive "at the input" (sensory input). These predictions are called prior beliefs.
- Predictive models are layered on top of each other in a hierarchy that reflects the organization of the brain, from lower to higher, from simple to complex - the higher levels send predictions down, and the lower ones send incoming sensory data up.
- If the top-down signals do not match the bottom-up sensory data, a sensory prediction error occurs, and the model either updates its priors or ignores the input as noise and keeps its priors.
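The three steps above can be sketched as a toy loop for a single level of the hierarchy. This is only an illustration under stated assumptions, not a model of real neural computation: the function name `run_level`, the `min_precision` cutoff and the `learning_rate` are all hypothetical, and each input is paired with a made-up precision (reliability) value.

```python
def run_level(prior, inputs, min_precision=0.5, learning_rate=0.5):
    """Compare each (value, precision) input with the current prediction.

    Hypothetical rule: unreliable inputs are ignored as noise,
    reliable ones nudge the prior toward the data.
    """
    errors = []
    for value, precision in inputs:
        error = value - prior            # sensory prediction error
        errors.append(error)
        if precision < min_precision:    # low-precision input: treat as noise, keep the prior
            continue
        prior += learning_rate * error   # otherwise update the prior toward the input
    return prior, errors


# Three reliable readings near 4 and one wild but unreliable reading of 100.
belief, errors = run_level(0.0, [(4.0, 1.0), (4.0, 1.0), (100.0, 0.1), (4.0, 1.0)])
print(belief, errors)
```

The unreliable outlier produces a large prediction error but leaves the belief untouched, while the reliable inputs gradually shrink the error.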

Example
Think about your vision. We never see the world as it is registered by the retina. First, the image that falls on the retina is inverted (optically the eye works like a camera obscura, and your brain flips the image back). Second, it is blurred at the periphery because visual cells are unevenly distributed over the retina. Third, a layer of blood vessels is superimposed on top of it (the retina is inverted). Fourth, there is a blind spot where the optic nerve exits. And on top of that, our eyes make many imperceptible and very fast movements, saccades, "feeling out" the space. Yet we enjoy a full-color, three-dimensional image, stabilized against the movements of our eyes and head. And pre-interpreted, too. Our brains even predict the play of light and shadow, as in the visual illusion below.

How the brain uses Bayesian statistics
A critical parameter for the signals of both streams is the confidence level. That is, we are interested not only in the data but also in their precision, or probabilistic "weight". The upward signal "an elephant is standing in front of you" has a high probabilistic weight; a shaky silhouette far off in the fog has a low one. The downward prediction that water is likely to be wet carries a very high weight; "the Dow Jones should drop a couple of points because diaper prices went up" carries a very low one.
Both streams - bottom-up and top-down - interact continuously at each level, and this process of continuous refinement of probabilities can be described using Bayesian statistics. Bayes' theorem is somewhat like the riddle joke about putting a giraffe in the refrigerator. Its essence is in updating the probability of a hypothesis in light of evidence already observed. To exaggerate a little: if the glass you drank from last night in the company of some shady characters smelled of acetone, the morning is going to be rough.
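As a rough worked example, Bayes' theorem for a single yes/no hypothesis fits in a few lines. The function name and all the numbers below are hypothetical and chosen only to illustrate the acetone joke: H is "the morning will be rough", and the evidence is "the glass smelled of acetone".

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | evidence) for a binary hypothesis H, by Bayes' theorem."""
    # Total probability of seeing the evidence at all.
    evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / evidence


# Made-up numbers: a rough morning is unlikely a priori (0.2), but the acetone
# smell is far more probable when the drink really was bad (0.9 versus 0.1).
p_rough_morning = bayes_posterior(prior=0.2,
                                  p_evidence_given_h=0.9,
                                  p_evidence_given_not_h=0.1)
print(round(p_rough_morning, 3))
```

With these assumed numbers the evidence raises the probability of a rough morning from 0.2 to roughly 0.69.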

The picture shows an example of a graphical representation of Bayesian inference with a Gaussian distribution. In fact, everything here is not as complicated as it seems at first glance. Expectation is our expectations, Reality is obviously reality, and Estimate is our assessment, or perception, a compromise between the first and the second.
On the X-axis we have whatever parameter we are trying to predict, and on the Y-axis the probability of each value of this parameter. Uncertainty is the spread (variance) of our expectations, and Noise is the spread of the sensory data. Now let's put it all together:
- There is a certain expectation (Expectation), also called the forecast or prior (Prior), whose accuracy depends on the uncertainty (Uncertainty).
- There is the sensory input (Likelihood), or simply reality (Reality), whose accuracy depends on the noise (Noise).
- Between expectations and reality lies what we perceive, the Posterior. We had a priori (preliminary) expectations (state BEFORE), corrected them according to the signal received from reality, and obtained the posterior probability (state AFTER).
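For the Gaussian case described above, the compromise can be written out explicitly. The symbols here are assumptions introduced for illustration: the prior (Expectation) is a Gaussian with mean \mu_p and variance \sigma_p^2 (Uncertainty), and the sensory input (Reality) is a Gaussian with mean \mu_r and variance \sigma_r^2 (Noise).

```latex
\sigma_{\text{post}}^{2} = \left( \frac{1}{\sigma_p^{2}} + \frac{1}{\sigma_r^{2}} \right)^{-1},
\qquad
\mu_{\text{post}} = \sigma_{\text{post}}^{2} \left( \frac{\mu_p}{\sigma_p^{2}} + \frac{\mu_r}{\sigma_r^{2}} \right)
= \mu_p + \frac{\sigma_p^{2}}{\sigma_p^{2} + \sigma_r^{2}} \, (\mu_r - \mu_p)
```

The last form reads as "prior plus a weighted prediction error": the noisier the input (large \sigma_r^2), the smaller the weight and the closer the Estimate stays to the Expectation; the vaguer the prior (large \sigma_p^2), the more the Estimate moves toward Reality.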
An illustrative but ROUGH example
You decided to skip work on the assumption that the director went on a business trip and this would go unnoticed. This is our Prior.
The accuracy of your forecast depends on the degree of uncertainty (Uncertainty): Did he leave for sure? Has nothing changed? Where does the information come from? And are there no rush jobs coming up? The less you know, the higher the uncertainty and the less accurate the forecast.
We start collecting information and receive that very sensory input (Likelihood): we sound out colleagues and managers, right up to checking the departure of his flight on the airline's website. Here the accuracy of the incoming data depends on the noise level (Noise): you heard it in the smoking room (low probabilistic weight, low-precision sensory data), from your project manager, who regularly reports to the director (medium probabilistic weight, medium-precision sensory data), or from his assistant, who bought his tickets, took him to the airport, put him on the plane and waved a handkerchief after him (high probabilistic weight, high-precision sensory data).
What we end up with is the Posterior probability - a compromise between what we predicted and what we learned, weighted by how reliable each of them is. If the forecasts were accurate enough and the input data were not too cluttered with insignificant information, then we calculated everything correctly, and our unauthorized day off was a success and went unnoticed. The prediction error will be small. But if we relied on vague inferences with a high degree of uncertainty (a large spread of values), and the input data were picked up at random from anywhere (a large spread due to a high noise level), then there is a high probability that our prediction failed: the director had simply stepped out somewhere on business, our "confidants" were the first to rat us out to him, then a dressing-down, a reprimand, and everything else as in real life.
But you want to take a vacation more than once, right? Therefore, you will analyze your next attempts much more thoroughly. It remains to find out how you will adjust your beliefs so that you will certainly not get caught in the future. Everything is quite simple here.
Look at the graph again - in this example reality turned out to be more precise, and our posterior probability shifted towards it. To put it bluntly, the secretary is a reliable source, and our own speculation is not much good. So we seriously update our prior belief. But if our inner intuition, some indirect signs and everything else that counts as a preliminary forecast had turned out to be more precise, our posterior probability would have shifted towards them instead. Roughly speaking, we are no fools ourselves and know better when the director actually left. In that case there is no global update of beliefs - the previous ones proved quite effective.
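The director example can be run as a small sketch of Gaussian belief updating. Everything numeric here is a hypothetical assumption: the quantity being estimated is "how many days the director will be away", the prior guess is 2 days, every source reports 5 days, and the three sources differ only in how noisy (high-variance) they are. The function name `fuse` is also an illustrative choice.

```python
def fuse(prior_mean, prior_var, data_mean, data_var):
    """Combine a Gaussian prior with one Gaussian observation."""
    gain = prior_var / (prior_var + data_var)            # how much to trust the data
    post_mean = prior_mean + gain * (data_mean - prior_mean)
    post_var = prior_var * data_var / (prior_var + data_var)
    return post_mean, post_var


prior_mean, prior_var = 2.0, 4.0      # our own guess: about 2 days, quite uncertain
reports = {                           # (reported days, noise variance), all made up
    "smoking room": (5.0, 16.0),      # low precision
    "project manager": (5.0, 4.0),    # medium precision
    "secretary": (5.0, 0.25),         # high precision
}

estimates = {src: fuse(prior_mean, prior_var, m, v) for src, (m, v) in reports.items()}
for src, (mean, var) in estimates.items():
    print(f"{src}: {mean:.2f} days (variance {var:.2f})")
```

The same report moves the estimate only slightly when it comes from the smoking room and almost all the way to 5 days when it comes from the secretary, which is the shift of the posterior toward the more precise source described above.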
Surfing uncertainty
Now that we have dealt with the so-called Bayesian Brain Hypothesis, we will complement and expand our understanding of how predictions interact with input data.
There are three options for the development of events.
First. If the predictions more or less match the incoming sensory data, then "all is calm in Baghdad", and the higher levels enjoy radio silence and languid bliss - the predictions come true, prestige grows, the fat content of the butter rises mightily, everything goes on as usual.
Second. Low-precision sensory data contradict high-level predictions. Bayesian mathematics may conclude that the predictions are correct and that something is wrong with the incoming data (the wrong bees make the wrong honey). Then the lower levels "adjust the data" to fit the prediction (if the bosses say it must be so, then it must be so). The upper levels keep to their predictions, and Baghdad is still calm.
Third. The incoming high-precision sensory data conflict with the predictions. Here Bayesian mathematics concludes that the predictions do not work. The neurons involved (we are talking about the brain now) raise the signal "Alarm! Watch out! Alarm!", indicating inconsistency, suddenness, surprise (surprisal). The greater the discrepancy and the higher the "probabilistic weight" of the received data, the greater the surprise and the louder the metaphorical internal siren wails.
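The three cases can be given rough numbers by computing surprisal, the negative log probability of an observation under the model's prediction. This is a sketch under simplifying assumptions that are not in the original: the prediction and the sensory noise are both Gaussian, so the observation is expected to fall around the predicted value with variance `prior_var + sensory_var`, and all values are invented.

```python
import math


def surprisal(observation, predicted, prior_var, sensory_var):
    """Negative log density (in nats) of an observation under a Gaussian prediction."""
    total_var = prior_var + sensory_var   # spread expected from prior plus sensory noise
    error = observation - predicted
    return 0.5 * error ** 2 / total_var + 0.5 * math.log(2 * math.pi * total_var)


# Case 1: data match the prediction.
calm = surprisal(10.0, 10.0, prior_var=1.0, sensory_var=1.0)
# Case 2: data contradict the prediction, but the channel is very noisy.
noisy_conflict = surprisal(16.0, 10.0, prior_var=1.0, sensory_var=100.0)
# Case 3: the same contradiction arriving through a precise channel.
sharp_conflict = surprisal(16.0, 10.0, prior_var=1.0, sensory_var=1.0)

print(calm, noisy_conflict, sharp_conflict)
```

The same discrepancy is far more surprising when it arrives through a precise channel than through a noisy one, which is why only the third case sets off the siren.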
For the higher levels such an alarm is real news, like a fire alarm at a factory where everything runs so smoothly that the bosses have no idea what goes on in the shop. Imagine a middle manager at this imaginary factory: they call him and say there is a fire in the shop. His first reaction: "Did you clear the fire with management?" If yes, then everything is going according to plan, and he can go on drinking his fragrant coffee. If not, something has to be worked out. One option is to blame it all on the shop foreman's attack of delirium tremens, a bad joke, a "wrong number", or "there is nothing to burn there". And if that doesn't work, then the weekend is ruined, you need
The analogy with human bureaucracy is more than appropriate here - every one of these levels HATES to hear the alarm signal, pardon the anthropomorphism. The main task of each level of this hierarchy is to MINIMIZE SURPRISE. That is, ideally, to predict the world so well that the probability of the unexpected is minimal, because every such alarm means a whirlwind of activity, a general commotion aimed at adjusting the parameters of the generative model of the world - or even producing new models altogether - until the surprises cease and peace and grace reign again. Nothing but energy expenditure and fuss. Remember this point, we will come back to it.
All these processes last a fraction of a second. The lower levels constantly bombard the higher levels with a stream of data; the higher levels adjust their hypotheses based on this data and send predictions back down. When something goes wrong and a prediction error is registered, the corresponding levels (managers) either change the hypothesis or disturb the higher levels (bosses). After countless such cycles everything is more or less tuned, predicted and expected, no one is surprised by anything, everything is smooth and clear. Exactly until the next emergency.
Andy Clark, in his book Surfing Uncertainty, aptly compared the whole process of predictive processing to surfing:
“To act quickly and flexibly in an unstable and noisy world, the brain must become a master of predictions - gliding through the waves of noisy and ambiguous sensory stimulation, trying to overtake it. An experienced surfer keeps in the so-called "pocket": close, but slightly ahead of the place where the wave begins to "break". She carries you, but does not catch you. The brain has the same task. By continuously trying to predict the incoming sensory signal, we get the opportunity to study the world around us, think and act in it."

The result is what is called “controlled hallucination” in predictive processing theory. We do not perceive the world as it is, but our predictions about it in the form of expected sensations, corrected by the flow of incoming data. As Anil Seth said in his TED talk, this is our brain's best guess.
Active inference
We have worked through an advanced theory of how the brain works (predictive coding) in order to understand where our expectations come from. We now have an idea of what is meant by the "Bayesian brain". After the examples with skipping work and the fire at the factory, the diagram below should be clear to you. It essentially "packs up" predictive processing so that you can see which processes take place in the brain and which outside it. The brain builds an internal model of the world, uses it to predict what should happen, compares the predictions with the information received from the world, corrects its picture of the world, corrects its forecasts, and the cycle closes. Pay attention to the background color in the picture: everything on beige belongs to the external environment, everything on white to the internal one. Sensory data and actions sit on the edge.
Now let's take a look at another image.

Schematically, almost the same and already almost familiar: model of the world, expectations / forecast, prediction, prediction error, updating the model of the world. Forecasting is just another word for “prediction”. Here we add a border between the system (internal) and the outside world (external), depicted by a dotted line. All the processes we have considered take place INSIDE the system, and actions and sensation are on the border with the outside world.
For convenience, let's simplify this scheme even more.

Sensory states are the sensations themselves, the sensory input, our perception. Active states are our actions or behavior. Internal states are the internal states of the system, our feelings, the result of all the processes we have considered. And external states are the entire set of states of the surrounding world, the environment in which we exist.
The states of the surrounding world (S) -> determine our sensory states (perception) (o) -> which, after internal processing, become our internal states (sensations) (s) -> which determine our active states (behavior) (a) -> which change the state of the surrounding world, closing this causal (cause-and-effect) chain of events. This is called active inference and, generally speaking, it is a way for autonomous agents to operate in a dynamic environment.
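The closed loop S -> o -> s -> a -> S can be imitated with a very small simulation. This is a toy thermostat-like agent, not an implementation of active inference proper (nothing here minimizes free energy explicitly); the function name, the update coefficients and the temperatures are all hypothetical.

```python
def simulate(steps=20, external=30.0, preferred=22.0):
    """Toy perception-action loop: the agent acts until the world matches what it expects."""
    internal = preferred                         # the agent starts out expecting its preferred state
    history = []
    for _ in range(steps):
        sensory = external                       # S -> o: the world determines the sensation
        internal += 0.5 * (sensory - internal)   # o -> s: perception updates the internal state
        action = 0.3 * (preferred - internal)    # s -> a: act to pull sensations toward expectations
        external += action                       # a -> S: the action changes the world
        history.append(external)
    return history


trajectory = simulate(steps=40)
print(round(trajectory[0], 2), round(trajectory[-1], 2))
```

In this sketch the agent does not merely revise its belief to fit a hot room; it also changes the room, so that what it senses gradually comes to match what it expects.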
And here we arrive at the fundamental question with the highest level of abstraction. Where is the border at which the world ends and you begin? One of the most comprehensive and concise scientific descriptions of it is the Markov fence or, more poetically, the Markov blanket.
We are all Markov blankets
The term "Markov blanket" was coined by the Israeli-American scientist and philosopher Judea Pearl, who works on probabilistic approaches to AI and on Bayesian networks. Andrei Andreevich Markov (Sr.), whose name the term bears, was one of the founding fathers of the study of stochastic (random) processes and of probability theory, and gave us, among other things, Markov chains and Markov processes. His son, also Andrei Andreevich Markov (Jr.), was a no less outstanding mathematician than his father.
The Markov "blanket", or "fence", is a concept whose application goes far beyond consciousness research and neuroscience - it is even more fundamental. Absolutely anything exists as a Markov blanket, because otherwise it would be impossible to draw a line between this something and everything else. If something does not have a Markov blanket / fence, this "something" simply does not exist. Everything in the world we know consists of Markov blankets nested in Markov blankets, nested in other Markov blankets, and so on for as far as our ability to scale reaches.
"If the Markov blanket is minimal, which means it cannot discard any variable without losing information, this is called a Markov boundary."
This is the very border where we end and the world around us begins, and vice versa.
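In Pearl's Bayesian networks the idea has a concrete form: the Markov blanket of a node is made up of its parents, its children, and the other parents of its children. A minimal sketch follows, assuming a hypothetical five-node graph whose node names are chosen to echo the states discussed above; the graph itself is invented for illustration.

```python
def markov_blanket(node, parents):
    """Markov blanket of `node` in a DAG given as {node: set of its parents}."""
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical graph: world -> sensation -> internal -> action -> next_world <- world
parents = {
    "world": set(),
    "sensation": {"world"},
    "internal": {"sensation"},
    "action": {"internal"},
    "next_world": {"world", "action"},
}

blanket = markov_blanket("internal", parents)
print(sorted(blanket))
```

In this made-up graph the internal state is screened off from the world by exactly two things, sensation and action: given those, knowing the state of the world adds nothing about the internal state.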

Without any one of these components - sensory, internal, or active states - we would not exist as autonomous subjects. Our Markov boundary protects us from the causal complexity of the world.
Free energy principle
What are all living organisms doing in this chaotic, hard-to-predict and, most importantly, nonequilibrium world? First of all, simply existing, that is, maintaining the boundaries that separate them from the environment, along with some kind of internal structure and processes. And for this you need to perceive the world in one way or another (Bayesian mathematics), represent it (internal generative model), predict (hierarchical predictive processing) and act (active inference) in order to update your internal generative model...
We "feel" the world through active inference, build its internal model through predictive processing, and update this model (learn) by applying Bayes' theorem. One last, almost key element remains. All these processes can be reduced to the optimization of a single parameter - the difference between expectation and reality. Our whole set of sophisticated adaptive strategies comes down to reducing uncertainty. This parameter is called "variational free energy".
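One common way to make "variational free energy" concrete, for a hidden state s that takes only a few discrete values, is F = sum over s of q(s) * (ln q(s) - ln p(o, s)), where q is the agent's current belief and p(o, s) is its generative model. The sketch below uses hypothetical numbers and names; it only checks the textbook property that F is never lower than the surprise -ln p(o) and reaches it when q equals the exact Bayesian posterior.

```python
import math


def free_energy(q, prior, likelihood_of_o):
    """F = sum_s q(s) * (ln q(s) - ln p(o, s)) for a discrete hidden state s."""
    return sum(qs * (math.log(qs) - math.log(ps * ls))
               for qs, ps, ls in zip(q, prior, likelihood_of_o) if qs > 0)


prior = [0.5, 0.5]             # p(s): two possible states of the world (made up)
likelihood_of_o = [0.9, 0.2]   # p(o | s) for the observation actually received (made up)

evidence = sum(p * l for p, l in zip(prior, likelihood_of_o))        # p(o)
posterior = [p * l / evidence for p, l in zip(prior, likelihood_of_o)]

surprise = -math.log(evidence)
f_before = free_energy(prior, prior, likelihood_of_o)      # belief not yet updated
f_after = free_energy(posterior, prior, likelihood_of_o)   # belief = exact posterior

print(round(f_before, 4), round(f_after, 4), round(surprise, 4))
```

Updating the belief from the prior to the posterior lowers F until it equals the surprise, which is the sense in which reducing free energy and reconciling expectation with reality are the same thing in this toy setting.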
In fact, we have just become acquainted, in an extremely simplified form, with the free energy principle, which some already consider equal in explanatory power to the theory of evolution by natural selection.
Karl Friston, the author of the free energy principle and one of the main architects of predictive processing theory, has a citation index higher than Einstein's and 1200+ scientific publications. Everyone, without exception, who has become even slightly acquainted with his work comes away with the impression that he is prohibitively cool.
It is impossible not to appreciate the elegance of the formulation: every living thing is a generator of predictions about the states of the surrounding world, maintaining and organizing itself by separating itself from the environment and minimizing the error of its predictions.
Drawing parallels, drawing conclusions
Festinger's theory of cognitive dissonance, describing the conflict of expectations with reality and the mechanisms for resolving this conflict, turned out to be the forerunner of new, more complex and large-scale theories. Starting with the explanation of mental processes, they, developing, moved on to the very essence of the adaptive strategies of all living things. A good theory is like a prism - it allows you to see what is hidden from the naked eye. The world we perceive is a generative model built on our brain's guesses about what is happening outside, a controlled hallucination. We cannot avoid this immutable fact, but it is in our power to listen more sensitively to what the senses tell us, not to be afraid to update and complicate our picture of the world.
(c) monocler.ru