Explaining Perception: A Progression
Video Essay Script
This video essay script covers the following topics:
How humans have tried to make sense of perception.
What the brain tells us about experience.
How the body shapes what we perceive.
Why perception is inseparable from attention and action.
Why the world feels immediate and real.
Why our thoughts feel so convincing.
Where our sense of self and freedom comes from.
How subjective minds arrive at shared understanding.
How we can navigate truth and uncertainty.
What a modern philosophy might look like.
I may have gone slightly overboard… apparently perception is no small topic.
Introduction
You open your eyes, and the world simply appears. Intuitively, perception seems straightforward: the sights, sounds, and sensations we experience are direct reflections of the outside world.
But is that really the case?
Today we’re going to trace the progression of how humanity has explained and understood perception: beginning with its earliest philosophical foundations and eventually arriving at one of the farthest frontiers of contemporary science—an explanation far more counterintuitive, biologically grounded, and profoundly useful.
This essay draws heavily on neuroscientist Anil Seth’s book Being You, alongside a few other foundational resources that you can find linked in the description.
With that said, let’s begin.
Chapter 1: The Assumption of Direct Access (Pre-Cartesian Foundations)
Let’s begin as far back as the historical record allows. Across much of premodern thought, perception was usually treated as a kind of contact with the world itself. Different cultures, religions, and philosophical schools disagreed radically about what reality ultimately consisted of—beliefs they often fought and died for. But beneath those disagreements, there was often a shared assumption: to perceive the world was, in some meaningful sense, to be in direct contact with it.
This does not mean early thinkers were naïve. In Aristotelian and later Scholastic philosophy, the mind was understood to play an active role in abstracting universal concepts from particular experiences [1]—for example, recognizing “tree” from this specific oak. But even here, perception was still generally treated as a relatively direct relation to reality itself.
Early Logical Cracks
Nevertheless, even in antiquity, there were voices recognizing the flaws in this perspective. The Ancient Greek Skeptics, for instance, frequently pointed out that the same world can appear differently depending on the observer [2]: a square tower appears round from a distance, and food tastes bitter to a sick person but sweet to a healthy one.
Furthermore, if perception puts us in direct contact with the world as it is, how do we explain cases where the world appears other than it is—dreams, hallucinations, illusions, or seeing a straight stick appear bent when submerged in water?
Centuries later, during the Islamic Golden Age, the mathematician Ibn al-Haytham argued that perception was not an instantaneous transmission of reality, but the result of rapid interpretive processes [3]—an early articulation of the idea that what we see depends on judgment, not simply a direct copy of the world.
A Scientific Turn
Ultimately, while a broadly direct-realist picture of perception dominated early philosophy, its foundational assumptions were eventually challenged by the Scientific Revolution and the birth of modern optics.
Notably…
In 1604, Johannes Kepler mathematically demonstrated that the crystalline lens of the eye projects a reversed and inverted image onto the retina [4]. If the eye passively takes on the literal form of the world, why do we not experience the world upside down?
Galileo’s telescope and, later, the microscope revealed that much of reality was simply unavailable to ordinary perception [5,6]: some of it too distant, like Jupiter’s moons, and some of it too small, like microscopic life. But Galileo went further: even what we can perceive is not always present in the world exactly as it appears [7]. Qualities like heat, color, taste, sound, and smell, he argued, are not properties of objects in the same way as shape, size, or motion. They arise through the relation between the physical world and the perceiver.
The compounding weight of these discoveries made simple direct-access pictures harder to sustain, demanding a new paradigm: if the mind does not encounter the world directly, then perception must somehow bridge the gap between inner experience and external reality.
Chapter 2: Separation of Mind and World (Descartes and the Problem of Representation)
In 1641, René Descartes split reality into two distinct, fundamentally different substances: the physical world of objects, which possess objective properties like size and shape, and the subjective mind, an entirely separate, non-physical thinking substance [1].
This dualism helped bring into focus a model known as representationalism: the idea that perception does not give the mind direct access to the world itself, but is instead mediated by internal representations.
This model offered a way to make sense of illusions and hallucinations; if perception is just the mind looking at an internal “painting” of the world, then occasionally the mind simply paints the wrong picture.
The Veil of Perception
However, it inadvertently created a terrifying new barrier, often called the “Veil of Perception”: how much of this internal picture could we actually trust?
This problem helped define one of the central philosophical divides of early modern Europe. Rationalists like Descartes, Leibniz, and Spinoza placed their trust in reason, mathematics, and innate principles, arguing that the senses were too unreliable to ground certain knowledge. Empiricists like Bacon and Locke moved in the opposite direction, insisting that knowledge must begin with sensory experience.
But both sides faced the same underlying problem: if perception only gives us internal representations, how can those representations ever be securely connected to the external world?
Locke tried to solve this by distinguishing between primary qualities, such as shape, motion, and solidity, which he believed existed in objects themselves, and secondary qualities, such as color, taste, and sound, which depended on the perceiver [2]. In doing so, he hoped to preserve a bridge between subjective experience and objective reality.
Collapse of the Cartesian Model
Ultimately, the early representationalist paradigm was decisively weakened by a mix of physics, empiricism, and logic…
The first fatal hole was poked almost immediately by Princess Elisabeth of Bohemia, who wrote to Descartes pointing out a fundamental physics problem [3]: If the mind is an unextended, non-physical substance, how could it possibly exert force to move a physical brain structure without violating the emerging laws of physics?
Later, in 1710, George Berkeley attacked Locke’s defense of objective physics by pointing out that you cannot actually conceive of a “Primary Quality” without a “Secondary” one [4]. Try to imagine a physical shape without a color, or a boundary without a texture to define it. Because our perception of “shape” relies entirely on our subjective perception of “color,” Primary Qualities are just as trapped behind the veil as Secondary ones.
This culminated in the radical skepticism of David Hume, who, in 1739, pointed out the ultimate, fatal flaw of Representationalism [5]: if we only ever have access to our internal representations, how can we ever verify that our representation accurately maps onto the objective world? We cannot step outside of our own minds to compare the internal “picture” to the external “reality”.
By the late 1700s, this problem had pushed European philosophy into crisis. If the mind could not simply mirror the world, then perhaps the relationship had to be reversed.
Chapter 3: Turning Toward the Subject (The Kantian Synthesis)
In 1781, the German philosopher Immanuel Kant published his Critique of Pure Reason in an attempt to rescue human knowledge from skepticism [1].
Up to that point, philosophers generally operated on a single, unquestioned assumption: the external object dictates the terms of perception. They believed the mind simply molded itself to match the world, acting fundamentally as a passive mirror trying, and failing, to accurately reflect a pre-existing reality.
Reversing the Direction of Perception
So, to escape skepticism, Kant proposed a radical reversal: the mind dictates the terms of perception. The objects we experience must mold themselves to the pre-existing structure of our minds.
To explain this, Kant reframed the Cartesian split between mind and world, introducing a new distinction. On one side, there is what he called the Noumenal World—reality as it exists independently of us, which Kant argued we can never access directly. On the other side, there is the Phenomenal World—the world exactly as it appears to us, structured and organized by our human experience [2].
Kant agreed with Hume that if you look out into the physical world, you cannot actually find concepts like Space, Time, and Causality in the raw data. However, Kant argued that we strictly require these concepts to have coherent, unified experiences [3]. He framed them as a priori conditions of possible experience, already supplied by the mind.
Kant’s Phenomenological Legacy
Kant’s framing profoundly altered the trajectory of how we understand perception today. By shifting attention toward the conditions that structure experience, he helped set the stage for Phenomenology—from Husserl to Heidegger and Merleau-Ponty—a tradition devoted to analyzing the structures of lived experience [4]. In turn, many of these thinkers would go on to influence modern theories of mind, perception, and consciousness.
Nevertheless, while Kant’s philosophical framework remains deeply pervasive to this day, it left a massive, unanswered question for the rapidly advancing biological sciences. Kant had successfully argued that the mind supplies the structure of experience, but he treated the mind’s organizing frameworks as “transcendental,” meaning they were necessary conditions of experience rather than things that could be physically measured or observed.
However, as the 19th century arrived, a new breed of physicists and physiologists realized that if the mind is an active factory, it must take physical time and physical energy to do that work. The study of perception was no longer merely a question of intuition or philosophical logic; it became a rigorous, measurable question of biology and neuroscience.
Chapter 4: Perception as Inference (Helmholtz and the Psychophysical Turn)
In the 1860s, the German physicist and physician Hermann von Helmholtz transformed Kant’s abstract insight about the mind’s active role in perception into a rigorous physiological research program [1,2].
By deeply analyzing the anatomical structure of the eye and applying the era’s understanding of optics, Helmholtz recognized that the image projected onto the retina is profoundly limited: a two-dimensional, inverted projection, riddled with blind spots, unstable eye movements, and sensory noise.
Yet, we experience a stable, seamless, three-dimensional reality. The rich, continuous world we experience contains significantly more information than the sparse data actually entering the eye.
The Brain’s Hidden Work
To solve this, Helmholtz argued that the brain must be doing a significant amount of hidden work to fill in the gaps. When the brain receives sparse, ambiguous sensory cues, it must work backwards to determine the most probable cause of that data, relying on prior experiences, learned patterns, and structural expectations about how the world works.
Helmholtz coined the term “unconscious inference” to describe this mechanism, and he chose the word “unconscious” deliberately, noting that we have absolutely no conscious awareness of this underlying process. He argued that the brain acts as a statistical engine, generating perceptual “best guesses” about the external causes of our sensory data and, crucially, continuously updating those guesses as new data arrives [3].
This was the crucial turn towards a science of perception. Helmholtz gave the old philosophical idea of an active mind a biological foothold: perception could now be studied as a hidden physiological process, not merely debated as an abstract problem of knowledge.
This central thesis of “perception as inference” would quietly survive the next century, eventually becoming the foundational bedrock upon which some of our most cutting-edge models of the mind are built today.
Chapter 5: The Brain as a Computer (The Cognitive Revolution)
As the mid-20th century arrived and early computing machines became a reality, cognitive science increasingly leaned on the computer as its guiding metaphor. This movement, known as Cognitivism, framed cognition as information processing: the brain receives inputs, transforms them according to internal rules, builds representations of the world, and uses those representations to guide behavior [1,2,3]. In its most classical form, the mind looked like software running on neural hardware: what mattered most was the abstract logic of computation, while the biological details of the brain were frequently treated as secondary.
Bottom-Up Processing
With the emerging modern understanding of physics, scientists still recognized that the raw signals reaching the brain do not arrive with convenient labels announcing their properties, like “I am a pen”. Crucially, they do not even announce whether they are visual, auditory, or bodily signals. Once received by the sense organs, they are all converted into electrical noise.
To turn this noise into a cohesive world, many influential models emphasized a “feed-forward”, or bottom-up, process [4]. Stimuli strike the sensory organs and cause electrical signals to flow “upward” or “inward” through distinct processing stages, with each stage extracting increasingly complex features.
Take a car, for example. Light bounces off the car and hits our retina. The lowest, earliest visual stages of the brain extract raw luminance and straight edges. The signal moves up to the next layer, which assembles those edges into basic shapes like circles and rectangles. It moves higher to recognize object parts—like wheels, windows, side mirrors—until finally, the deepest cognitive stages match this assembly to a memory and categorize the whole object as a “car”.
This feed-forward model had robust empirical backing. Decades of experiments investigating the visual systems of cats and monkeys repeatedly showed that neurons at early visual stages fired only for simple edges, while neurons at later stages responded to complex features like faces. Modern fMRI studies in humans have revealed much the same thing [5].
On this view, top-down processes like attention or memory act like a filter throughout the processing pipeline; higher cognitive centers can even reach down and tune or gate the signals at earlier stages of the hierarchy. Yet, crucially, these top-down signals serve only to refine, enhance, or inhibit the foundational bottom-up payload. The actual generation of reality still flows upward from the sensory data.
This narrative of the brain as a bottom-up receiver is so intuitive and conceptually clean that it remains the default way most people still describe the brain today—including many psychologists, neuroscientists, and standard biology textbooks.
Growing Resistance to the Filter Model
Nevertheless, while this bottom-up filter model became a powerful mainstream default, it faced mounting resistance from several directions…
Dreams and hallucinations posed an obvious problem: if perception depends primarily on filtering bottom-up input, then why does the mind generate vivid perceptual worlds when the associated sensory input seems to be missing? For example, in 1954, sensory deprivation experiments at McGill University showed that when ordinary sensory input is severely reduced, the mind can begin generating vivid perceptual experiences of its own [6].
Later, the math itself began to shift. By the 1980s, the serial computer metaphor was challenged by Parallel Distributed Processing, which modeled cognition using artificial neural networks inspired by the brain’s architecture. While still viewing the brain as a computational machine, this work suggested that the brain likely computes via distributed weights and probabilities rather than linear, symbolic logic [7], laying the necessary groundwork for hierarchical models to come.
Furthermore, researchers mapping actual brain tissue found wiring diagrams that looked absolutely nothing like standard computer processing units. In 1989, neuroscientists proposed that the neocortex is built out of “canonical microcircuits”: densely interconnected, massively recurrent loops where processing and memory are physically enmeshed [8,9].
Pushback from the Brain’s Hardware
While these varied disagreements planted seeds of doubt, the cognitivist establishment remained resolute. Nevertheless, as researchers learned more about the brain’s architecture, purely feed-forward accounts became harder to sustain.
The first major crack appeared in the 1990s, when researchers began mapping the primate visual system in detail [10, 11]. If vision were mostly a one-way stream—from the eyes, to the brain, to consciousness—we might expect most of the wiring to carry information forward from the retina. However, they found in key visual pathways that there were far more connections running backward from the cortex than running forward from the eyes. Much more of the wiring came from the brain itself, sending feedback down into the system.
If the brain were merely a bottom-up receiver using top-down pathways as filters, how could we make sense of this? In signal processing, a bottom-up cable carries the high-bandwidth “payload”, and the top-down mechanism requires only a low-bandwidth control signal to modulate it. Having far more wiring running backward than forward, by as much as 10 times in some pathways, suggested that the top-down signal was carrying more than mere modulatory instructions.
One of the strongest challenges to a purely feed-forward model came a few years later with a serious latency problem. If the brain had to build perception from the bottom up and only later send feedback down to refine it, recognition should require hundreds of milliseconds of complex, multi-directional processing.
Yet, in 1996, EEG studies showed that people can categorize a novel, complex visual scene in under 150 milliseconds [12]. That is barely enough time for a signal to make a single pass up the visual hierarchy, let alone travel back down to affect filtering. The brain seemed to be recognizing the world faster than a slow, bottom-up assembly line could plausibly explain.
Taken together, these observations make far more sense if the brain is not waiting passively for sensory data to arrive, but is already preparing an expected perception in advance.
Chapter 6: Formalizing Prediction in the Brain (The Neural Architecture of Perception)
As we cross into the 21st century, the shape of our story changes. Up to this point, we’ve watched a progression of competing paradigms. From the 1990s onward, we are instead watching one increasingly influential framework expand outward: from perception, to action, to the biological regulation of the organism itself. From here on out, each phase will widen the frame and raise the stakes. But let’s not get ahead of ourselves.
By the late 1990s, the latency problem had effectively weakened the feed-forward filter model, and cognitive science had to reach back a century and resurrect Helmholtz. Neuroscientists began to suspect that the old picture had the flow of information partly backwards: the brain was not merely filtering reality; it was predicting it.
While Cognitivism dominated the 20th century, the predictive view had quietly survived in the background. By the 1970s, perception was increasingly theorized as a form of hypothesis-testing, with the brain generating best guesses about the causes of sensory input [1].
By the 1990s, this idea was formalized into the “Bayesian Brain hypothesis”, which posited that the brain acts as an inference engine, constantly updating internal probabilistic models of the world [2]. Researchers showed that when visual data is made intentionally blurry or noisy, humans tend to rely more heavily on prior expectations, combining the two in a way that closely approximates Bayesian inference without any awareness of doing so [3].
These perspectives provided the necessary mathematical baseline for the new paradigm: the brain is in the business of guessing.
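For anyone who wants to see the arithmetic behind “relying more heavily on prior expectations,” here is a minimal sketch in Python. It assumes both the prior expectation and the sensory signal are Gaussian, which is a textbook simplification rather than a claim about how the cited experiments were modeled. The function name and all the numbers are hypothetical, chosen only for illustration.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian observation."""
    # Precision is the inverse of variance: a measure of reliability.
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    # The best guess is an average of prior and data, weighted by reliability.
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * obs)
    return post_mean, post_var


# Hypothetical numbers: the prior expects a tilt of 0 degrees, the eye reports 10.
sharp_mean, _ = gaussian_posterior(0.0, 4.0, 10.0, 1.0)    # clear image, low sensory noise
blurry_mean, _ = gaussian_posterior(0.0, 4.0, 10.0, 16.0)  # blurry image, high sensory noise
print(sharp_mean, blurry_mean)
```

With the clear image the estimate lands near 8, close to what the eye reported; with the blurry image it lands near 2, pulled most of the way back toward the prior. The data did not change, only how much it was trusted.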
Predictive Coding
But how does the brain actually execute these guesses?
In the early 1990s, researchers began proposing that the brain’s massive feedback pathways might be carrying predictions downward, while feedforward pathways carried prediction errors upward [4].
To test if the brain actually operated this way, computational neuroscientists Rajesh Rao and Dana Ballard built an artificial model of the brain with a single, rigid mathematical constraint: the network must constantly generate top-down predictions, and it is only allowed to pass forward the errors of those predictions. By design, when incoming bottom-up sensory data closely matched the top-down prediction, the resulting prediction error was suppressed and, in this technical sense, “explained away”.
Astoundingly, when Rao and Ballard trained this network by feeding it photographs of the natural world, the model’s artificial “neurons” spontaneously organized themselves to behave much like the living neurons in a mammalian primary visual cortex. Specifically, the model naturally replicated complex neuronal firing patterns that had puzzled neurophysiologists for decades.
This result suggested that the brain’s physical architecture could be understood as a system trying to minimize prediction errors, an architecture that also helped address the latency problem.
To be clear, this perspective substantially reframes the older filter picture. In a simple filter model, perception begins with a bottom-up signal that gets refined, enhanced, or suppressed on its way to consciousness. But in this view, the brain is constantly trying to predict that signal in advance; when incoming data matches the prediction, much of it is treated as already explained, and the signal sent upward is mostly the remaining mismatch between the model and the sensory input.
In 1999, Rao and Ballard published a seminal paper formalizing this concept into what is now known as Predictive Coding [5]. It gave computational neuroscientists a functional algorithm they could actually test and build upon, reframing perception less as filtering incoming data and more as correcting prediction errors.
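To make the constraint concrete, here is a deliberately tiny sketch of the idea in Python. It is not Rao and Ballard’s model: their network was hierarchical and learned its weights from natural images, whereas this version assumes a single layer, a fixed set of hypothetical weights, and plain gradient descent. What it shares with the description above is the core loop: predict the input from the top down, pass forward only the mismatch, and adjust the guess until little mismatch remains.

```python
def predict(weights, causes):
    """Top-down prediction for each input unit (weights is n_inputs x n_causes)."""
    return [sum(w * c for w, c in zip(row, causes)) for row in weights]


def infer(weights, sensory_input, steps=200, rate=0.1):
    """Settle on hidden causes by gradient descent on squared prediction error."""
    causes = [0.0] * len(weights[0])
    for _ in range(steps):
        prediction = predict(weights, causes)
        # Only this mismatch is passed forward; a perfect prediction sends nothing.
        errors = [x - p for x, p in zip(sensory_input, prediction)]
        for j in range(len(causes)):
            # Each cause is nudged by the errors of the inputs it helps predict.
            causes[j] += rate * sum(weights[i][j] * errors[i] for i in range(len(errors)))
    final_errors = [x - p for x, p in zip(sensory_input, predict(weights, causes))]
    return causes, final_errors


# Hypothetical generative model: two hidden causes, three input units.
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]
sensory_input = [2.0, 3.0, 5.0]
causes, residual = infer(weights, sensory_input)
print(causes, residual)
```

Here the input is exactly what the causes [2, 3] would generate, so the loop settles on those values and the residual error shrinks toward zero: the signal has been “explained away.”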
Evidence for Top-Down Generation
Nevertheless, undermining an old model and providing plausible math is one thing; observing the alternative in an actual living brain is another. To convince the establishment, researchers needed evidence that feedback signals could carry content-relevant information, not merely attentional modulation.
That evidence only began to emerge in the 2010s, when imaging technology became precise enough to examine the relevant cortical layers in detail. A series of influential fMRI studies demonstrated that feedback signals can contain content-specific information, using a technique called visual occlusion [6].
Imagine looking at a picture of a dog, but its lower half is hidden behind a brick wall. Because the dog’s legs are occluded, our retina sends zero bottom-up data about them to our visual cortex. Under the old filter model, the visual cortex mapping that occluded area should be quiet, or merely filtering the “brick wall” signals.
Instead, these studies found that top-down feedback appeared to fill the specific “blind” region of the primary visual cortex with neural activity corresponding to the missing legs. And in 2015, using an even more precise form of brain imaging, researchers showed that this predicted content was arriving specifically in the anatomical layers associated with top-down feedback [7].
This was striking evidence that the brain was doing something much closer to filling in the scene than merely filtering the incoming signal. Predictive coding had moved from an elegant computational theory toward a biologically plausible, increasingly evidence-backed account of perception.
Chapter 7: The Broader Framework of Predictive Processing (How the Brain Generates Experience)
By the early 2000s, these localized insights crystallized into the broader framework of Predictive Processing [1]. Following the discovery that the neocortex shares a remarkably uniform vertical structure, researchers began to explore whether predictive principles might operate across the wider cortical hierarchy—and growing evidence now suggests that related dynamics may even reach beyond the cortex, into deeper subcortical systems.
Prediction Across the Hierarchy
To understand how this affects our perception, we can look back at the brain’s division of labor. The lowest levels of the brain (the primary sensory cortices) deal with raw, fast, granular data, and the highest levels deal with slow, broad abstractions.
Under the predictive processing framework, perception is heavily shaped from the top down. The most abstract areas of the brain generate high-level hypotheses—such as “I am sitting at my desk.” This high-level prediction cascades downward, shaping the predictions of every level below it. Visually, the next level down might predict the presence of specific objects, like my computer monitor. That level sends a prediction further downward to expect the visual features of that object—a glowing white rectangle. Finally, that prediction reaches the earliest visual layers, which anticipate basic features like lines, edges, brightness, and contrast.
When incoming sensory input conflicts with this cascade of predictions, prediction errors are passed upward [2]. When the input fits the predictions, there is little error to report, and the signal is effectively “explained away.”
So the huge takeaway from this view is that higher-level predictions actively shape what lower levels of the hierarchy are prepared to see, hear, and feel. This means that the perceptual hierarchy is not an accumulating pipeline of interpretation, where higher layers are handed a progressively richer picture of the world. Instead, each layer predicts the activity below it, explains away what it can, and passes upward only what remains unresolved.
This is why, as we discussed with the occlusion studies, the visual cortex can represent the hidden legs of a dog: once the brain infers that a wall is occluding part of the animal, it can predict the most likely continuation of the scene behind the wall, generating visual activity that corresponds to the dog’s hidden legs.
This architecture also helps explain decades of behavioral psychology experiments demonstrating that humans perceive expected stimuli significantly faster than unexpected stimuli. In classic psychophysical studies, subjects identified images of houses or faces roughly a tenth of a second faster when primed to expect them [3].
Even more incredibly, neuroimaging shows that expected signals actually show up “sharper” in the visual cortex—i.e., more physically distinct [4]. Top-down expectations can influence early stages of visual processing, effectively priming perception. If we expect our keys to be on the table, we will more readily notice them.
Before we go further, it’s crucial to clarify that what we mean by the word “prediction” here is not necessarily conscious anticipation—like trying to guess what we will have for dinner tomorrow. “Prediction” here refers to an automatic, micro process occurring millions of times a second across billions of neurons, completely beneath our conscious awareness. We do not feel these individual predictions.
Precision Weighting, Attention, and Learning
So how do we connect this underlying machinery to our intuitive, everyday experience?
With millions of predictions happening constantly, the brain needs to know when to trust its internal model, and when to trust an incoming error signal. According to predictive processing, the brain manages this by estimating the reliability of both its predictions and the bottom-up data, a mechanism known as Precision Weighting.
Consider reading a text message with a typo. Because you know your friend and understand the context of the conversation, your brain forms a strong expectation about what the sentence is supposed to say. The brain effectively turns down the volume on the mismatch. If the typo is small enough, you might not even notice it.
However, if you then hear someone talking in the other room, and you thought you were home alone, that signal is treated as sharp, sudden, and highly reliable; the brain cranks up the volume.
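The volume knob can be sketched as a single line of arithmetic. In this minimal Python illustration, the gain on a prediction error is the share of total precision that belongs to the senses. The function name, the precision values, and the 0-to-1 scale are all assumptions made for the example, not measurements.

```python
def weighted_update(prediction, observation, prediction_precision, sensory_precision):
    """Move the estimate toward the observation in proportion to sensory reliability."""
    error = observation - prediction
    # Gain near 0: the mismatch is mostly ignored. Gain near 1: it dominates.
    gain = sensory_precision / (sensory_precision + prediction_precision)
    return prediction + gain * error, gain


# Hypothetical values on an arbitrary scale; the mismatch is the same size in both cases.
# Typo in a familiar conversation: strong expectation, unreliable mismatch.
typo_estimate, typo_gain = weighted_update(
    0.0, 1.0, prediction_precision=9.0, sensory_precision=1.0)
# Unexpected voice in the house: the signal is treated as sharp and highly reliable.
voice_estimate, voice_gain = weighted_update(
    0.0, 1.0, prediction_precision=1.0, sensory_precision=9.0)
print(typo_gain, voice_gain)
```

The same size of mismatch moves the estimate by a tenth in the typo case and by nine tenths in the voice case.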
As you may have noticed, this physiological amplification intuitively corresponds to our sense of attention. To “pay attention” is, in part, for the brain to increase the gain on certain signals so they become harder to ignore and easier for wider systems to use. Under the hood, one major system involved in this process is the salience network, which helps detect signals that are important, unexpected, or behaviorally relevant.
We can connect this, cautiously, to one of the leading neuroscientific accounts of conscious access: the Global Neuronal Workspace Theory, which proposes that information becomes consciously accessible when it is amplified, synchronized, and made globally available across distributed brain systems [5]. So, from a predictive processing perspective, when localized unconscious processing fails to resolve a significant prediction error, systems involved in salience and attention can effectively sound an alarm: “We need more hands to make sense of this!” The unresolved signal is broadcast more widely, allowing distributed systems to coordinate around resolving it.
If you are interested in learning more about that theory, I explored it in another video.
Okay, so back to our example: if the brain determines that things are getting worse—perhaps you don’t recognize the voice—uncertainty increases, the body initiates a stress response, and the brain redirects your attention.
But if the brain determines that things are better than expected—you realize the voice is only your roommate—uncertainty collapses. What a relief.
Now, you may have already learned that dopamine is not strictly a “pleasure” molecule, but it is deeply tied to reward, motivation, and learning. Why is that? Because dopamine helps the brain update which predictions are worth trusting in the future. Roughly speaking, when activity flows through the brain, it leaves behind temporary traces of which circuits were involved. Then, when a salient uncertainty is successfully resolved, dopaminergic signaling floods those tagged circuits, altering their plasticity—potentially making the connections more likely to strengthen, stabilize, or be reused in the future.
By changing the physical weighting of these connections, the brain alters its expectations for the next time it enters a similar situation. You hadn’t remembered that your roommate told you they now work from home on Tuesdays, but now you will.
Just to hammer home this central predictive processing thesis: the brain physically updates the neural architecture underlying its generative model to minimize future prediction errors. It generates top-down predictions to suppress bottom-up errors, and when those predictions fail, it updates the model so the same error is less likely to arise next time.
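As a rough sketch of “the same error is less likely to arise next time,” here is the simplest error-driven learning rule in Python, often called the delta rule. It is an assumption-laden stand-in: a single number plays the role of the whole generative model, and nothing here models dopamine or synaptic tagging. The function name, learning rate, and coding of outcomes are hypothetical.

```python
def learn(expectation, outcomes, learning_rate=0.5):
    """Delta rule: nudge the expectation toward each outcome by a fraction of the error."""
    errors = []
    for outcome in outcomes:
        error = outcome - expectation   # how surprised the model is this time
        errors.append(error)
        expectation += learning_rate * error
    return expectation, errors


# Hypothetical coding: 0.0 means "nobody home on Tuesdays"; 1.0 means the roommate is there.
final_expectation, errors = learn(0.0, [1.0, 1.0, 1.0, 1.0])
print(final_expectation, errors)
```

Each Tuesday the surprise is half as large as the last (1.0, 0.5, 0.25, 0.125), because the expectation has already moved toward what keeps happening.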
Controlled Hallucination
Now let’s consider the ultimate implications of this paradigm.
Because conscious perception depends so heavily on top-down predictions, the brain’s internal generative model supplies much of the structure of what we see, hear, and feel. But this does not mean reality is a sheer fantasy. Sensory input continuously constrains the model, steering the trajectory of perception in real time and reshaping the model through learning.
To make sense of this dynamic, neuroscientist Anil Seth famously uses the provocative metaphor of a “controlled hallucination” [6]. Perception is hallucination-like because it is generated from the inside out; but it is controlled because the brain’s predictions are continually disciplined by prediction errors largely driven by sensory input.
Another way to say this is that perception emerges from a continuous coupling between organism and environment. The brain generates predictions about the world, the world pushes back through sensory input, and the brain updates its model through that ongoing loop. Our experienced reality is generated internally, but it is constantly negotiated with the physical world. This helps explain why the contents of experience seem empirical: they appear stable and observable because sensory input continuously constrains the brain’s predictions.
Nevertheless, our internal model fundamentally shapes how the world becomes perceivable to us in a given moment and context. Perception depends on expectations that allow the brain to organize incoming signals into something coherent, stable, and usable. Some of these expectations are shaped extremely early: for example, within the first months of life, infants become increasingly tuned to faces, voices, bodies, movement, depth, touch, and other foundational patterns of experience [7]. Others are conditioned more gradually, as when we come to perceive a car as an object with a name, a use, and a set of personal associations.
Notably, these predictions unfold across many scales at once, linking sight, sound, touch, movement, and memory into a single coherent situation. When we are walking through our house, our high-level contextual prediction of “I am home” primes our primary visual cortex to expect the precise dimensions of our hallway in the dark, tunes our auditory cortex to instantly ‘explain away’ the familiar hum of our fridge, and prepares our somatosensory cortex for the exact height of our staircase. If we were instead wandering down a city street, those same predictions would carry little weight, because that context makes them improbable.
This dynamic helps us understand why human perception is so notoriously fallible. It tells us why we can often see things that aren’t physically there; if a top-down prior is overwhelmingly strong, or improperly assigned high precision, it can effectively override sensory input, resulting in what we normally call a hallucination.
Conversely, it explains why we can fail to see things that are physically right in front of us. If attention and salience systems do not treat a piece of sensory data as relevant—like failing to notice a gorilla walking through a basketball game while we are busy counting passes—it may never become consciously noticeable. The photons hit the eye, but the prediction error is squashed before it ever reaches our conscious awareness.
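One way to picture this balance between prior and evidence is as a precision-weighted average. The sketch below assumes the simplest textbook case, where both the prior and the sensory signal are treated as Gaussian and "precision" means inverse variance. The function name and the numbers are hypothetical, chosen only to show the two failure modes just described.

```python
def precision_weighted_estimate(prior_mean, prior_precision,
                                sensory_value, sensory_precision):
    """Combine a prior and a sensory sample, each weighted by its precision.

    For two Gaussians this is the posterior mean: whichever source is
    treated as more reliable pulls the final estimate toward itself.
    """
    total_precision = prior_precision + sensory_precision
    weighted_sum = prior_precision * prior_mean + sensory_precision * sensory_value
    return weighted_sum / total_precision


# The prior expects 0; the senses report 10.
balanced = precision_weighted_estimate(0.0, 1.0, 10.0, 1.0)
# An overconfident prior: the estimate barely moves (hallucination-like).
strong_prior = precision_weighted_estimate(0.0, 99.0, 10.0, 1.0)
# Sensory data treated as irrelevant: it is effectively ignored.
ignored_input = precision_weighted_estimate(0.0, 1.0, 10.0, 0.0)

print(balanced, strong_prior, ignored_input)
```

In this toy picture, attention is the dial that sets sensory_precision. Turn it down far enough and the input stops shaping the estimate at all, even though it is still arriving.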
The Dark Room Problem
Overall, predictive processing has become one of the most integrative and influential frameworks in cognitive neuroscience [8]. It elegantly links the brain’s physical wiring to perception, attention, learning, and many features of conscious experience.
However, as this framework grew in prominence, skeptical observers pointed out what seemed like a glaring philosophical flaw: it lacked a “why”.
If the brain primarily minimizes prediction errors, the easiest and most logical way to achieve that would be to sit in a dark, silent room and do absolutely nothing forever. Why would a system designed strictly to squash surprises ever initiate a motor command to get up, walk into a chaotic world, and actively risk encountering the unknown?
Chapter 8: Biological Regulation at the Heart of Perception (Free Energy, Active Inference, and Allostasis)
One of the most ambitious—and controversial—attempts to solve the Dark Room problem comes from Karl Friston, one of neuroscience’s most influential and polarizing figures.
Friston observed that if a biological system’s fundamental goal is to minimize prediction errors, there are actually two mathematically viable ways to do it. The first is to change its internal model to match the outside world. This is learning, the process we just covered.
But there is a second method: it can act on the world so that incoming sensory data matches its predictions. If our brain expects us to get up, walk to the kitchen, and turn on the light—perhaps because our body needs food, or warmth—then sitting motionless in the dark would not be a perfectly error-free state. It violates the predicted sensations of moving, seeing, reaching, and ultimately getting food and warmth. The system can reduce that mismatch by acting until the world conforms more closely to its predictions. This is what he calls Active Inference.
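The two routes can be put side by side in a few lines of code. This is a deliberately crude sketch, assuming that both the belief and the world can be summarized as a single number (say, how bright the room is). The function names perceive, act, and settle are invented for illustration and do not come from Friston's formalism.

```python
def perceive(belief, sensation, rate=0.5):
    """Route 1 (learning): move the belief toward what the senses report."""
    return belief + rate * (sensation - belief)


def act(world, belief, rate=0.5):
    """Route 2 (action): move the world toward what the belief predicts."""
    return world + rate * (belief - world)


def settle(route, belief=1.0, world=0.0, steps=10):
    """Reduce the mismatch using only one route; return (belief, world, error).

    belief = 1.0 stands for "I expect the light to be on";
    world = 0.0 stands for "the room is actually dark".
    """
    for _ in range(steps):
        if route == "perceive":
            belief = perceive(belief, world)
        else:
            world = act(world, belief)
    return belief, world, abs(belief - world)


learned = settle("perceive")  # the belief drifts toward "dark"
acted = settle("act")         # the room ends up lit
print(learned)
print(acted)
```

Both routes shrink the error to almost nothing, but they leave the organism in very different places: one has resigned itself to the dark, the other has turned on the light.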
The Free Energy Principle
Intuitively, we understand that organisms act because they have to keep themselves alive. Friston’s deeper ambition was to ground that biological drive in something even more fundamental: the basic laws of physics governing how organized systems persist over time.
This framework, first proposed in 2006, is called the Free Energy Principle [1,2]. Before we unpack it, though, we should be clear about the kind of explanation it offers. Rather than directly mapping perception onto concrete brain mechanisms, it provides a higher-level interpretation: treating perception, action, and life itself as parts of the same self-maintaining process. That breadth is what makes it powerful—and also what makes it controversial, as we’ll return to later.
To make sense of this full story, we have to begin with a basic fact of physics: organized systems tend to fall apart. Left alone, they lose structure, decay, and drift toward equilibrium with their environment.
A rock can persist for a long time because its atomic bonds keep it within a relatively stable range of states. But it does not actively maintain itself. So, over time, it will crack, erode, weather, and gradually lose its structure.
Biological life, however, is a far more improbable kind of physical system. A dynamic, living organism must actively preserve the boundary between itself and its environment. If its internal temperature drops too low or its chemistry falls out of balance, it no longer persists as a living system and eventually disintegrates into the environment.
In Friston’s framework, an organism does not have direct access to “the world itself” or even the complete state of its own body. It only ever encounters both through sensory states: signals from the outside world, like light, pressure, chemicals, and sound, and signals from inside the body, like temperature, glucose, pH, and organ state.
So, maintaining the narrow bounds of survival can be framed mathematically as minimizing variational free energy: a measure of how far the organism’s sensory states deviate from what its own model expects to be safely within its viable range. To minimize free energy, then, is to avoid or resolve states that are incompatible with that organism’s continued survival. In this sense, a fish “expects” to be in water; being out of water is a statistically surprising state.
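For those who want to see the quantity itself, variational free energy is usually written along the following lines. The symbols are not part of this script and are introduced here only as labels: o stands for sensory observations, s for the hidden states causing them, p for the organism's generative model, and q for its current best-guess belief about s.

```latex
F[q, o] \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
        \;=\; D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] \;-\; \ln p(o)
        \;\ge\; -\ln p(o)
```

The last term, -ln p(o), is the "surprise" of the sensory state under the organism's model. Because the KL divergence can never be negative, free energy is an upper bound on that surprise, so pushing free energy down also pushes down the surprise the organism can be exposed to. That is the formal sense in which a fish out of water is in a high free energy state.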
Regulation Through Homeostasis
How did early, brainless organisms avoid these survival-threatening states? They didn’t. Systems that failed to minimize sensory surprises simply dissolved into entropy—they died. The only systems that persisted were the ones that randomly developed physical mechanisms to react to these sensory stimuli in ways that kept their boundaries intact. This is the blind, ruthless filter of natural selection at work: a central driver of evolution.
This evolutionary filter gave rise to homeostasis—the raw, direct regulation of a body in response to sensory stimuli. When an organism registers that its internal state is drifting toward an unsurvivable boundary—say, its internal temperature is dropping—the system detects a survival-relevant mismatch between its current state and the state it must maintain. To reduce that mismatch, the organism reacts: it moves toward heat, changes its posture, or initiates a bodily reflex.
To be clear, this does not mean the organism consciously “wants” to be warm. At this level, there is no intention—only physics. If a chemical system happens to react to cold in ways that move it toward heat, its boundary is preserved and it persists. If its reactions push it outside its viable range, that boundary breaks down and the system dissolves back into the environment.
All right, so far, not too crazy. Now let’s consider, if this mindless, homeostatic loop works so well for simple organisms, how did evolution ever end up inventing more complex brains?
It’s crucial not to smuggle purpose into evolution here. There is no universal force pushing life towards greater complexity. Many simple organisms, like bacteria, are wildly successful precisely because they are robust. They can survive enormous fluctuations in temperature, chemistry, and acidity, minimizing surprise through brute physical tolerance—i.e., fewer environments push the organism outside its viable range.
However, in specific, highly competitive environments, some organisms faced evolutionary pressures that favored trading this raw robustness for greater adaptive flexibility. Instead of surviving mainly by tolerating a wide range of conditions, they evolved complex bodies and nervous systems capable of modeling a richer, more volatile world. This opened up new ways of acting, predicting, and adapting, but at the cost of narrower survival bounds. Many bacteria can survive freezing and thawing; a human can die if their core temperature drops by roughly ten degrees Celsius.
Because their survival bounds were now so fragile, these complex organisms could no longer rely on slow, brute-force physical reactions. Raw sensory signals are too noisy, ambiguous, and low-level to guide behavior on their own. To survive, nervous systems had to simplify this chaos by inferring the hidden causes behind the signals: the underlying things, events, or conditions producing them.
We can think of this as the root of abstraction. Instead of reacting to thousands of shifting photons, vibrations, or chemical fluctuations, the organism can infer a more useful cause: food, predator, shelter, or threat. These abstractions function as survival heuristics: compressed, action-guiding guesses about what matters and what to do next. And as nervous systems became more complex, they came to model the world at higher and higher levels of abstraction—not just as edges and shapes, but as objects, situations, threats, tools, and possible actions.
Valence and Affect
This type of abstraction does not only apply to the outside world; the organism also has to infer the hidden causes of its internal signals. A drop in glucose, a shift in temperature, or a change in pH does not arrive with an explanation attached. The nervous system has to infer what these signals mean: Am I hungry? Am I sick? Am I in danger?
These inferred causes help the organism regulate itself, but they still leave a deeper problem: how does the system estimate whether it is doing well or badly overall? An organism has no god’s-eye sensor that can calculate exactly how far it is from every possible state it could occupy. So it has to use sensory evidence to model the hidden causes of its overall condition, with internal bodily signals playing a privileged role in providing the most immediate evidence of whether the organism is within its viable range.
This rough overall estimate is what we can call valence: an abstracted estimate of whether the organism is doing “well” or “badly” relative to its own regulation [3]. Rather than representing every bodily variable separately, valence compresses the organism’s condition into simple action-guiding heuristics: move toward this, move away from that, keep doing this, stop doing that. And as nervous systems become more complex, these basic “better” and “worse” signals can guide increasingly sophisticated forms of regulation, helping the organism decide not just whether to act, but how to act in more complex situations.
Crucially, as nervous systems evolved, they physically wired these internal and external circuits together into one unified regulatory system. An external stimulus can directly alter the body’s internal physiology. For example, when the nervous system detects a predator, it can trigger a defensive cascade—releasing adrenaline, increasing heart rate, tensing muscles, and preparing the body to move. That internal shift is then registered, at a basic regulatory level, as negative valence: something is wrong, and action is needed.
This idea of valence helps provide a biological foundation for affect: the felt, bodily sense that our state is either favorable or dangerous relative to our own regulation [4]—the immediate sense of comfort, distress, tension, or vitality that colors experience long before conscious thought gets involved.
More recently, some theorists have proposed that affective valence may track the rate of change of uncertainty over time. When salient errors are resolving, that positive shift may be felt as relief, satisfaction, or ease. When uncertainty grows and errors accumulate faster than expected, that negative shift may be felt as anxiety, frustration, or dread.
So at the highest level, the organism does not need to calculate every variable threatening its survival. It can regulate itself through abstract evaluations of “better” and “worse.” These evaluations cascade down through the hierarchy, shaping autonomic reflexes, attention, and motor patterns that move the organism away from danger and back toward viable states.
Regulation Through Allostasis
Now, looking at this regulatory process as merely reactive misses a crucial point. If an organism waits until it sees a predator before preparing to run, or waits until its blood sugar is already dangerously low before seeking food, it is far more likely to die.
As complex nervous systems evolved, they developed more advanced mechanisms for allostasis: the ability to regulate the body by predicting future needs and acting before a crisis arrives [5].
For an intuitive example, we don’t feel thirsty only when our cells are already dehydrated. We feel thirsty when the brain predicts that, without water, the body is moving toward a dangerous state. We can see the same predictive logic from the other direction: when we are extremely thirsty and drink a glass of water, the feeling of thirst can fade almost immediately, even though it takes much longer for the water to fully hydrate our tissues.
Neuroscientist Lisa Feldman Barrett has famously described allostasis as “body budgeting” [6]. The brain acts like an accountant for the body, constantly forecasting future metabolic needs—water, glucose, salt, oxygen—and making behavioral withdrawals or deposits ahead of time so the system does not go bankrupt.
So, in the language of the Free Energy Principle: If the organism survives by estimating and reducing free energy in the present, allostasis means projecting that same problem into the future. The system can effectively consider: If I do this, what states am I likely to end up in? Will they keep me within viable bounds, or push me toward danger?
In Friston’s framework, this is known as minimizing Expected Free Energy: selecting actions, or “policies,” expected to keep the organism within preferred and informative states [7].
More intuitively, the brain can evaluate action through abstract conditional possibilities: If I scan my eyes across the desk, what am I likely to see? If I reach toward the pen, what sensations should follow? If I stay still, what will happen next?
Which brings us back to Active Inference. Action is not separate from prediction; it is one of the main ways predictions become true. The organism can move its body, shift its attention, and change the world so that future sensory input better matches the states it expects and needs [8].
Consider a simple action, like picking up a pen. Intuitively, it seems like the brain first forms an intention, then sends mechanical commands down to the muscles. But from the perspective of Active Inference, the brain can be understood as generating predictions about the sensory state it expects: my hand is reaching, my fingers are closing, the pen is in my grip. The body acts accordingly to reduce the mismatch between the current state and the predicted one, bringing the body into alignment with the intended action.
And what makes one action better than another? According to the Free Energy Principle, an action can reduce future uncertainty in two broad ways: exploitation or exploration [9].
Exploitation means using what the organism already knows to reach a preferred state. We go to the fridge because our model predicts there is food there, and because eating will move the body toward a better regulatory condition. In this mode, the organism acts from a model it trusts: predictions are precise, reliable, and repeatedly confirmed by the world. Intuitively, exploitation feels like comfort, safety, mastery, and routine.
But if the organism relies only on exploitation, its model begins to stagnate. If it sits forever in a dark room, its model of the outside world begins to decay, leaving it less prepared for future threats, opportunities, and bodily needs. The future becomes harder to predict, which means Expected Free Energy can rise.
In this sense, the brain expects a manageable level of complexity. When the environment becomes too predictable or information-poor, the system starts seeking information again. And intuitively, this can show up as boredom: the restless feeling that pulls us towards new stimulation.
This is where exploration becomes necessary. The organism samples the environment, gathers new sensory evidence, and tests the boundaries of its model. Small, manageable surprises now help prepare the system for larger, more dangerous surprises later.
Importantly, exploration and exploitation are not two separate modes we occupy for long, uninterrupted stretches of time; the brain is constantly moving between them. As we navigate the day, we exploit familiar patterns to pursue a goal, encounter small discrepancies, explore just enough to resolve them, and then return to the stability of what we know.
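To make the trade-off concrete, here is a hedged sketch of how a policy might be scored. It is a simplification in the spirit of Expected Free Energy, not the formal quantity: each policy gets a "pragmatic" value (how well it serves current preferences) plus an "epistemic" value (how much it is expected to reduce uncertainty about one yes-or-no question). The policy names, the numbers, and the choice to simply add the two values are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Uncertainty, in bits, of a yes/no belief held with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_info_gain(p, accuracy):
    """Expected drop in uncertainty after one observation of given accuracy."""
    p_yes = p * accuracy + (1 - p) * (1 - accuracy)
    if p_yes <= 0.0 or p_yes >= 1.0:
        return 0.0  # the observation is fully predictable, so it teaches nothing
    posterior_if_yes = p * accuracy / p_yes
    posterior_if_no = p * (1 - accuracy) / (1 - p_yes)
    expected_posterior_entropy = (p_yes * entropy(posterior_if_yes)
                                  + (1 - p_yes) * entropy(posterior_if_no))
    return entropy(p) - expected_posterior_entropy


# Hypothetical policies: (pragmatic value, accuracy of what we would observe).
POLICIES = {
    "familiar route": (0.6, 0.5),  # comfortable, but reveals nothing new
    "new route": (0.4, 0.9),       # less rewarding now, but informative
}


def choose_policy(belief):
    """Pick the policy with the highest pragmatic plus epistemic score."""
    def score(name):
        pragmatic, accuracy = POLICIES[name]
        return pragmatic + expected_info_gain(belief, accuracy)
    return max(POLICIES, key=score)


print(choose_policy(0.5))   # very uncertain about the world
print(choose_policy(0.99))  # already confident
```

When the belief is close to fifty-fifty, information is worth a lot and the informative policy wins. Once the model is confident, the same observation teaches almost nothing, and the familiar policy takes over again.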
This tension between exploitation and exploration may sound familiar. Long before it was formalized in computational neuroscience, many psychological and symbolic traditions noticed this pattern in one form or another, often representing the familiar as habitual, orderly, or structured, and the unfamiliar as attention-grabbing, chaotic, threatening, or transformative. From the perspective of active inference, this reflects a real regulatory problem: organisms must preserve the stability of the reliable, exploitable world while still needing to remain open to the informative—and potentially terrifying—unknown.
With these pieces in place, a remarkably unified picture of human perception begins to emerge. We can think of perception as one part of a broader biological process: the ongoing regulation of the organism in the service of survival.
Action and perception are inseparable: two sides of the same self-regulating loop. The organism acts to sample the world, and sensory signals from the world help calibrate what the organism does next. Every movement, glance, and shift in focus helps keep the body within its viable bounds.
Perception itself, then, can be understood as an active, predictive, allostatic process—an integral part of the organism’s continuous effort to anticipate, avoid, and correct states that would push it outside its viable range.
Reflecting on the Full Biological Framework
Now let’s retrace the full logic from the ground up: Thermodynamics sets the unavoidable boundary conditions: organized systems tend to decay, lose structure, and drift toward equilibrium. Evolution blindly selected for living systems that could resist that drift by maintaining their own organization. Sensory states gave organisms their only point of contact with the external world and the body, while nervous systems learned to infer the survival-relevant conditions behind that sensory chaos. Finally, allostasis describes the future-oriented nature of regulation: the organism not only reacts to present conditions, but budgets its energy and acts preemptively against predicted threats.
This gives us a powerful unified picture, but it is important to separate two ideas that can easily blur together: the biological process of allostasis, and the broader mathematical framework of the Free Energy Principle [10].
Allostasis is a biologically grounded process: substantial evidence suggests that organisms regulate bodily variables through prediction, preparation, and action. The Free Energy Principle is more ambitious, as it tries to describe why self-maintaining systems must regulate themselves in the first place. That ambition is also why it draws criticism: the framework is so abstract that it can be difficult to falsify directly. The worry is not that it explains nothing, but that it may explain too much: if almost any adaptive behavior can be retrospectively described as “minimizing free energy,” then it becomes hard to specify what observation would actually prove the framework wrong.
So, the Free Energy Principle is best understood here as a normative principle: an interpretation of what a persisting system must, in theory, do to maintain itself over time. While it does not, by itself, identify the specific mechanisms an organism uses to solve this abstract problem, it does provide a powerful framework for hypothesizing what biological systems might look like if they were organized around self-maintenance, prediction, and regulation.
And in this regard, its influence is difficult to deny—driving major currents in computational neuroscience and helping weave many otherwise isolated anatomical and behavioral findings into a more cohesive picture.
Looking specifically at human perception: Predictive Processing, especially through the cortical hierarchy, offers one possible architecture for this error-minimizing logic: when predictions fit the incoming signal, prediction error is reduced; when they fail, error is passed forward. Salience and attention can then be understood as precision-weighting systems, helping determine which errors matter enough to recruit broader processing and, in some cases, conscious access.
Crucially, these biological mechanisms can be tested in ways the abstract Free Energy Principle itself cannot. If neurobiologists discovered that the brain does not use sensory prediction errors to shape perception, or that attention does not modulate the precision of those errors, these mechanistic models would be in serious trouble. But as it stands, they offer one of the most integrative pictures we have of perception as a biological process.
And perhaps the most exciting possibility is that this framework might help bridge the gap between the biological mechanisms of self-maintenance and the subjective feeling of being a living body in an objective world.
Chapter 9: Implications for Conscious Perception (Why the World Feels the Way It Does)
This is the ultimate payoff for the journey we have taken so far. With the physical and biological mechanics now in view, we can finally turn to the question that has been waiting underneath all of this: why perception feels the way it does.
To prepare us for this, we can once again follow Anil Seth’s Being You by highlighting a famous historical exchange between the philosopher Ludwig Wittgenstein and his colleague, Elizabeth Anscombe:
Wittgenstein: “Why do people say that it was natural to think that the sun went round the Earth rather than that the Earth turned on its axis?”
Anscombe: “I suppose, because it looked as if the sun went round the Earth.”
Wittgenstein: “Well, what would it have looked like if it had looked as if the Earth turned on its axis?”
The profound, uncomfortable answer is that it would have looked exactly the same.
Our discussion points us toward a strange implication: conscious experience does not give us direct, unvarnished access to objective reality. The way things “look” or “feel” is not necessarily evidence that perception is a transparent window onto the world as it exists in itself.
As we have seen, conscious perception is shaped by an amalgamation of top-down predictions about hidden causes and bottom-up sensory error signals that constrain, correct, and update those predictions. To borrow Anil Seth’s metaphor, perception is, in this sense, a controlled hallucination. But once we bring allostasis and action into the picture, the metaphor deepens: Seth also describes perception as a controlling hallucination. It meaningfully affects our ongoing processing, and thus shapes the trajectory of our attention, action, and experience.
This suggests that conscious experience is not functionally arbitrary. Conscious awareness is biologically expensive, so what becomes conscious likely involves signals the organism needs to coordinate around. Treated in information-processing terms, this global availability has a practical role: it makes highly abstracted contents available across perception, attention, memory, and action, helping coordinate the body’s ongoing regulation.
Now, when confronted with the idea that the brain generates its own reality, a common concern emerges: “If reality is subjective and my brain is just hallucinating all of this, why can’t I just choose to see the sky as green? If it’s all in my head, why don’t I just jump off a cliff and hallucinate that I can fly?”
The answer lies in the fundamental difference between a mere hallucination and a controlled hallucination anchored by allostasis. We cannot simply choose to perceive a green sky or survive falling off a cliff because the brain is a biological organ embedded in a body, constrained by sensory evidence and the need to remain within viable bounds. The hallucination is controlled because the body cannot simply ignore the constraints that its model depends on to stay alive and well.
If the brain randomly generated a hallucination of a green sky, it would likely create conflict with sensory evidence, capturing our attention while serving no obvious allostatic purpose—it might even distract us from something more important. Worse, if the brain hallucinated that gravity does not apply to us, and we stepped off a cliff, we would die. The body budget goes bankrupt, and the living system loses the organization it was fighting to maintain. So, we consciously and viscerally feel that jumping off a cliff would hurt because the model has to treat that fact as non-negotiable in order to keep us alive.
The same principle applies inwardly. We do not explicitly experience ourselves predicting, minimizing uncertainty, or performing the biological work of staying alive. Being consciously aware of millions of micro-predictions and sensory conflicts would be metabolically wasteful and practically useless. We do not need to know how the brain builds a model of a predator; we just need to know when a predator is there.
The Intuitive Structure of Experience
Now let’s consider: why does this continuous regulatory loop result in the vivid, undeniable conscious experience of being a stable self looking out at an objective world?
Here we have to be a little more speculative. We cannot know exactly why evolution settled on the specific details of human experience, but we can use the framework we have built to ask how the brain’s fundamental allostatic drive might shape the most pervasive, intuitive aspects of our daily life.
To understand what the brain eventually makes conscious, we have to remember that the nervous system evolved to compress an overwhelming stream of sensory stimuli into stable, action-guiding patterns. The brain models the hidden causes behind raw sensory data: the objects, bodies, events, places, and situations that make the world predictable enough to act in. From this perspective, rather than experiencing a chaotic stream of light, we experience something far more simplified, stable, and actionable: a solid “chair” sitting in a three-dimensional “room.”
Importantly, that holistic experience is supported by countless coordinated predictions across many systems at once—visual predictions about the shape, color, depth, and lighting of the chair; bodily predictions about distance, posture, and possible movement; associative predictions about what chairs are for; and even linguistic predictions that allow the object to appear with the label “chair.”
We can then consider that, at the top of this hierarchy of abstract hidden causes sit some of our deepest priors—basic assumptions about objecthood, space, selfhood, continuity, and existence itself. So the intuitive, visceral feeling that the world is “out there” and undeniably real may partly reflect the brain’s functional role in helping the organism perceive, act, and regulate itself effectively.
We can apply the same logic to our perception of change. Our attention is constantly shifting, our bodies are in motion, and many of our environments are constantly changing; yet the world feels relatively stable. So, to operate effectively, the brain can curate what it notices, filtering out countless fluctuations and consciously registering the changes that matter.
We can see this in the distinction between changes in perception and the conscious perception of change: sensory input can change without us consciously noticing, and conversely, we can experience change even when relevant sensory signals remain stable.
Multiple studies have shown that when humans are distracted, major aspects of the environment can change without them noticing [1,2]. Our retinas may register the change, but because high-level processing is oriented towards a different goal, we may never consciously notice it.
Conversely, some optical illusions occur when fine details in our peripheral vision create sensory ambiguity [3,4]. To resolve the prediction error, the visual cortex predicts motion, resulting in the conscious hallucination of movement where none physically exists.
Moving on, our brain also constructs the felt structure of time—the sense of a past giving way to a present, which leads to a future. But given that the brain doesn’t have a centralized, objective “internal clock” ticking away in the background, how does it generate the feeling that time is passing?
Some researchers have proposed that our sense of duration is partly built from the accumulation of perceptual changes across our senses [5]. In one study, when volunteers watched various videos and estimated how long they lasted, they consistently overestimated the length of busy, chaotic scenes—like walking through a crowded city street—and underestimated quiet scenes—like sitting in an office.
Remarkably, separate fMRI research has shown that a person’s subjective experience of duration can be predicted from activity in their visual cortex [6]. Taken together, these findings suggest that experienced time may be a best guess, generated by how quickly our perceptual models are updating.
Finally, this framework helps explain why the world does not appear to us as a neutral collection of physical properties. If perception is fundamentally bound up with action and bodily regulation at an unconscious level, then it reasonably follows that the world would consciously appear in terms of what it allows, invites, blocks, or threatens.
In the 1970s, ecological psychologist James J. Gibson coined the term affordances to describe the possibilities for action that emerge between an organism and its environment [7]. The point is not that we first perceive neutral objects and then assign uses to them. Rather, the world shows up already in terms of practical significance—a landscape of opportunities, obstacles, tools, threats, and possibilities.
Studies on affordances show that merely seeing graspable objects can prime motor responses [8], and further research suggests that objects within reachable space recruit the motor system more strongly than objects outside reach [9]. So when we look at a pen, its shape, size, location, and orientation are processed in relation to possible movements: reaching, grasping, writing. At a more survival-relevant level, the same logic applies when we look at a cliff edge: we perceive it as a hazard, a place where action must be carefully controlled.
Pushing this further, goal-oriented behavior appears deeply intertwined with conscious perception. Also emerging in the 1970s, Perceptual Control Theory argued that organisms often vary their behavior to keep perception within useful ranges [10].
Consider a baseball outfielder catching a high fly ball. The intuitive assumption is that the brain calculates the ball’s trajectory, figures out where it will land, and commands the body to run there. But research on catching suggests that fielders often solve this problem by controlling perception [11]. They move so that the ball keeps looking a certain way. Specifically, they adjust their running speed so that the ball appears to rise at a steady rate against the background sky.
In this case, the fielder is not merely perceiving the ball in order to act; they are acting in order to keep perception within a useful pattern. Here, perception is a controlling hallucination just as much as it is a controlled one.
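The fielder example can be sketched in a few lines of Python. This is a simplified illustration of one strategy discussed in the catching literature, often called optical acceleration cancellation; treating it as the relevant strategy here is our assumption. The sketch also assumes a drag-free ball, a fielder standing still at the moment of each check, and made-up launch velocities. It shows that the rate at which the ball's image rises already tells the fielder which way to move, with no landing-point calculation inside the decision rule.

```python
G = 9.81  # gravitational acceleration in m/s^2


def tan_elevation(t, fielder_x, vx, vy):
    """Tangent of the ball's elevation angle as seen by a fielder at fielder_x."""
    ball_x = vx * t
    ball_y = vy * t - 0.5 * G * t * t
    return ball_y / (fielder_x - ball_x)


def optical_acceleration(t, fielder_x, vx, vy, dt=1e-3):
    """Second time derivative of tan(elevation), by central finite differences."""
    before = tan_elevation(t - dt, fielder_x, vx, vy)
    now = tan_elevation(t, fielder_x, vx, vy)
    after = tan_elevation(t + dt, fielder_x, vx, vy)
    return (before - 2.0 * now + after) / (dt * dt)


def suggested_move(acceleration, tolerance=1e-4):
    """Turn the perceived acceleration into a correction; no trajectory is predicted."""
    if acceleration > tolerance:
        return "back up"  # image speeding up: the ball is heading over the fielder
    if acceleration < -tolerance:
        return "run in"   # image slowing down: the ball will drop in front
    return "hold"         # steady rise: the fielder is on course


vx, vy = 15.0, 20.0              # hypothetical launch velocities (m/s)
landing_x = vx * (2.0 * vy / G)  # used only to place the demo fielders
for offset in (-10.0, 0.0, 10.0):
    accel = optical_acceleration(1.5, landing_x + offset, vx, vy)
    print(offset, suggested_move(accel))
```

A running fielder would apply the same correction over and over, changing speed until the image rises steadily. In that sense the action serves the perception, not the other way around.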
Affect and the Feeling of Objectivity
So, we have now considered how the brain’s generative model may construct certain structural aspects of our experience. But structure alone is not enough to keep an organism alive. It has to matter. It has to grip the body, guide attention, and motivate action. So now we need to ask the deeper question: how does perception come to feel so undeniably real, important, and objective?
Intuitively, we think of consciousness as a high-level cognitive process—the domain of the cerebral cortex, where language and abstract reasoning reside. However, a significant amount of research—particularly in Affective Neuroscience—suggests that conscious experience is deeply grounded in older affective systems: the brain’s ongoing evaluation of the body’s state.
In the 1990s, Jaak Panksepp mapped primal emotional systems onto ancient subcortical circuits shared across mammals [12]. Around the same time, Antonio Damasio emphasized a striking clinical pattern: large regions of the cortex can be damaged while basic wakeful awareness remains possible, whereas severe damage to the ancient hubs of arousal and homeostatic regulation can abolish consciousness altogether [13]. Some clinical reports even describe children born with little or no cerebral cortex who still showed wakefulness, emotional responsiveness, smiling, crying, and reactions to music [14]. Taken together, these findings suggest that affect and bodily regulation may play a foundational role in consciousness—especially in the basic felt sense of being alive.
Earlier, we defined valence as a low-dimensional regulatory heuristic: a basic estimate of whether the organism is doing “well” or “badly” relative to its own viability. As the human nervous system grew more complex, it built increasingly abstract models on top of this affective foundation, generating what we intuitively experience as emotions and mood.
From the controlled hallucination perspective, emotions and moods are generated from the inside out; joy, regret, anxiety, and countless other affective experiences can be understood as context-sensitive inferences about our biological state.
Crucially, in keeping with our broader discussion, these affective processes are control-oriented: they actively contribute to the regulation of our essential biological variables. When a car suddenly swerves into our lane, the instant jolt of panic we feel can be understood as part of a rapid predictive interpretation of the situation: the body is in danger, and immediate evasive action is required. That interpretation helps trigger a cascade of actions, such as releasing adrenaline, increasing heart rate, tensing the body, and ultimately slamming the brakes.
Antonio Damasio described an even subtler dynamic with his famous somatic marker hypothesis, arguing that the brain uses bodily signals to help guide higher-level decision-making [15,16]. When the brain simulates possible future actions, the body can physically react to those simulations before the conscious mind settles on a choice. According to Damasio, a tightening stomach, a racing heart, or a gut feeling can act as “markers,” biasing cognition toward some options and away from others.
Now, we can consider how this affective influence shapes our broader experience and sense of reality. As the human brain evolved its capacity for complex conscious perception, newer cognitive networks evolved in constant interaction with older affective systems. As a result, we perceive the outside world through the body’s current state [17]. A hill can seem steeper when we are tired. A neutral face can feel more threatening when our blood sugar is low.
Taken to its furthest implication, some scientists argue that the brain’s guesses about the body’s internal state may help form our basic sense of reality. Psychologist Lisa Feldman Barrett describes this as affective realism: the idea that the brain uses affective valence to make a perceptual model feel real [18].
Basically, when the brain executes a predictive model and resolves sensory uncertainty without major conflict, affective valence can help coordinate the distributed network around that model—marking it as reliable, salient, and action-guiding.
This brings us back to our earlier discussion of conviction. It is plausible that a model weighted strongly enough to coordinate the body, guide attention, and organize action would manifest consciously as compelling—immediate, self-evident, and real. We can think of this as the brain unconsciously committing to the world being a certain way—resolving enough uncertainty for the organism to act without endlessly second-guessing itself.
Ironically, this profound “seeming to be real” may be exactly what keeps pulling us into dualistic thinking—making it difficult for our intuitions to accept that perception is not a transparent reflection of a mind-independent reality.
And centuries after Aristotle, Descartes, and Kant, we arrive at an implication that would likely have been unfathomable to all three: the very intuition that makes objective reality feel directly available to us may itself be just that—a feeling. Not proof of direct contact with reality, but the felt signature of predictive models operating with high confidence and minimal conflict.
Rationalization and Post-Hoc Narrative
What happens when our models inevitably fail?
Of course, when a model fails in a salient enough way, the uncertainty would plausibly feel a certain way—confusion, curiosity, anxiety, or dread—and that felt charge is what pulls the organism to resolve it.
As we’ve mentioned, from an active-inference perspective, resolving uncertainty can involve a form of exploration: we look again, move closer, ask a question, test the environment, change our behavior. Crucially, though, exploring the unknown does not always mean moving through the world; we can also explore uncertainty internally.
To navigate complex uncertainty, human cognition can shift into what some theorists describe as offline inference: temporarily decoupling generative models from immediate sensory input and motor output in order to simulate possible actions and futures [19]. We can intuitively connect this to what we subjectively experience as thinking, reasoning, or imagination.
Exploring uncertainty internally is less metabolically expensive, and far less dangerous, than physically testing possibilities. We can contemplate, “What happens if I jump off this cliff?” without actually doing it. As philosopher Karl Popper famously noted, thinking allows “our hypotheses to die in our stead.”
Perhaps unintuitively, one major utility of human consciousness may lie in its capacity to tolerate and navigate this space of non-committal uncertainty. We can hold multiple, conflicting possibilities at once, suspending physical commitment while the brain explores them internally. That uncertainty can then be resolved when particular possibilities become salient enough to guide action.
We saw earlier that affective valence may help perception feel real, salient, and action-guiding. Damasio’s somatic marker hypothesis gave us one version of this logic: bodily feelings can bias the brain toward some possible actions and away from others.
Given this framework, we should expect similar affective signals to play a role in more cognitively complex reasoning as well. As the brain resolves uncertainty internally, some possibilities begin to feel more settled than others. Cognitive psychology has studied this as the “Feeling of Rightness”: a metacognitive sense that a thought, answer, or inference is reliable enough to trust [20].
This research suggests that when humans solve probability, syllogistic, or spatial reasoning tasks, they often rely on a metacognitive affective signal: the felt fluency or ‘rightness’ of the reasoning process itself. If a thought feels fluent, the brain may treat that feeling as evidence that the answer is trustworthy—even when the logic has not actually been checked.
This helps make sense of many cognitive biases. Mental shortcuts, like the availability or recency heuristics, are computationally cheap and fast, which means they often process fluently. That fluency can generate the very felt trust that allows the conclusion to settle into place without further scrutiny. This dynamic suggests why we can be so blind to our own biases.
Related research also highlights that humans are often better at relative confidence than absolute confidence. In other words, we may be decent at sensing which of two answers feels more reliable, while still being poor at knowing whether either answer is actually correct. A wrong answer can feel smooth, obvious, and complete, even if its logic is flawed. The visceral feeling of being right does not necessarily guarantee that we are right.
Because the brain relies so heavily on fast, affective heuristics, we are vulnerable to a startling illusion about conscious thought. The processes that become available to us as deliberate reasoning are powerful, but relatively slow and metabolically expensive. Meanwhile, beneath awareness, the body and brain are constantly executing rapid, distributed predictions just to navigate the world in real time. So what we experience as conscious reasoning often arrives downstream—interpreting, organizing, and sometimes rationalizing processes that were already underway.
This dynamic was famously illustrated by neuroscientist Michael Gazzaniga’s split-brain experiments in the 1970s [21]. In patients whose left and right brain hemispheres had been surgically disconnected—dramatically limiting communication between the two sides—Gazzaniga found evidence for what he called the brain’s ‘interpreter’: a tendency to construct coherent explanations for actions whose causes may be unavailable to awareness.
In one classic experiment, researchers flashed the command “Walk” to the patient’s right hemisphere. The patient stood up and began walking. When asked why he had stood up, the language-dominant left hemisphere had to answer, even though it lacked access to the original command. We might expect the answer to be, “I don’t know.” Instead, the patient fabricated a coherent narrative: “I’m going to the kitchen to get a Coke.”
The unsettling implication is that when the body acts, consciousness can produce a plausible reason after the fact. And because that reason feels coherent and fluent, we may genuinely experience ourselves as the rational author of a thought whose causes began outside awareness.
In explicit reasoning, when someone experiences a high Feeling of Rightness but is asked to deliberate or explain their answer, reflective reasoning often gets recruited to justify the intuitive response [22]. This is where something like confirmation bias begins to make intuitive sense: a logical rationale can be built around a conclusion the system has already started to treat as settled. Conversely, a low Feeling of Rightness can help trigger more deliberate evaluation, making us more likely to question the first answer rather than defend it.
While these insights may seem disappointing to our intuition that we are the author of our thoughts, the larger takeaway is that linguistic rationalization still plays an essential role in conscious processing. By condensing a multifaceted field of cognitive activity into a single articulation—a ‘thought,’ a ‘reason,’ a ‘story’—language helps coordinate otherwise distributed networks and reduce internal conflict. Once again, even rational thought can be understood as part of the broader regulatory project: resolving uncertainty, coordinating action, and keeping our internal model coherent enough to move forward.
The Self and the Feeling of Freedom
Finally, this brings us to the fundamental topic of the Self. If conscious thought is not the origin of our choices, but an expression of deeper embodied processes, then the familiar picture of the “I” begins to look much less straightforward. Intuitively, we feel as though we are an immaterial soul looking out through our eyes, pulling the levers of the body, and authoring our choices. But to lean on Wittgenstein’s wisdom from earlier: how things seem is not necessarily how they are.
So who are we? And what kind of free will do we actually have?
Following the framework we’ve been building, Anil Seth argues that the feeling of ‘being you’ is not the entity doing the perceiving; the self is itself a perception—a high-level controlled hallucination.
On this view, the brain generates the feeling of being a persistent identity as a kind of biological center of gravity—a stable predictive heuristic that helps organize countless distributed processes around the model of a single, continuous organism.
To regulate the body effectively, the brain needs a relatively stable self-model. It needs some enduring reference point from which to evaluate bodily states, guide action, organize memory, and track well-being over time. In other words, our self-model carries enormous weight within the system.
This is why, despite the constant turnover of our physical bodies and mental states, we experience ourselves as persisting through time. Intuitively, if the self-model fluctuated as rapidly as our chemistry or passing thoughts, it would be much harder to coordinate action, memory, and regulation. We benefit from a sense of self stable enough to organize change, even though the system beneath it is always changing.
And notably, cases where the ordinary unity of self breaks apart reveal the self-model as an integration of separable processes. The narrative self can erode in severe amnesia or dementia. The sense of agency can falter in conditions like alien hand syndrome or schizophrenia, where people may lose the felt authorship of their own actions. The perspectival self can shift during out-of-body experiences or dissociative states. The bodily self can distort in phantom limb syndrome, where a missing limb is still felt, or in rare cases where a person may deny ownership of one of their own limbs.
These cases make it difficult to sustain the intuition of the self as a singular, indivisible soul. What we call the self begins to look more like an integrated, constantly maintained model of the organism.
If the self is itself a constructed perception, what does that mean for our choices? Why do we so viscerally feel that we are the ones freely directing our actions?
As with many aspects of conscious experience, we can think of the feeling of volition as a control-oriented perception emerging from distributed neural and bodily processes. From this perspective, predicting “I am causing this action” might help the brain distinguish sensory changes produced by the organism from sensory changes imposed by the environment.
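One way to picture this self-versus-environment distinction is a simple comparator, sketched below in Python. This is a toy illustration, not a model taken from the sources above: the linear `gain` (how much sensory change a motor command is expected to produce) and the `tolerance` are hypothetical parameters. The system predicts the sensory change its own command should cause, compares that with what actually arrives, and attributes any large mismatch to the outside world.

```python
def attribute_change(motor_command, observed_change, gain=1.0, tolerance=0.1):
    """Label an observed sensory change as self-caused or externally caused.

    `gain` and `tolerance` are illustrative assumptions, not measured values.
    """
    predicted_change = gain * motor_command        # what "I am causing this" predicts
    residual = observed_change - predicted_change  # what the prediction fails to explain
    source = "self" if abs(residual) <= tolerance else "environment"
    return source, residual


# The hand moves about as far as commanded: the change is attributed to the self.
print(attribute_change(motor_command=1.0, observed_change=1.05))

# No command was issued, yet the hand moved: the change is attributed to the world.
print(attribute_change(motor_command=0.0, observed_change=0.8))
```

On this picture, the felt sense of authorship would correspond to the case where the prediction explains almost everything that was sensed.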
Prospectively, intention can be understood as a kind of self-fulfilling prophecy. To act deliberately is, in part, to model a possible future state and give it enough motivational weight to organize the body toward it. The brain and body then work to bring sensory feedback into alignment with that predicted state, while conscious narrative helps frame the unfolding process as a choice: I decided to do this.
Crucially, this feeling of volition is not merely useful in the moment; it also helps shape future action. Recall our earlier discussion of learning: when an action produces a better-than-expected or more useful outcome, neuromodulatory systems such as dopamine can help update the circuits involved—including circuits involved in our narratives of agency, responsibility, and choice.
When things go wrong, we experience the visceral intuition: I could have done otherwise. But this feeling does not mean we could have stepped outside the laws of physics and acted differently in that exact moment. Rather, the feeling itself may be part of the learning process, helping the system update so that, in a future similar situation, we might actually act differently.
This challenges the idea of libertarian free will: the sense that, in the exact same moment, with the exact same body, brain, history, and world, the self could have somehow chosen otherwise. But it would be a mistake to call our sense of freedom a mere illusion. A conscious intention can be understood as a perceptual best guess—generated by the system, useful to the system, and as phenomenally real as any other aspect of experience.
From our perspective, there is no stepping outside of our controlled hallucination. But this discussion does have practical value: we can still experience ourselves as able to reflect on our intuitions, revise our narratives, and deliberately shape how we engage with them. By clarifying our intentions, we place weight on certain predictions about the future, biasing the system toward particular patterns of attention, action, and regulation that may lead us toward those ends.
For example, if we reflect on the thought, “I am going to remain calm during this interview,” we are strengthening a high-level expectation. When the interview begins and our heart rate spikes, we are more likely to notice that bodily shift and respond to it—slowing our breathing, relaxing our posture, and bringing ourselves back toward the intended state.
Ultimately, we do not have the freedom to violate the laws of physics, and we cannot simply choose what we experience in each moment. But we can reflect on the narratives our rationalizing brain provides, reshape the intentions we give weight to, and gradually alter the expectations that influence what we experience and how we act in the future.
Chapter 10: Shared Understanding and Effective Explanation (The Epistemic Problem of Subjective Minds)
We must now confront a profoundly unsettling implication of everything we’ve discussed.
From the perspective we’ve been building, what each of us perceives is deeply subjective—shaped by genetics, prior experience, culture, attention, and even the immediate effects of what we ate for breakfast. Our standards of what feels “true” or “rational”, our visceral feelings of realness, and the explanations we give for our own beliefs all emerge from our individual predictive architectures.
And because those internal models are constantly updating through learning, memory, and other bodily changes, our perception of the world is always shifting—despite the brain’s remarkable ability to make experience feel stable and continuous.
This leads us to a serious philosophical problem: if each human being perceives through a unique, internally generated model of reality, how can we agree on anything? If all experience is, in some sense, a controlled hallucination, how do we ever arrive at shared truth?
Social Regulation and Pragmatically Similar Understandings
The answer begins with the fact that social species biologically regulate one another.
Every cognitive and affective process we have discussed so far has been tied to a basic biological problem: an organism must keep itself within viable bounds. For social species, this involves predicting, coordinating with, and being regulated by other organisms.
Over hundreds of thousands of years, hominid brains repeatedly encountered the same physical world—and, just as importantly, each other. From the brain’s perspective, another person is an extraordinarily complex and volatile source of sensory evidence. To reduce uncertainty and act effectively, the brain has to build predictive models of other people.
But unlike many physical objects, the hidden causes driving another person are deeply opaque. As Anil Seth notes, the light reflected from an object is closely tied to that object’s physical structure. But the sensory evidence of another person’s mental state arrives indirectly—through facial expressions, gestures, posture, tone, and speech. Each layer introduces ambiguity, leaving enormous room for uncertainty and misinterpretation.
To navigate this ambiguity, humans developed mechanisms for Theory of Mind: the cognitive ability to infer the hidden causes behind other people’s behavior—their emotions, beliefs, goals, and intentions. These inferred hidden causes affect both what we consciously perceive, and, crucially, how we act.
We can think about social interaction through the lens of Active Inference. Just as we move our bodies to bring sensory evidence closer to expected states, we act socially to shape other people’s responses in predictable ways. We smile not only to express our own pleasure, but to help regulate the mood of the interaction. When we speak, we are trying to bring another person’s expectations closer to our own: helping them notice what we notice, understand what we mean, and prepare to act in a coordinated way.
And just as with physical predictions, we test these social hypotheses against sensory evidence. We constantly monitor facial expressions, vocal tone, posture, and behavior, using that feedback to update our predictions and adjust what we think and do next.
Because humans are hypersocial, this interpersonal active inference becomes deeply reciprocal. My model of your mental state often includes a model of how you are modeling me. What do you think I am thinking? What do you think of me? How am I appearing to you?
Anil Seth highlights a profound implication here: the human sense of self may be deeply shaped by social context. If a brain never had to predict other minds, it might have far less need to model itself as an object in the minds of others. In this sense, part of what it means to experience oneself as a “self” may depend on the brain’s ability to model how it is perceived by other people.
Through this constant interpersonal collision of generative models, humans evolved alongside the very cultures they created. In evolutionary biology, this kind of process is often discussed through the Baldwin effect: learned behaviors and flexible adaptations can change the selective environment, making certain genetic dispositions more advantageous over time [1].
At the group level, this creates a powerful social feedback loop. When a group develops a more effective culture—better coordination in hunting, stronger social bonds, more effective teaching—that group gains a survival advantage. This success creates new selective pressures within the group: individuals better suited to learning, communicating, and contributing to the cultural system are more likely to thrive and reproduce. Over generations, our biology adapted to our increasingly effective culture, and our culture co-evolved with our biology.
Through this lens, we can see how communication, and eventually language, emerged as a biological mechanism for social regulation. Language allows isolated brains to coordinate their generative models so they can cooperate, regulate one another, and survive.
Karl Friston and cognitive psychologist Christopher Frith have explored how two predictive brains in interaction can form a coupled dynamic system [2]. Each agent tries to reduce uncertainty by predicting the other, while also acting in ways that make itself more predictable. Over time, mutual understanding becomes more efficient as both agents continually adjust and align their internal models.
But what does it actually mean to “align” our models? Because every human brain has a unique, constantly shifting architecture, two people can never hold the exact same understanding of a concept. Instead, through interaction and language, we arrive at pragmatically similar understandings: models similar enough to coordinate action, communicate meaning, and avoid constant social friction.
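As a rough picture of this kind of alignment, consider the toy Python sketch below. It is not Friston and Frith's model: each agent's "understanding" is reduced to a single number, and the update `rate` and the `tolerance` for "similar enough" are assumptions chosen only for illustration. Each round, both agents signal their current estimate and nudge it toward what they heard from the other.

```python
def align_models(belief_a, belief_b, rate=0.2, tolerance=0.5, max_rounds=100):
    """Two agents nudge their estimates toward each other's signal each round.

    Returns the final beliefs and the number of rounds needed to get within
    `tolerance`. All parameters are illustrative assumptions.
    """
    rounds = 0
    while abs(belief_a - belief_b) > tolerance and rounds < max_rounds:
        signal_a, signal_b = belief_a, belief_b   # each agent expresses its estimate
        belief_a += rate * (signal_b - belief_a)  # A updates toward B's signal
        belief_b += rate * (signal_a - belief_b)  # B updates toward A's signal
        rounds += 1
    return belief_a, belief_b, rounds


a, b, rounds = align_models(0.0, 10.0)
print(rounds, round(abs(a - b), 3))
```

The loop stops as soon as the two estimates are close enough to coordinate, not when they are identical. That is the sense of "pragmatically similar" intended here: a small gap remains, but it no longer gets in the way.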
We can think about what “pragmatically similar” might mean in different contexts. Consider a student answering a question on a test. To get the question right, the student does not need to possess the teacher’s exact understanding. Their internal model only needs to be similar enough to produce the answer the teacher expects.
Intuitively, our shared understandings are layered and partially overlapping. We each belong to countless cultures—families, communities, professions, hobbies—and we maintain pragmatically similar understandings with different groups about thousands of different topics.
By dynamically interacting, updating our models, and minimizing social friction, we develop pragmatically shared knowledge. And because this alignment often happens so smoothly, it can generate the powerful metacognitive feeling that a single shared truth holds us together.
Reliable Knowledge Without an Objective Referee
This raises another question: how did human beings move from locally shared understandings to forms of shared knowledge that feel more authoritative—more reliable, more generalizable, and somehow more “right” than mere opinion or anecdote?
Today, much of our shared knowledge can feel self-evidently authoritative—scientific facts, moral intuitions, practical wisdom, cultural assumptions, and the background sense of what “reasonable people” believe. But a closer look at history suggests that this authority did not appear all at once; what we now treat as reliable knowledge emerged through the gradual development of cultural norms and practices for testing, correcting, and validating shared models of the world.
To understand how we got here, we can briefly walk through a compressed, necessarily selective history.
Shared knowledge has always been, at least in part, a practical achievement: a way of coordinating behavior in an uncertain world. A tool either worked or shattered, a plant nourished or poisoned, a social rule preserved trust or produced conflict. Over time, these practical lessons were compressed into shared habits, norms, stories, and traditions—ways of preserving what seemed to work across generations. In this broad sense, much of what we call “wisdom” emerged before explicit theory: as culturally inherited patterns for acting, perceiving, and evaluating what mattered.
Of course, this pragmatic inheritance was powerful but imperfect. A belief could survive because it helped a group coordinate, because it felt meaningful, because it was enforced by authority, or simply because it was never seriously challenged. At the same time, the practical need to predict and influence the world with greater precision kept pushing inquiry toward more specialized standards of evidence.
So the deeper historical question becomes: How did human beings develop practices for distinguishing models that merely felt authoritative from models that could more reliably predict the world and be tested, criticized, and corrected over time?
Ancient cultures across the world engaged in sophisticated observation to track the stars, manage agriculture, build calendars, and navigate their environments. But much of this inquiry remained embedded within mythic, religious, or spiritual frameworks. For centuries, the impulse to investigate the material world was often intertwined with the experience of the observer. Alchemy, for example, could be intensely empirical—developing glassware, distillation techniques, and careful methods for manipulating materials—while also interpreting material transformation through spiritual or symbolic meaning. The modern ideal of standardized, impersonal observation had not yet become the dominant model of inquiry.
A crucial development occurred during the Islamic Golden Age through thinkers like Ibn al-Haytham, who helped advance a more disciplined ideal of inquiry: observation should be structured, tested, and corrected rather than simply trusted at face value. Increasingly, the reliability of knowledge depended less on the observer’s personal experience or authority and more on the repeatable procedures used to investigate the world.
Over the next centuries, these ideas spread and transformed. During the European Scientific Revolution, thinkers like Francis Bacon, Descartes, and Galileo helped formalize different pieces of the emerging scientific worldview: disciplined observation, mathematical description, mechanical explanation, and the crucial separation between subjective experience and objective measurement.
But this was not a clean break. It was a messy, centuries-long process of intellectual negotiation. The very minds that forged these new rules were often still embedded in spiritual, theological, and mystical worldviews. Isaac Newton, for example, devoted enormous attention to alchemy, theology, and biblical chronology alongside his work in mathematics and physics [3]. But gradually, empirical inquiry carved out its own distinct domain of authority, achieving its immense power precisely by narrowing its scope to what could be measured, modeled, and publicly tested.
Crucially, this evolution of method gradually reshaped the cultural norms for what counted as “reasonable”. The growing emphasis on disciplined, publicly testable evidence began to seep into the broader culture, changing the very standards by which beliefs and arguments were judged. Over time, this shaped what could feel coherent, trustworthy, and intellectually satisfying. A claim unsupported by evidence increasingly struggled to produce the same feeling of certainty.
In the modern era, this ethos has been institutionalized into a global infrastructure designed to help us detect, correct, and minimize shared errors. Peer review gradually emerged as a major social mechanism for error-checking. Falsifiability offered an influential ideal for keeping models exposed to possible failure. And statistics gave us a formal mathematical language for managing uncertainty, allowing us to quantify degrees of confidence: not “this is absolutely true,” but “here is the evidence, and here is how strongly this conclusion is supported.”
Science, therefore, is not a flawless, timeless window into an independent reality. It is a historically developed, highly specialized set of cultural norms and tools for reducing uncertainty about our shared understanding of the world. And ultimately, these very norms have led us to the robust and sophisticated models of physics and neuroscience that have fueled this entire conversation.
The Paradox of Explaining Perception
We must now confront the ultimate paradox of this entire essay and, in a sense, the challenge of explaining perception in general. We have just used empirical science to challenge the intuition that perception gives us direct, unmediated access to the physical world. But the scientific concepts we used to make that argument (“brain”, “body”, “stimulus”, “model”) are themselves contents of human experience.
These concepts reflect our pragmatically shared understandings, made possible by overlapping education, culture, and language. Together, they give rise to the concept of a “physical world” that feels entirely independent of our experience—and, at the same time, responsible for causing it. We are using the brain’s generative model to explain the mechanics of its own generative model.
Many might be tempted to view this recursion as a fatal flaw that disqualifies the entire scientific framework. But if we take the success of empirical science seriously, we should understand this loop not as a failure, but as an inherent boundary condition of human knowledge. We cannot step outside our own perspective and compare our models against reality “as it is in itself”; there is no absolute, perspective-free judge who can finally certify that our explanation is right.
Ultimately, no assertion about reality escapes the conditions of human experience. The strict physicalist, the idealist, and the dualist all argue from within perspectives shaped by their own priors, intuitions, and standards of explanation. When they argue over the “true” metaphysical nature of reality, they are unlikely to settle the debate by discovering one final, perspective-free fact or reason that ends interpretation once and for all. They clash, in part, because each framework offers an interpretation of reality that can feel compelling, coherent, and complete.
A key lesson of this paradox is a radical re-evaluation—and renewed appreciation—of the role that communication plays in explanation. Because we cannot step outside our individual models, we can understand communication not as the simple transfer of universal facts from one brain to another, but as the pragmatic process of aligning subjective understandings.
This means that our choice of language matters deeply. The words we use act as distinct sensory stimuli for other people, shaping the cascade of inferences within each listener’s generative model. When we attempt to explain perception, a central goal is to develop an explanation that brings our understandings as close together as possible. The explanation has to meet the listener where they are: clear enough to keep them oriented, precise enough to preserve the idea’s nuance, and substantive enough to feel genuinely illuminating.
But this leaves us stranded right back in our fundamental paradox. The framework we have discussed today feels incredibly compelling to me, but what justifies it over any other philosophical stance? How can we determine which explanations are more or less appropriate than others?
If we evaluate an explanation by relying only on what feels reliable, we remain vulnerable to our affective biases. If we trust our immediate biological priors too strongly, we may be pulled back toward a form of Direct Realism, because it intuitively feels like we are looking out through a transparent window. Conversely, if we over-index on the seemingly self-evident nature of consciousness, while forgetting that even our conceptual thinking about “consciousness” is itself shaped by our generative model, we may slide toward solipsistic thinking—treating the mind as the only undeniable reality and the physical world as a mere illusion. Both extremes fall prey to the subjective trap we find ourselves in.
So, as I sit here with my own subjective perspective, I need something more robust to point me in a direction. I need some heuristic within my own understanding that I can notice and follow.
As you may have guessed, this is where we can return to our earlier discussion of cultural norms: the standards developed over centuries to help us reduce uncertainty about our experience of the world. In particular, the norms of empirical science and logic are the most reliable collective mechanism we possess; they have allowed us to split the atom, eradicate diseases, build global communication networks, and, most relevant to our journey today, connect patterns of brain and body activity to aspects of conscious experience.
To be clear, we cannot treat the products of empirical science as absolute dogma, as if they were self-evident truths innately written into the fabric of the universe. Above all else, what we can champion are the normative standards that brought those products into existence. These very standards may eventually reveal our current understanding of physics, neuroscience, and perception to be incomplete and in need of updating. They may even force us to revise the standards themselves, as we continue to discover more effective ways to evaluate observation and evidence.
Collectively, these norms impose an ongoing responsibility: a methodology that must be continually interpreted, practiced, and updated. They demand that we provide evidence, show our work, and remain radically open to correction—ensuring that even these very standards remain vulnerable to revision. In return, they provide shared grounding. Designed to expose affective bias and reduce collective error, they may be the most rigorous heuristic we have for navigating the uncertainty inherent in subjective experience.
Ultimately, what the norms of empirical science and logic really give us are rules for understanding and communication. They provide cultural boundaries for how we ought to think and talk about perception—constraining what we can appropriately claim as true, and in a deeper sense, what we are justified in believing.
So, finally, I propose this: we ought to respect the explanations that are most in line with the standards of empirical science and logic. Not because these explanations map perfectly onto some transcendent, absolute “Truth”—once again, we can never fully evade the subjectivity of our own perception, nor the ambiguity of our language. Rather, we can choose to respect these explanations because they are the most effective pragmatic tools we possess for reducing ambiguity and finding shared footing.
To fully embrace this framework, I must explicitly acknowledge that what I am doing in this essay is proposing how we ought to think and talk about perception. I am not making an assertion about the fundamental nature of perception or the world we consciously feel we perceive. My argument comes from my own subjective understanding, yet I have attempted to justify it using rigorous, comprehensive reasoning and careful communication.
And in a strange way, the beauty of this explanation is that it does not pretend to escape the recursion at the heart of this essay. It acknowledges the paradox inherent in using culturally shared concepts to explain perception, and it embraces that loop by relying on the tools of shared language and science to explain why that paradox exists—and how we can navigate it.
Ultimately, this essay is simply my attempt at an accessible, intuitive, and scientifically grounded explanation of how we perceive the world.
A Pragmatic, Optimistic Conclusion
Alright, let’s close this out with a pragmatic, optimistic conclusion.
For centuries, humans have been entranced by the search for absolute ontological certainty—trying to determine what is objectively real outside of experience.
With this essay, I am trying to explain why that pursuit may be misguided. Contemporary neuroscience challenges the transparent-window view of reality, suggesting that perception is better understood as an active, biologically practical construction—shaped from within, constrained by the environment, and organized around the needs of a living body. We cannot attain absolute, mind-independent certainty from outside our own experience.
The journey to this realization was long and demanding. But crucially, this deeply unintuitive conclusion would have felt inadequate had we not walked through the history, science, and philosophy that made it necessary.
So where does this leave us?
I believe a truly modern philosophy should not be the search for a final, optimal explanation. Instead, we can embrace philosophy as an ongoing, dynamic process of developing our explanations: leveraging our current intuitions while grounding them in the evolving standards and discoveries of empirical science and logic.
We will never arrive at a final, “optimal” explanation. Because our collective knowledge, culture, language, and scientific tools will constantly change, we must continually update our explanations.
I hope you enjoyed this inherently imperfect explanation. We will continue to learn and develop these ideas on this channel, so make sure to subscribe and follow to join in on the fun. A massive shoutout to all the viewers who helped motivate and develop this script, and a profound thank you to Anil Seth and the other authors whose work contributed to this explanation.
Enjoy the rest of your day.
References and Further Reading
This essay draws heavily on the following sources:
Seth, Being You (2021)
Hohwy, The Self-Evidencing Agent (2026)
Barrett, Seven and a Half Lessons About the Brain (2020)
Damasio, The Strange Order of Things (2017)
Chapter 1
SEP on Aristotle’s Psychology
Sextus Empiricus, Outlines of Scepticism, trans. Annas & Barnes (2000)
Ibn al-Haytham, Book of Optics [Further Reading]
Kepler, Ad Vitellionem Paralipomena (1604)
Galileo, Sidereus Nuncius (1610) [Further Reading]
Hooke, Micrographia (1665)
Galileo, The Assayer (1623) [Further Reading]
Chapter 2
Descartes, Meditations on First Philosophy (1641) [Further Reading]
Locke, Essay Concerning Human Understanding, Book II, Ch. VIII (1690) [Further Reading]
Elisabeth of Bohemia–Descartes Correspondence (1643) [Further Reading]
Berkeley, Principles of Human Knowledge (1710)
Hume, A Treatise of Human Nature (1739)
Chapter 3
Kant, Critique of Pure Reason (1781/1787)
SEP on Phenomenology
Chapter 4
Helmholtz, Treatise on Physiological Optics (1860s/1867) [Further Reading]
SEP on Hermann von Helmholtz
Oxenham, Helmholtz (2011)
Chapter 5
Neisser, Cognitive Psychology (1967)
Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (1982) [PDF]
Crump, Information Processing (2021)
Hubel & Wiesel, The Journal of Physiology (1962)
Grill-Spector & Malach, “The Human Visual Cortex” (2004)
Bexton, Heron, & Scott, Canadian Journal of Psychology (1954)
Rumelhart, McClelland & the PDP Research Group, Parallel Distributed Processing (1986)
Douglas, Martin, & Whitteridge, “A Canonical Microcircuit for Neocortex” (1989)
Douglas, Koch, Mahowald, Martin & Suarez, “A Functional Microcircuit For Cat Visual Cortex” (1991)
Felleman & Van Essen, Cerebral Cortex (1991) [Further Reading]
Sherman & Guillery, “Distinguishing Drivers from Modulators” (1998) [Further Reading]
Thorpe, Fize, & Marlot, Nature (1996)
Chapter 6
Gregory, “Perceptions as Hypotheses” (1980)
Dayan et al., Neural Computation (1995)
Kersten, Mamassian & Yuille, Annual Review of Psychology (2004)
Mumford, Biological Cybernetics (1992)
Rao & Ballard, Nature Neuroscience (1999) [PDF][Further Reading]
Smith & Muckli, PNAS (2010)
Muckli et al., “Contextual Feedback to Superficial Layers of V1” (2015)
Chapter 7
Friston, “A Theory of Cortical Responses” (2005) [Further Reading]
Alink, Schwiedrzik, Kohler, Singer, & Muckli, “Stimulus Predictability Reduces Responses in Primary Visual Cortex” (2010) [Further Reading]
Pinto, van Gaal, de Lange, et al., Journal of Vision (2015)
de Lange, Heilbron, & Kok, Trends in Cognitive Sciences (2018) [Further Reading]
Dehaene, Kerszberg, & Changeux, “A Neuronal Model of a Global Workspace in Effortful Cognitive Tasks” (1998) [Further Reading]
Seth, Being You (2021)
Barrett, Seven and a Half Lessons About the Brain (2020)
Clark, “Whatever Next?” (2013)
Chapter 8
Friston, “A Free Energy Principle For The Brain” (2006) [Further Reading]
Friston, Thornton, & Clark, “Free-Energy Minimization and the Dark-Room Problem” (2012)
Joffily & Coricelli, “Emotional Valence and the Free-Energy Principle” (2013)
Hesp et al., Deeply Felt Affect (2021)
Sterling, “Allostasis: A Model of Predictive Regulation” (2012) [Further Reading]
Barrett, How Emotions Are Made (2017) [Further Reading]
Friston et al., “Active Inference and Learning” (2016)
Friston et al., “Active Inference: A Process Theory” (2017)
Pezzulo et al., “Active Inference, Epistemic Value, and Vicarious Trial and Error” (2016)
Sánchez-Cañizares, Entropy (2021)
Chapter 9
Simons & Chabris, Perception (1999)
Rensink et al., “To See or Not to See” (1997)
Fraser & Wilcox, Nature (1979)
Inagaki & Usui, “Visualization and Analysis of Peripheral Drift Illusion” (2011)
Roseboom et al., Nature Communications (2019)
Sherman et al., PLOS Computational Biology (2022)
Gibson, “The Theory of Affordances” (1977)
Tucker & Ellis, Journal of Experimental Psychology (1998)
Cardellicchio, Sinigaglia, & Costantini, “The Space of Affordances: A TMS Study” (2011)
Powers, Behavior: The Control of Perception (1973)
McBeath, Shaffer, & Kaiser, Science (1995)
Panksepp, Affective Neuroscience (1998) [Further Reading]
Parvizi & Damasio, Cognition (2001)
Shewmon, Holmes, & Byrne, Developmental Medicine & Child Neurology (1999) [Further Reading]
Damasio, Descartes’ Error (1994)
Damasio, Philosophical Transactions: Biological Sciences (1996)
Siegel et al., “Seeing What You Feel” (2018)
Barrett, How Emotions Are Made (2017)
Pezzulo et al., Learning and Memory (2016)
Thompson, Prowse Turner, & Pennycook, Cognitive Psychology (2011) [Further Reading]
Gazzaniga, The Ethical Brain (2005)
Kunda, “The Case for Motivated Reasoning” (1990)
Chapter 10
Simpson, “The Baldwin Effect” (1953)
Friston & Frith, Cortex (2015)
Snobelen, Early Science and Medicine (2021)
