Dwarkesh Podcast · May 13, 2026

Adam Marblestone — AI is missing something fundamental about the brain

AI generated article / en / study



Source podcast

Dwarkesh Podcast / Dwarkesh Patel

Key points

The brain's sample efficiency likely comes from evolution encoding complex, domain-specific reward functions and cost functions into the genome, not just from better architecture or learning algorithms—a compact encoding because reward functions can be specified in a few lines of "Python code."
The cortex may implement "omnidirectional inference", the ability to predict any subset of variables from any other subset, which is closer to energy-based models than to the unidirectional next-token prediction of LLMs.
Steve Byrnes's theory of a "learning subsystem" (cortex) and "steering subsystem" (innate reward circuits) explains how evolution can wire up abstract learned concepts (like "Yann LeCun") to innate responses (like shame): the cortex learns to predict the steering subsystem's responses, gaining generalization for free.
The steering subsystem contains far more diverse cell types than the cortex, consistent with the idea that evolution invests genomic real estate in specifying many bespoke reward functions rather than in architectural complexity.
The brain uses multiple RL systems simultaneously: model-free RL in the basal ganglia with value functions (dopamine reward prediction errors), and model-based planning in the cortex that predicts reward outcomes.
A complete mouse brain connectome could cost low tens of millions of dollars with next-generation optical microscopy, and understanding the steering subsystem across several mammal species is achievable for hundreds of millions to low billions—a small investment relative to GPU spending.
Formal mathematics (Lean) creates a perfect RLVR signal for AI, and automating proof search is likely to work, but automating conjecture and mathematical creativity requires solving the harder problem of defining a loss function for "good explanations."
The "Gap Map" reveals that many fields need scalable infrastructure (like connectomics or formal verification) rather than new intellectual breakthroughs, and these gaps can be filled for surprisingly modest investment.


In this wide-ranging conversation, host Dwarkesh Patel and Adam Marblestone, CEO of Convergent Research and a former research scientist on Google DeepMind's neuroscience team, argue that the AI field has systematically underestimated the importance of the brain's reward functions and loss functions, focusing instead on architecture and learning algorithms. The central thesis is that evolution encoded immense complexity into the "steering subsystem" of the brain: a set of innate reward functions, cost functions, and bootstrapping mechanisms that guide learning with remarkable efficiency. The stakes are both scientific and existential: understanding how the genome encodes abstract desires could unlock dramatically more sample-efficient AI and address fundamental alignment challenges. The conversation moves from high-level theory to concrete neuroscience, connectomics, formal mathematics, and the future of scientific infrastructure, with Marblestone offering a distinctive perspective that bridges AI, neuroscience, and the organization of moonshot research.

0:00 The Brain's Secret Sauce: Reward Functions Over Architecture

Marblestone opens by framing what he calls "the quadrillion dollar question": how does the brain learn so much from so little data compared to large language models? His meta-level take is that the field of neuroscience itself needs to be made more powerful technologically before this question can be cracked. But within the framework of modern deep learning—architecture, learning algorithm, initialization, and cost functions—his personal hunch is that the field has "neglected the role of this very specific loss functions, very specific cost functions."

Machine learning typically uses mathematically simple loss functions like next-token prediction or cross-entropy. Marblestone argues that evolution may have built immense complexity into the loss functions themselves: "many different loss functions in different areas turned on at different stages of development. A lot of Python code, basically generating a specific curriculum for what different parts of the brain need to learn." Because evolution has seen countless iterations of what succeeds and fails, it can encode knowledge of the learning curriculum directly into the reward structure.

This perspective reframes a central mystery. The human genome is only about 3 gigabytes, with a tiny fraction relevant to building the brain. If evolution's contribution were primarily architectural hyperparameters—how many layers, how things are wired—it's hard to see how so little information could produce such powerful learning. But if a major component is the reward function, that changes everything: "the reward function in Python, the reward function is literally a line. And so you just have a thousand lines like this, and that doesn't take up that much space."
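To make the compactness argument concrete, here is a minimal sketch. Every rule name, state field, and numeric value in it is a hypothetical illustration rather than anything stated in the episode. It shows two things: each reward rule fits on roughly one line, and a "curriculum" can be expressed as which rules are switched on at which developmental stage.

```python
# Hypothetical innate reward rules. The state keys and numeric values are
# illustrative assumptions, not measured biology.
REWARD_RULES = {
    "salt_when_depleted": lambda s: 1.0 if s["tasting_salt"] and s["salt_deficit"] else 0.0,
    "avoid_spider": lambda s: -1.0 if s["spider_nearby"] else 0.0,
    "social_approval": lambda s: 0.5 * s["smiles_seen"],
}

# A toy curriculum: which rules are active at which developmental stage.
CURRICULUM = {
    "infant": ["salt_when_depleted", "social_approval"],
    "child": ["salt_when_depleted", "social_approval", "avoid_spider"],
}

def total_reward(state, active_rules):
    """Sum the rewards of the rules that are switched on at this stage."""
    return sum(REWARD_RULES[name](state) for name in active_rules)

state = {"tasting_salt": True, "salt_deficit": True,
         "spider_nearby": True, "smiles_seen": 2}
infant_reward = total_reward(state, CURRICULUM["infant"])
child_reward = total_reward(state, CURRICULUM["child"])

# Rough size estimate: 1,000 rules at about 100 bytes of source each,
# compared with the "about 3 gigabytes" genome figure quoted in the text.
rules_bytes = 1_000 * 100
genome_bytes = 3 * 10**9
fraction_of_genome = rules_bytes / genome_bytes
```

Under these assumed sizes, a thousand one-line rules occupy a few hundred-thousandths of the genome figure, which is the sense in which the reward-function encoding is compact.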

12:34 The Cortex as Omnidirectional Inference Engine

Marblestone proposes that the cortex may be doing something fundamentally different from current LLMs: not just next-token prediction, but "omnidirectional inference" or "omnidirectional prediction." Any area of cortex can learn to predict any subset of its inputs from any other subset. This is closer to energy-based models as championed by Yann LeCun—models that learn the joint distribution of all variables, allowing you to "clamp" any arbitrary subset and infer the rest.

This contrasts sharply with LLMs, which compute one particular conditional probability: given the preceding thousands of tokens, what is the probability distribution over the next token? While LLMs can learn to fill in the middle through emergent in-context learning, they are not natively designed for arbitrary inference patterns. The cortex, by contrast, may be natively built so that any area can predict any missing subset of its inputs from any other subset.
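The idea of clamping an arbitrary subset of variables and inferring the rest can be sketched with a toy joint distribution. This is brute-force enumeration over three binary variables with made-up probabilities, not an energy-based model and not a claim about how cortex computes. The point is only that one joint model answers queries in any direction.

```python
from itertools import product

# Toy joint distribution over three binary variables.
# All probabilities are invented for illustration.
VARS = ("rain", "sprinkler", "wet")

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) for one complete assignment."""
    p = (0.2 if rain else 0.8) * (0.3 if sprinkler else 0.7)
    p_wet = 0.95 if (rain or sprinkler) else 0.05
    return p * (p_wet if wet else 1.0 - p_wet)

def infer(query, clamped):
    """P(query = 1 | clamped) for any query variable and any clamped subset."""
    num = den = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if any(world[k] != v for k, v in clamped.items()):
            continue  # skip worlds that contradict the clamped values
        p = joint(**world)
        den += p
        num += p * world[query]
    return num / den

# The same model is queried "forward", "backward", and with two clamps.
p_wet_given_rain = infer("wet", {"rain": 1})
p_rain_given_wet = infer("rain", {"wet": 1})
p_rain_given_wet_and_sprinkler = infer("rain", {"wet": 1, "sprinkler": 1})
```

A next-token predictor corresponds to fixing one query direction in advance. Here the direction is chosen at query time, at the cost of enumerating every assignment, which does not scale beyond toy problems.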

Marblestone connects this to the brain's relationship with the body: the cortex is "basically riding on top of a lizard brain and lizard body." The cortex learns to predict not just external sensory data but also internal states—whether a muscle is about to tense, whether a reflex is about to trigger, whether heart rate is about to increase. This predictive capacity becomes the foundation for more complex learning.

22:20 Amortized Inference and What the Genome Actually Stores

The conversation turns to amortized inference—the idea that instead of performing expensive Bayesian inference at test time (sampling possible causes, checking which explains observations), a neural network learns a direct mapping from observation to cause. This is what feedforward networks do: they "amortize" the inference process into a single forward pass.
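The contrast can be illustrated with a small sketch, assuming a made-up two-cause generative model. `posterior_by_enumeration` does the work at query time, while `train_amortized` simulates the model once and stores a direct observation-to-cause mapping. A lookup table stands in for the feedforward network here, which is a simplification.

```python
import random

# Toy generative model with invented numbers: a hidden cause emits a noisy sound.
PRIOR = {"cat": 0.5, "dog": 0.5}
LIKELIHOOD = {"cat": {"meow": 0.9, "bark": 0.1},
              "dog": {"meow": 0.2, "bark": 0.8}}

def posterior_by_enumeration(obs):
    """Test-time inference: score every candidate cause for each query."""
    scores = {c: PRIOR[c] * LIKELIHOOD[c][obs] for c in PRIOR}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def train_amortized(n_samples, seed=0):
    """Amortized inference: simulate once, then store an obs -> cause table."""
    rng = random.Random(seed)
    counts = {"meow": {"cat": 0, "dog": 0}, "bark": {"cat": 0, "dog": 0}}
    for _ in range(n_samples):
        cause = rng.choices(list(PRIOR), weights=list(PRIOR.values()))[0]
        obs = "meow" if rng.random() < LIKELIHOOD[cause]["meow"] else "bark"
        counts[obs][cause] += 1
    return {o: {c: n / sum(cs.values()) for c, n in cs.items()}
            for o, cs in counts.items()}

exact = posterior_by_enumeration("meow")
amortized = train_amortized(20_000)   # answering a query is now a lookup
```

The amortized table only approximates the exact posterior, and its quality depends on how much simulated data it saw, which mirrors the tension the skeptics raise.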

Marblestone notes that there's a fundamental tension here. Some probabilistic AI researchers argue that real inference inherently requires test-time compute—that amortization is just "a crappy approximation that's never going to" work without infinite data. They might point to the brain's stochastic neurons as evidence that the brain does real, non-amortized inference. Yet perception works in milliseconds, suggesting some baking-in of approximations.

The key insight for evolution is that it faces severe bandwidth constraints: everything must pass through the genome. If the learning subsystem (cortex) has relatively little pre-initialization and learns within a lifetime, then evolution didn't amortize much into that network. Instead, it amortized into "a set of innate behaviors and a set of these bootstrapping cost functions or ways of building up very particular reward signals." This is a much more compact encoding—specifying the learning algorithm and basic architecture once, then writing many small "Python code" snippets for specific reward functions.

Marblestone cites evidence from single-cell atlases: there are far more diverse and bespoke cell types in the steering subsystem (hypothalamus, brainstem) than in the learning subsystem (cortex). The cortex has relatively uniform cell types, enough to build a learning algorithm and specify some hyperparameters. The steering subsystem has "a gazillion really weird cells, which might be the one for the spider flinch reflex and the one for 'I'm about to taste salt in the water.'" Each reward function may need its own cell type because it relies on innately wired circuits, with specific receptors and proteins guiding genetically specified wiring rather than learning.

42:42 Model-Based vs Model-Free RL in the Brain

Marblestone addresses the striking difference between how LLMs are trained and how the brain uses reinforcement learning. Current LLMs use a remarkably simple form of RL: if a trajectory solves a problem, every token in that trajectory is upweighted. There are no value functions—no mechanism for estimating the long-run consequences of actions. As Ilya Sutskever noted on a previous episode, "it's weird that you don't use value functions... it's the most dumbest form of RL."

The brain, by contrast, has multiple RL systems. Parts of the basal ganglia and striatum appear to do something very much like model-free RL with a finite action space—telling the spinal cord to execute motor actions, or allowing certain parts of the cortex to communicate. Dopamine neurons provide a reward prediction error signal, consistent with temporal difference learning and value functions.
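For readers unfamiliar with temporal difference learning, the following is a minimal TD(0) sketch on a five-state corridor. The environment, learning rate, and discount factor are illustrative assumptions. The variable `delta` is the reward prediction error, the quantity that dopamine signals are commonly compared to.

```python
# TD(0) on a 5-state corridor: the agent always steps right and receives
# reward 1.0 on leaving the last state. All constants are illustrative.
N_STATES, ALPHA, GAMMA = 5, 0.1, 0.9

def run_td(episodes):
    values = [0.0] * (N_STATES + 1)   # extra slot is the terminal state (value 0)
    first_episode_errors = []
    for ep in range(episodes):
        for s in range(N_STATES):
            reward = 1.0 if s == N_STATES - 1 else 0.0
            # Reward prediction error: what happened minus what was expected.
            delta = reward + GAMMA * values[s + 1] - values[s]
            values[s] += ALPHA * delta
            if ep == 0:
                first_episode_errors.append(delta)
    return values[:N_STATES], first_episode_errors

values, first_errors = run_td(2000)
```

In the first episode the only nonzero error occurs at the rewarded step. With training, the learned values propagate backward so that earlier states come to predict the discounted reward and the error at the reward itself shrinks toward zero.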

On top of this, the cortex builds a world model that can include predictions about when rewards occur. Since the cortex is already predicting what the steering subsystem will do, it can learn a model of reward: "these types of plans, these types of actions will lead in these types of circumstances to reward." This enables model-based planning. Marblestone also notes the idea of "RL as inference"—if you have a general model that includes plans and rewards, you can clamp the reward as high and infer what plan would lead there, effectively getting planning for free.
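The "RL as inference" idea can be sketched in a few lines, assuming a made-up world model over three hypothetical plans. Reward is treated as just another variable in the model: clamp it to 1 and apply Bayes' rule to see which plans become likely.

```python
# Invented world model: a prior over plans and P(reward = 1 | plan).
PLAN_PRIOR = {"forage": 0.5, "hunt": 0.3, "rest": 0.2}
P_REWARD_GIVEN_PLAN = {"forage": 0.4, "hunt": 0.7, "rest": 0.05}

def plan_posterior(reward_clamped_to=1):
    """P(plan | reward) obtained by Bayes' rule on the joint model."""
    def likelihood(plan):
        p = P_REWARD_GIVEN_PLAN[plan]
        return p if reward_clamped_to == 1 else 1.0 - p
    scores = {plan: PLAN_PRIOR[plan] * likelihood(plan) for plan in PLAN_PRIOR}
    total = sum(scores.values())
    return {plan: s / total for plan, s in scores.items()}

posterior = plan_posterior(1)
best_plan = max(posterior, key=posterior.get)
```

No separate planning routine is written here. The plan falls out of the same inference operation the model already supports, which is the sense in which planning comes "for free".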

The conversation touches on a fascinating parallel: human cultural evolution may itself be a form of model-free RL operating at civilizational scale. How do societies discover that a poisonous bean becomes edible through a ten-step process, or how to hunt a seal with specific weapons at specific times? Through trial and error over generations—the simplest algorithm, applied at scale.

50:31 Biological Hardware: Limitation or Advantage?

Marblestone evaluates the tradeoffs between biological and digital hardware. The obvious downside of the brain is that it cannot be copied and lacks external read-write access to every neuron and synapse. But it has significant advantages: extreme energy efficiency (20 watts), co-location of memory and compute, and the ability to run on slow, low-voltage switches (around 200 Hz).

One intriguing possibility is that the brain's stochasticity is not a bug but a feature. If the brain runs energy-based models that require sampling, then naturally stochastic neurons can "just generate samples" without needing a random number generator or Python code. This could represent a deep co-design between hardware and inference method.

Marblestone pushes back against the idea that cellular-level complexity beyond synaptic changes is doing fundamentally new algorithmic work. Most of the molecular machinery inside neurons, he argues, is just implementing what a digital computer does trivially: finding weights and changing them. "It just literally has to do all this with molecular machines itself, without any central controller." There are exceptions—cerebellar cells that store time delays in their cell bodies rather than in synaptic chains—but the dominant algorithmic substrate remains changes in connections between neurons.

1:03:59 Why a Map of the Human Brain Matters

The conversation shifts to connectomics—mapping every neuron and synapse in a brain. Marblestone argues that a connectome alone won't directly reveal the learning algorithm, but it provides massive constraints that can help discriminate between competing theories. Even basic information—like that there are more cell types in the hypothalamus than in the cortex—tells us something profound about where structure is built.

The current state of the art is sobering. The Wellcome Trust estimated that the first complete mouse brain connectome would cost several billion dollars. Project E11 (a Convergent Research spinout) aims to get that down to low tens of millions by switching from electron microscopy to optical microscopy, which allows "molecularly annotated connectomes"—not just who connects to whom, but what molecules are present at each synapse.

For a human brain, which is about 1,000 times larger, naive scaling would still mean billions of dollars. But Marblestone argues that we don't need every neuron in a human brain—we need an entire mouse brain, a human steering subsystem, and the brains of several mammals with different social instincts. This could be done for "hundreds of millions to low billions" with concerted effort.

He draws a parallel to the Human Genome Project: the first genome cost roughly $3 billion ($1 per base pair), but the National Human Genome Research Institute structured funding to drive competition, and costs dropped a million-fold in a decade through technological paradigm shifts. Similar dynamics could apply to connectomics.
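The cost figures above can be checked with simple arithmetic. In this sketch the genome numbers come from the text, while the mouse connectome estimate is set to $30 million as a stand-in for "low tens of millions", which is an assumption.

```python
# Human Genome Project figures as quoted in the text.
genome_cost_usd = 3e9
base_pairs = 3e9                          # approximate human genome length
cost_per_base_pair = genome_cost_usd / base_pairs
cost_after_drop = genome_cost_usd / 1e6   # after a "million-fold" reduction

# Connectomics: $30M is an assumed value for "low tens of millions".
mouse_connectome_usd = 30e6
human_scale_factor = 1_000                # human brain about 1,000x larger
naive_human_connectome_usd = mouse_connectome_usd * human_scale_factor
```

With these inputs, naive scaling puts a full human connectome in the tens of billions of dollars, which is why the proposal targets a whole mouse brain plus selected human and mammalian structures instead.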

Marblestone also discusses the idea of training AI models on brain activity data—using neural activity patterns as auxiliary prediction targets. If every iPhone were also a brain scanner, we could train models that not only predict labels but also predict how the brain represents those labels, potentially yielding better generalization and robustness. This is "brain data augmented" training, somewhere between whole-brain emulation and standard supervised learning.
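One simple way to read "brain data augmented" training is as an auxiliary loss term. The sketch below assumes a model that outputs both a label prediction and a prediction of recorded neural activity. The function names, the squared-error choice, and the weight `aux_weight` are all hypothetical and not taken from the episode.

```python
def mean_squared_error(pred, target):
    """Average squared difference between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def brain_augmented_loss(label_pred, label_true, neural_pred, neural_true,
                         aux_weight=0.1):
    """Task loss plus a weighted penalty for mispredicting recorded activity."""
    task_loss = mean_squared_error(label_pred, label_true)
    aux_loss = mean_squared_error(neural_pred, neural_true)
    return task_loss + aux_weight * aux_loss

# Invented numbers: one label output and three recorded activity channels.
loss_task_only = brain_augmented_loss([0.8], [1.0],
                                      [0.2, 0.5, 0.1], [0.2, 0.5, 0.1])
loss_with_mismatch = brain_augmented_loss([0.8], [1.0],
                                          [0.0, 0.0, 0.0], [0.2, 0.5, 0.1])
```

Two models with identical label accuracy receive different losses if one of them represents the input less like the brain does, which is how the auxiliary target could shape the learned representation.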

1:23:28 What Value Will Automating Math Have?

Marblestone, who serves on the board of the Lean formal mathematics project, discusses how AI is transforming mathematics. Lean is a programming language for expressing mathematical proofs in a mechanically verifiable form—click "verify" and it checks whether conclusions follow from assumptions. This creates a perfect RLVR (reinforcement learning from verifiable rewards) signal: a correct proof is objectively correct.
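As a small illustration of what "mechanically verifiable" means, here are two tiny Lean 4 statements. The theorem names are made up for this example. If either proof were wrong, Lean would reject the file, and that binary accept-or-reject outcome is what can serve as a reward signal.

```lean
-- Lean 4: the kernel checks that each proof term really proves its statement.
theorem two_add_two : 2 + 2 = 4 := rfl

-- A statement about every natural number, closed by definitional unfolding.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl
```

Real proof search targets far harder statements, but the verification step is the same kind of check.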

He predicts that "rlvring the crap out of math proving is basically going to work," yielding systems that search for proofs the way AlphaGo searches for winning Go moves. This is already happening: Harmonic (a company with a billion-dollar valuation) and AlphaProof are based on this approach.

But this solves only part of mathematics. The harder problems are conjecturing new interesting theorems, organizing mathematical concepts, and developing high-level proof strategies. Marblestone raises a profound question: "Is there a loss function for good explanations or good conjectures?" A powerful theorem might be one that compresses information—a short statement from which many other theorems follow with short proofs. This is essentially the problem of automating mathematical creativity.

The applications extend beyond pure math. Formally verified software—code that is mathematically proven to be unhackable or to have specific security properties—could transform cybersecurity. The bottleneck is the "specification problem": engineers don't know how to write formal specifications of what their code should do. But as LLMs improve at generating verifiable proofs, this could flip the entire field of formal methods from academic backwater to mainstream practice.

Marblestone speculates about a future where "quantity has a quality all of its own"—where automated cleverness combined with formal verification allows billions of AI systems to collaborate, proving each step of their reasoning. This could enable "explicitly interpretable world models" specified in equations rather than neural activations, moving back toward symbolic methods simply because we have insane proving ability.

1:38:18 Architecture of the Brain

The conversation concludes with open questions about brain architecture. How does the brain represent its world model? Is it like the hidden state of a neural network, or closer to a symbolic language? Marblestone's hunch is "it's going to be a huge mess"—we should focus on architecture, loss functions, and learning rules rather than expecting clean symbolic representations.

On consciousness ("what is up with feeling like there's an experience"), Marblestone is candid: "I'm pretty much at a loss on this one. I don't think anybody has any idea. It might even involve new physics."

On continual learning, he notes that the hippocampus likely plays a key role, storing sequences and replaying them to train the cortex. There may be multiple timescales of plasticity—synapses with many states, fast weights for short-term information alongside slower plasticity for long-term learning.

The conversation ends with the "Gap Map", Convergent Research's catalog of fundamental scientific capabilities that are missing, each representing a "mini Hubble Space Telescope" that would unblock entire fields. Marblestone was surprised to find that formal math verification (Lean) was one of these gaps: he started from neuroscience and biology, but discovered that even mathematicians need infrastructure beyond whiteboards. The overall picture: a few hundred fundamental capabilities, each at deep-tech-startup scale, which together would cost only a few billion dollars to build.

Conclusion

This episode matters because it offers a coherent alternative to the dominant scaling paradigm in AI. Marblestone synthesizes insights from neuroscience, evolutionary biology, and AI to argue that the brain's efficiency comes not from a better learning algorithm or architecture alone, but from exquisitely engineered reward functions that guide learning with minimal data. The conversation is notable for its intellectual humility—Marblestone repeatedly emphasizes that we don't know the answers, and that the field needs better tools (connectomes, molecular annotations, formal verification) to find them. Yet within that humility, he offers a compelling framework: the learning subsystem (cortex) and steering subsystem (innate reward functions) working in tandem, with the cortex learning to predict the steering subsystem and thereby generalizing innate responses to novel situations. This framework provides an answer to Ilya Sutskever's question about how the genome encodes abstract desires, and it has direct implications for AI alignment: a paperclip maximizer might be possible precisely because the minimum viable steering subsystem for intelligence is far simpler than the rich set of human social instincts.
