UCLA Researchers Explore AI ‘Body Gap’ and What It Means for Reliability, Safety

Insider Brief

  • A UCLA Health study published in Neuron found that current AI systems lack “internal embodiment,” or the ability to monitor internal states such as uncertainty and fatigue, which researchers say limits performance and safety.
  • The paper showed multimodal AI models can fail basic perception tasks, highlighting reliance on pattern recognition without the grounding of physical experience.
  • Researchers proposed a “dual-embodiment” framework and new benchmarks to incorporate internal state awareness into AI systems to improve reliability, alignment and real-world behavior.

A new study from UCLA Health finds that today’s most advanced AI systems lack a fundamental capability present in humans: an internal sense of their own state, a gap researchers say has implications for performance, reliability and safety.

The paper, titled Embodiment in Multimodal Large Language Models and published in Neuron, argues that while AI development has focused heavily on external interaction with the world, it has largely overlooked what the authors describe as “internal embodiment,” or the ability to monitor internal conditions such as uncertainty, fatigue or confidence.

Researchers at UCLA Health said this absence, which they call the “AI body gap,” creates a structural limitation in how AI systems behave. Without internal signals to regulate outputs, systems rely solely on pattern recognition from training data, which can lead to inconsistent reasoning, overconfidence and failure in unfamiliar scenarios.

“While there is a current focus in world modeling on external embodiment, such as our outward interactions with the world, far less attention is given to internal dynamics, or what we term ‘internal embodiment’,” said Akila Kadambi, a postdoctoral fellow in the Department of Psychiatry and Biobehavioral Sciences at UCLA’s David Geffen School of Medicine and the paper’s first author. “In humans, the body acts as our experiential regulator of the world, as a kind of built-in safety system.”

What Are the Key Findings?

The study focused on multimodal large language models, the class of systems underlying tools such as ChatGPT and Gemini. While these models can process text, images and video, the authors found they lack grounding in physical experience.

In one experiment cited in the paper, several leading models failed to correctly interpret a standard perception test involving point-light displays of human motion, sometimes misclassifying them as unrelated visual patterns. Performance degraded further with minor changes such as slight rotation, highlighting fragility in perception without embodied experience.

The researchers posit that humans succeed in such tasks because perception is anchored in continuous bodily experience, which AI systems do not possess.

The study distinguishes between two forms of embodiment. External embodiment refers to a system’s ability to sense and act in the physical world, a focus of current robotics and multimodal AI development. Internal embodiment, by contrast, refers to continuous awareness of internal states that influence decision-making and behavior over time.

According to the authors, humans rely on internal signals, such as physiological needs and cognitive load, to regulate attention, memory and social interaction.

“By contrast, current AI systems have no equivalent mechanism,” said Dr. Marco Iacoboni, a professor in the department and a senior author on the paper. “They process inputs and generate outputs without any persistent internal state that regulates how they behave over time.”

This limitation, the researchers note, extends beyond performance to safety. Without internal cost signals or self-regulation, AI systems have no intrinsic mechanism to moderate outputs, resist manipulation or maintain consistency across contexts.

What Are the Implications for AI Development?

The authors propose a “dual-embodiment” framework that combines external interaction with modeled internal states. These internal variables could include signals such as uncertainty, processing load or confidence, which would influence system outputs and constrain behavior over time.

Rather than replicating human biology, the approach would introduce functional equivalents designed to improve stability, alignment and reliability in real-world applications.
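
The paper itself is conceptual and includes no code, but a minimal sketch can make the idea concrete: a wrapper that maintains persistent internal variables, such as uncertainty, processing load and confidence, and lets those variables constrain what the system is willing to output. The class, the variable names, the update rules and the abstention threshold below are all illustrative assumptions, not the authors’ design.

```python
from dataclasses import dataclass

@dataclass
class InternalState:
    """Hypothetical internal variables; names and ranges are illustrative."""
    uncertainty: float = 0.0   # rises when inputs look unfamiliar
    load: float = 0.0          # rises with sustained processing
    confidence: float = 1.0    # falls as uncertainty and load grow

class DualEmbodimentAgent:
    """Toy wrapper: outputs are gated by a persistent internal state."""

    def __init__(self, model, abstain_threshold=0.7):
        self.model = model                    # any callable: prompt -> (answer, entropy)
        self.state = InternalState()
        self.abstain_threshold = abstain_threshold

    def step(self, prompt: str) -> str:
        answer, entropy = self.model(prompt)

        # Update internal variables (simple exponential smoothing).
        self.state.uncertainty = 0.8 * self.state.uncertainty + 0.2 * entropy
        self.state.load = min(1.0, self.state.load + 0.05)  # accumulates per step
        self.state.confidence = max(
            0.0, 1.0 - self.state.uncertainty - 0.5 * self.state.load
        )

        # The internal state constrains the output rather than just annotating it.
        if self.state.uncertainty > self.abstain_threshold:
            return "I'm not confident enough to answer reliably here."
        return answer
```

In this toy version the state persists across calls, so a run of unfamiliar inputs pushes the agent toward abstaining, which is the kind of behavior-constraining role the framework assigns to internal signals.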

The study also calls for new evaluation methods. Existing benchmarks largely measure external performance, such as task completion or object recognition. The researchers said future benchmarks should assess whether systems can track internal states, adapt to disruptions and exhibit stable, socially aligned behavior.
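
As a hedged illustration of what such a benchmark might probe, the check below, built on the toy agent sketched above, perturbs an input and asks whether the system’s reported internal uncertainty actually responds; the function name and the perturbation are assumptions for illustration, not metrics proposed in the paper.

```python
def internal_state_probe(agent, prompt, perturb):
    """Illustrative benchmark check (not from the paper): a system that
    tracks internal state should report higher uncertainty when its
    input is disrupted."""
    agent.step(prompt)
    clean_uncertainty = agent.state.uncertainty

    agent.step(perturb(prompt))   # e.g. shuffle words or inject noise
    disrupted_uncertainty = agent.state.uncertainty

    # Pass only if the internal signal responds to the disruption.
    return disrupted_uncertainty > clean_uncertainty
```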

Limitations and Future Directions

The study is conceptual and does not present a working implementation of internal embodiment, instead outlining a research direction for future AI systems.

The authors suggest that incorporating internal regulation mechanisms could become critical as AI systems are deployed in higher-stakes environments, where reliability and alignment with human behavior are essential.
