Building Innate Knowledge into Modality-Agnostic AI Systems

Insights
December 3, 2024
Aakash Nain, Marc Pickett

A new paper by researchers from Emergence, Openmind Research Institute, and Sakana AI explores how to encode specific innate knowledge into AI systems that are agnostic to the sensory modality of their inputs. This work provides a proof-of-concept for how intelligent systems could have built-in "instincts" that guide their behavior, even when the exact form of their sensory inputs is unknown at design time.

Introduction

How can an AI system have hard-coded innate knowledge about the world while remaining highly adaptable to different sensory modalities and environments? Drawing inspiration from biological systems like beavers, which have an innate drive to build dams even when raised in captivity, the authors ask: how can we build specific desired concepts into an AI system when we don't know how a given stimulus will be grounded?

The paper introduces a formal version of this "Ungrounded Alignment Problem" and demonstrates a solution for a simplified case. Though the initial setup is simple, it provides an existence proof for how selected innate knowledge could in principle be embedded into highly plastic AI systems. This has implications for building AI with specific drives or goals, without sacrificing flexibility.

The Ungrounded Alignment Problem asks: how can we build specific desired concepts into an AI system when we don't know how a given stimulus will be grounded?

Key ideas:

  • An AI system should be able to reliably detect specific abstract patterns (e.g. the string "fnord"), even when the exact form of the sensory inputs is unknown at design time.
  • The system is given a sequence of sensory inputs (e.g. images of characters), but no labels mapping the inputs to classes.
  • Using only the known bigram frequencies of the abstract classes (e.g. letter-pair statistics of English text), the system learns to map sensory inputs to their correct classes; a minimal sketch of this step follows this list.
  • This allows the system to reliably detect the target pattern, without relying on labels or modality-specific knowledge.
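
To make the bigram-matching step concrete, here is a minimal sketch of the idea, not the authors' implementation: assume an unsupervised encoder has already grouped the character images into discrete, unlabeled cluster IDs, and recover the cluster-to-letter mapping purely from bigram statistics. The function names and the alternating-assignment heuristic are our own illustrative choices; the paper itself learns this mapping with a trained model.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def bigram_matrix(stream, n):
    """Joint bigram frequency matrix of an integer symbol stream."""
    B = np.zeros((n, n))
    for a, b in zip(stream[:-1], stream[1:]):
        B[a, b] += 1.0
    return B / max(B.sum(), 1.0)

def match_by_bigrams(obs_stream, known_B, n_iters=50):
    """Map unlabeled cluster IDs to known symbol IDs so that the observed
    bigram statistics line up with the innate (known) ones.

    Heuristic: initialize from unigram frequency ranks, then alternate
    Hungarian assignments with the column permutation held fixed. A local
    search like this can stall on hard instances; restarts help.
    """
    n = known_B.shape[0]
    obs_B = bigram_matrix(obs_stream, n)
    # Frequency-rank initialization: rarest cluster -> rarest symbol, etc.
    perm = np.empty(n, dtype=int)
    perm[np.argsort(obs_B.sum(1) + obs_B.sum(0))] = \
        np.argsort(known_B.sum(1) + known_B.sum(0))
    for _ in range(n_iters):
        # cost[i, j]: mismatch if cluster i is labeled as symbol j,
        # with the columns reordered by the current guess
        cost = ((obs_B[:, None, :] - known_B[None, :, perm]) ** 2).sum(-1)
        _, perm = linear_sum_assignment(cost)
    return perm  # perm[cluster_id] -> symbol_id
```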

The authors demonstrate this on the task of detecting the string "fnord" (along with 2,000 other trigger sequences) in a stream of character images; the system reaches over 99% accuracy without seeing any labels during training.
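
As a toy end-to-end check of the sketch above (synthetic data, hypothetical names), we can simulate the unknown grounding with a random permutation of symbol IDs, recover the mapping from bigram statistics alone, and scan the decoded stream for the trigger string. A realistic run would use a much longer and more varied text stream, and real character images in place of permuted IDs.

```python
# Toy end-to-end check of the sketch above, on synthetic data.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
SYM = {c: i for i, c in enumerate(ALPHABET)}

# Stand-in corpus; stable bigram statistics need far more text than this.
text = ("the quick brown fox jumps over the lazy dog " * 500).replace(" ", "")
text += "fnord"
known_stream = np.array([SYM[c] for c in text])
known_B = bigram_matrix(known_stream, len(ALPHABET))

# Simulate the unknown grounding: the sensory channel delivers cluster IDs
# that are an arbitrary, unknown permutation of the symbol IDs.
rng = np.random.default_rng(0)
hidden = rng.permutation(len(ALPHABET))    # symbol id -> cluster id
obs_stream = hidden[known_stream]          # all the system ever observes

perm = match_by_bigrams(obs_stream, known_B)
decoded = "".join(ALPHABET[perm[c]] for c in obs_stream)
print("trigger detected:", "fnord" in decoded)  # True if the map is recovered
```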

Though simple, this provides an existence proof for encoding innate knowledge into AI systems that are agnostic to the exact form of their inputs. It suggests a path to building in specific drives or goals, without sacrificing flexibility and adaptability to new environments.

Discussion

By demonstrating a solution to a simplified version of the Ungrounded Alignment Problem, this work opens up new avenues for encoding innate drives into highly adaptable AI systems.

Some potential implications and future directions:

  • Creating AI systems with specific built-in drives or values (e.g. a drive to pick up trash), while still being highly adaptable to new environments.
  • Extending the approach to more complex sensory modalities and relational knowledge.
  • Exploring the robustness of this approach to distribution shift between training and deployment.
  • Scaling up to enable more open-ended "instincts" to be built in, beyond simple pattern detection.

This paper provides an important conceptual stepping stone. It gives a concrete example of how innate knowledge could be embedded into AI systems without sacrificing flexibility—something that is taken for granted in biological intelligence but often overlooked in artificial intelligence.

Conclusion 

"The Ungrounded Alignment Problem" introduces a novel and philosophically interesting challenge for encoding innate knowledge into adaptable AI systems. Though the demonstrated solution is simple, it provides a compelling proof-of-concept that opens up important new avenues for AI alignment research. We look forward to seeing how these ideas develop and scale to real-world AI systems.

Read the full paper on arXiv for more details.
