A new paper by researchers from Emergence, Openmind Research Institute, and Sakana AI explores how to encode specific innate knowledge into AI systems that are agnostic to the sensory modality of their inputs. This work provides a proof-of-concept for how intelligent systems could have built-in "instincts" that guide their behavior, even when the exact form of their sensory inputs is unknown at design time.
How can an AI system have hard-coded innate knowledge about the world while remaining highly adaptable to different sensory modalities and environments? Drawing inspiration from biological systems such as beavers, which retain an innate drive to build dams even when raised in captivity, the authors ask: how can we build specific desired concepts into an AI system when we don't know in advance how a given stimulus will be grounded?
The paper introduces a formal version of this "Ungrounded Alignment Problem" and demonstrates a solution for a simplified case. Though the initial setup is simple, it provides an existence proof for how selected innate knowledge could in principle be embedded into highly plastic AI systems. This has implications for building AI with specific drives or goals, without sacrificing flexibility.
Key ideas:
- The authors demonstrate their approach on the task of detecting the string "fnord" (and 2,000 other trigger sequences) in a sequence of character images, achieving over 99% accuracy without using any labels during training (see the sketch after this list).
- Though simple, this provides an existence proof for encoding innate knowledge into AI systems that are agnostic to the exact form of their inputs, suggesting a path to building in specific drives or goals without sacrificing flexibility and adaptability to new environments.
- By solving even this simplified version of the Ungrounded Alignment Problem, the work opens new avenues for encoding innate drives into highly adaptable AI systems.
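To give a concrete feel for the task, below is a minimal sketch of one way a trigger like "fnord" could be found in a stream of unlabeled character images: cluster the images into symbols, ground each symbol to a letter by matching its frequency against aggregate statistics from a reference English corpus (in the spirit of breaking a substitution cipher), and scan the decoded stream for the trigger. This is purely illustrative; the function names, the k-means clustering, and the frequency-matching step are assumptions of this sketch, not the authors' method, which the paper describes in detail.

```python
# Illustrative sketch only (NOT the paper's method): detect a fixed trigger string
# in a stream of unlabeled character images by (1) clustering images into symbols,
# (2) grounding symbols to letters via frequency matching against a reference
# corpus, and (3) scanning the decoded stream for the trigger.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
TRIGGER = "fnord"  # the innate concept, fixed symbolically at design time


def letter_frequencies(reference_text: str) -> np.ndarray:
    """Relative frequency of each letter in a reference English corpus
    (aggregate statistics only; no per-image labels are ever used)."""
    counts = np.array([reference_text.count(c) for c in ALPHABET], dtype=float) + 1.0
    return counts / counts.sum()


def ground_and_detect(image_features: np.ndarray, reference_freqs: np.ndarray) -> list[int]:
    """Cluster character-image feature vectors into unlabeled symbols, assign each
    symbol to the letter whose corpus frequency it matches best, then return the
    positions at which the decoded stream spells TRIGGER."""
    n = len(ALPHABET)
    symbols = KMeans(n_clusters=n, n_init=10, random_state=0).fit_predict(image_features)

    # Empirical frequency of each discovered symbol in the stream.
    symbol_freqs = np.bincount(symbols, minlength=n).astype(float)
    symbol_freqs /= symbol_freqs.sum()

    # Hungarian assignment: pair each symbol with the letter of most similar frequency.
    cost = np.abs(symbol_freqs[:, None] - reference_freqs[None, :])
    sym_idx, letter_idx = linear_sum_assignment(cost)
    symbol_to_letter = {s: ALPHABET[l] for s, l in zip(sym_idx, letter_idx)}

    # Decode the unlabeled stream and scan for the trigger.
    decoded = "".join(symbol_to_letter[s] for s in symbols)
    return [i for i in range(len(decoded) - len(TRIGGER) + 1)
            if decoded[i:i + len(TRIGGER)] == TRIGGER]
```

Plain unigram frequency matching like this would be far too brittle to reach the accuracy reported in the paper, since many letters occur at nearly identical rates; the sketch is only meant to convey the shape of the problem: the trigger is specified symbolically in advance, and its grounding must be recovered from statistics of the unlabeled stream alone.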
In terms of potential implications and future directions, this paper provides an important conceptual stepping stone: it gives a concrete example of how innate knowledge can be embedded into AI systems without sacrificing flexibility, a pairing that biological intelligence takes for granted but that is often overlooked in artificial intelligence.
"The Ungrounded Alignment Problem" introduces a novel and philosophically interesting challenge for encoding innate knowledge into adaptable AI systems. Though the demonstrated solution is simple, it provides a compelling proof-of-concept that opens up important new avenues for AI alignment research. We look forward to seeing how these ideas develop and scale to real-world AI systems.
Read the full paper on arXiv for more details.