Inside Emergence AI
Our technical team members share their work in agent science and other fields on the forefront of AI.
Learning API Functionality from In-Context Demonstrations for Tool-based Agents
Insights

Closing personalization performance gaps with memory
Engineering
Product

Agentic Compliance: AI-Driven Governance for the Enterprise
Insights

State of the Art Results in Agentic Memory
Eager to apply more sophisticated agentic memory to the largest conversational benchmark, LongMemEval, we discuss the benchmark, our approach, our somewhat disappointing state-of-the-art findings, and the need for a more comprehensive benchmark for agentic memory than LongMemEval.
Insights

SOTA on LongMemEval with RAG
LongMemEval stands out as a strong benchmark for long-term memory, but the success of our RAG-like methods shows it may not capture all aspects of long-term memory, pointing to the need for further benchmark development.
Insights

Agents Are Redefining Cybersecurity Resilience
Emergent AI agents revolutionize cybersecurity by automating threat detection, compliance, and SOC operations, enabling resilient and proactive defense.
Product
Engineering

Benchmarking Agents-Creating-Agents: How LLM Choices Shape Performance, Scale, and Quality
An empirical study of how different Generative Foundation Model pairings impact agent creation, verification, and emergent system behaviors across 40 enterprise tasks.
Insights
Product

Comparing LLMs for Planning and Code Generation in Data Science Agents
We benchmarked the latest LLMs from OpenAI, Anthropic, DeepSeek, and Google within our Data Insights Agent framework to identify which delivers the most accurate, fastest, and most consistent insights.
Insights

Building Agentic Systems from First Principles Inspired by Unix and Kubernetes
We propose a first-principles architecture for agentic systems, inspired by Unix and Kubernetes, with nine core abstractions enabling runtime creation, delegation, and async execution.
Engineering

Beyond the Browser: Benchmarking the Next Generation of Enterprise AI Agents
Discover why standard AI benchmarks fall short for enterprise needs and how agent performance is truly measured on realistic, multi-application workflows using both UI and APIs.
Insights

Layer by Layer: A Structured Approach to Benchmarking AI Agents in the Enterprise
A structured five-layer framework provides standardized benchmarking for AI agent capabilities across the full spectrum of enterprise task complexity, from UI to infrastructure.
Insights

Towards Autonomous Agents and Recursive Intelligence
Emergence’s Agents platform is evolving into a system that can automatically create agents and assemble multi-agent systems with minimal human intervention.
Product

Ensuring Safe and Respectful Online Spaces: A Look at AI-Based Text Moderation
With the explosion of user-generated content, moderating digital conversations has become critical in keeping online platforms welcoming and respectful.
Product

Taming the API Jungle: The Connector Agent's Quest for Perfect API Calls
Connecting downstream enterprise applications to the world of external services is crucial, but navigating the vast API landscape can feel like exploring an uncharted jungle.
Product

Emergence Multi-agent Orchestration: Feb Updates
Since our Multi-Agent Orchestrator launch in December 2024, we’ve introduced new features and refined capabilities to streamline operations, strengthen compliance, and support evolving enterprise needs. Below are the highlights.
Product

Introducing the Emergence Multi-Agent Orchestrator
Today we are announcing the Emergence Orchestrator, an autonomous meta-agent that coordinates and manages interactions between AI agents across enterprise systems.
Product

Benchmarking AI Agents: Key to Building Trust and Driving Scalable Enterprise Adoption
As enterprises increasingly embrace AI agents to drive productivity and streamline operations, a pivotal question emerges: How do we ensure these agents deliver reliable, compliant,
Engineering

Benchmarking of AI Agents: A Perspective
This whitepaper examines the role of benchmarking in enterprise AI adoption, addressing reproducibility, bias, and applicability while outlining strategies for scalable benchmarks.
Engineering

Building Innate Knowledge into Modality-Agnostic AI Systems
A new paper by researchers from Emergence, Openmind Research Institute, and Sakana AI explores how to encode specific innate knowledge into AI systems that are agnostic to the sensory modality of their inputs.
Insights

Exploring the Functional Roles of Transformer Layers
A new paper, "Transformer Layers as Painters," co-written by Emergence researchers and Sakana AI, investigates the internal workings of transformers, focusing on how the removal or reorganization of information impacts pretrained models.
Engineering

Our Agent-E SOTA Results on the WebVoyager Benchmark
Intelligent agents are showing promise in transforming interactive software by improving multi-step task automation significantly across diverse digital environments.
Product

Reliable Synthetic Data Generation
As we’ve seen the rapidly rising impact of LLMs, we’ve also seen the growing importance of “synthetic data,” generated instructional raw text used to train LLMs for specific tasks without the need to mine from real human conversations.
Engineering

The Emergence of Emergence
Emergence is a compelling phenomenon observable both in natural systems and in engineered designs, where complex behaviors and patterns arise from simple interactions.
Insights

Beyond What Comes Next
In this post, we consider how to make language models better, not just faster, inspired by several papers.
Insights

Achieving Self-Improvement in Agentic Systems with Skill Harvesting
Skill harvesting allows agentic systems to self-reflect, autonomously developing more specialized skills.
Insights

MathViz-E - Agent Tool Control
At Emergence, we’ve always believed that the next significant advancement in workflow automation will come from the planning, selection, and use of multiple external tools by artificial intelligence.
Engineering

Distilling the Web for Multi-Agent Automation
Our everyday interactions with computers are filled with slow and repetitive tasks.
Product
Engineering

Building Narrow Self Improving Agents
A number of enterprise workflows involving language- and tool-control tasks can be augmented with LLM- and LVM-powered agents.
Product
Engineering

Self-Improving Agents
Self-improving agents have varying objectives, and the issue of aligning them with human values is critical.
Product

Introducing Emergence
The pivotal advancement in the ability of computers to understand language and develop functional world models has profoundly reset the landscape in computing.
Product
Engineering
Insights

Emergence’s Appropriateness Evaluation Model
The high accuracy and precision of our model represent a new achievement in reliably identifying unsuitable prompts and biased datasets.
Product

The Anatomy of Agents
The concept of a software agent can be traced back to the model Hewitt, et al.
Product
Insights