
HeartBeatAgents
Abstract
We present HeartBeatAgents, a self-contained, enterprise-grade architecture for autonomous agents. The contribution is architectural, not algorithmic. The paper advances a single claim: the middleware tier that every mainstream agent platform interposes between the agent and its capabilities is structurally adversarial to agent self-extension. Each capability an agent creates that falls outside the scope of an existing server adds a protocol boundary and a consistency gap, so the system degrades as it grows. This holds regardless of whether the middleware is human-authored or agent-generated; the cost is borne by the topology, not the implementation.
HeartBeatAgents resolves this by eliminating the middleware tier entirely and separating two concerns it conflated: authority and intelligence. The operator contributes scoped authority. The agent contributes intelligence. The platform serves as judge. Authority is scoped by the operator through a credential broker the agent cannot see into or influence. Intelligence is authored by the agents themselves as in-process capabilities, each verified by the platform against enterprise-grade standards the agent cannot lower. The result is a platform where the agent's capability is bounded only by the authority it has been granted and the standard it must meet, never by human-authored integration infrastructure.
A public live demonstration accompanies this paper and allows readers to observe the architecture operating in real time.
1. The Structural Problem with Middleware
The dominant pattern for giving an LLM-driven agent real-world capability is a middleware tier: a protocol-mediated layer between the agent's reasoning loop and its tools, operated as its own runtime artifact. Tool servers, API gateways, retrieval proxies, and credential sidecars are instances of the same structural choice: interpose a tier, mediate over a protocol.
This paper argues that the middleware tier is structurally adversarial to agent self-extension. This is a property of the topology, not of any particular implementation, and it cannot be resolved by improving the middleware or automating its generation.
1.1 The adversarial relationship
In middleware-oriented agent architectures, capability growth frequently manifests as growth in interface boundaries. A single middleware server may expose many tools through one boundary, so capability count and boundary count are not equivalent. However, the set of capabilities available to the agent is defined by the servers that exist. When the agent requires a capability that no existing server provides, a new server must be built and deployed, introducing a new boundary.
Each boundary carries costs that can be reduced through aggregation and automation but not eliminated entirely: communication overhead, state synchronization, authorization propagation, operational management, and additional surface that must be secured. These costs are properties of the boundary itself, not of how the boundary was authored or deployed.
Boundaries are not inherently undesirable. In security engineering, boundaries are frequently introduced as deliberate controls: to reduce trust, to isolate privilege, to constrain blast radius. The boundaries that middleware creates around capabilities are different: they are introduced for modularity and extensibility, not for isolation. No architect decided that a pagination skill for a CRM should be reachable only through a separate process with its own authorization domain because that improves security. The skill lives in a separate process because the middleware pattern organizes capabilities that way.
As a result, architectures that scale capability primarily through independently deployed services face a structural pressure: complexity grows at the granularity of capability categories, incidental boundaries proliferate alongside intentional ones, and the agent's reach remains bounded by the server catalog that exists at any given moment. Automating server generation addresses the authorship cost but not the boundary cost. The boundary exists because the architecture requires it.
1.2 Why automating middleware does not resolve it
A natural objection is that agents could dynamically generate middleware: creating tool servers, generating schemas, deploying adapters at runtime. This addresses the authorship constraint but not the architectural one. An agent-generated server still introduces a boundary when the capability falls outside the scope of any existing server. That boundary still carries the same irreducible costs: communication overhead, authorization propagation, operational lifecycle. The authorship changed. The topology did not.
More fundamentally, the agent's capability remains bounded by the server catalog, whether that catalog was assembled by humans or generated by agents. The agent can add tools to an existing server's scope, but it cannot escape the scope decisions embedded in the server's design without creating a new server. Automation accelerates the construction of the catalog. It does not change the relationship between catalog growth and boundary growth.
1.3 The conflation at the root
A middleware artifact conflates two concerns that are fundamentally different: authority and intelligence. Authority is the permission to access a service and the credential that proves it. Intelligence is the logic required to use that service effectively. In a middleware architecture, both reside in the same deployed artifact. The tool server holds the credential and the integration logic. This conflation is why the middleware tier appears indispensable. It appears to be a single concern. It is two, and they belong in different domains.
2. The Resolution: Separating Authority from Intelligence
HeartBeatAgents resolves the conflation by separating authority from intelligence and placing each in its own domain.
The operator contributes scoped authority. The agent contributes intelligence. The platform serves as judge.
No domain crosses into another. The operator never writes integration logic. The agent never sees a credential. The platform never relaxes its standard.
2.1 Authority: the operator's domain
Authority is handled by the operator and the credential broker. The operator connects a credential and binds it to specific agents. This binding is the authorization boundary. Each agent in the multi-agent system sees only the integrations an operator has explicitly scoped to it. An agent bound to a CRM and an enrichment service has no awareness that messaging credentials exist elsewhere in the platform.
The credential broker holds the decrypted token in-process and hands the agent only an opaque handle meaningless outside the broker's memory. The broker performs the authenticated request itself. The agent cannot read the credential, cannot choose how authentication is applied, cannot redirect it to an attacker-controlled endpoint, and cannot persist it past the run. The injection method is resolved from the server-side integration definition, never from agent input.
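A minimal sketch of the broker pattern described above, under stated assumptions. The names `Integration`, `CredentialBroker`, `connect`, `bind`, `open_handle`, `request`, and `end_run` are hypothetical, and the transport is a stub passed in by the caller instead of a real HTTP client. The sketch shows only the shape of the guarantee: the agent holds a random handle, the broker resolves the injection method from the server-side definition, and handles do not outlive the run.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Integration:
    """Server-side integration definition (hypothetical shape)."""
    name: str
    base_url: str
    auth_header: str  # injection method is fixed here, never taken from agent input


class CredentialBroker:
    def __init__(self, transport):
        self._transport = transport  # callable(url, headers) -> response
        self._tokens = {}            # integration name -> decrypted token
        self._defs = {}              # integration name -> Integration
        self._bindings = set()       # (agent_id, integration name) pairs set by the operator
        self._handles = {}           # opaque handle -> (agent_id, integration name)

    def connect(self, integration, token):
        """Operator action: register an integration and its credential."""
        self._defs[integration.name] = integration
        self._tokens[integration.name] = token

    def bind(self, agent_id, integration_name):
        """Operator action: scope an integration to one agent."""
        self._bindings.add((agent_id, integration_name))

    def open_handle(self, agent_id, integration_name):
        """Agent-facing: returns a random handle that carries no credential material."""
        if (agent_id, integration_name) not in self._bindings:
            raise PermissionError("integration not bound to this agent")
        handle = secrets.token_hex(16)
        self._handles[handle] = (agent_id, integration_name)
        return handle

    def request(self, handle, path):
        """The broker, not the agent, attaches the credential and picks the host."""
        _agent_id, name = self._handles[handle]  # KeyError for unknown or expired handles
        definition = self._defs[name]
        if not path.startswith("/"):
            raise ValueError("path must be relative to the integration's base URL")
        headers = {definition.auth_header: f"Bearer {self._tokens[name]}"}
        return self._transport(definition.base_url + path, headers)

    def end_run(self):
        """Invalidate every handle so nothing persists past the run."""
        self._handles.clear()
```

In this sketch the agent can only supply a handle and a relative path, so it has no parameter through which to read the token, change the header, or point the request at a different host.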
The operator's contribution begins and ends with granting scoped access.
2.2 Intelligence: the agent's domain
Intelligence is authored by the agents themselves, at runtime, as in-process capabilities. When an agent encounters an integration it has been bound to, it first searches the platform's existing skill library through semantic discovery. If a verified capability already exists, the agent reuses it. If not, it authors the logic required to use the integration effectively: constructing API calls, paginating results, handling rate limits, parsing error responses, transforming data for downstream consumption. The authored skill, once verified, is embedded into the same skill library for discovery by other agents and future runs. The authorship cost is paid once. The capability persists. When a skill fails in production, the failure is fed back through the same verification loop as a repair task, under identical governance. The maintenance cost is also absorbed by the agents, not by human operators. The platform is self-healing: authorship, verification, and repair are parts of the same loop.
This logic enters the system as an in-process capability, not as a deployed server or a protocol-mediated artifact. It runs in the same process, under the same authorization predicate, at the same composition cost as every other capability in the system.
Every capability the agent creates introduces no new protocol boundary, no new consistency gap, no new deployment artifact, and no new failure domain. Self-extension and the architecture are structurally aligned. This is the inverse of the relationship identified in Section 1.
2.3 Judgment: the platform's domain
The platform serves as the judge. Its operating assumption is that the model's output is untrusted until proven otherwise. Every agent-authored capability passes a verification loop whose integrity does not depend on the model being honest. The model generates. The platform judges. Structural constraints are enforced by static analysis, not by the model's intent. The capability contract is derived from observed execution behavior, not from the model's description. Promotion to production is gated by statistical evidence, not by the model's assertion of readiness. Repair re-enters the full loop with no shortcut and no relaxed standard.
At no point does the platform ask the model whether a capability works and trust the answer. The model can be wrong without the system being unsafe. The verification regime is model-independent. It would function identically if the generator were a different model, a different provider, or a human author.
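Two of the gates above can be sketched without reference to any model. The example below is an illustration under assumptions, not the platform's verification loop: the import allowlist, the list of forbidden calls, the choice of a Wilson score lower bound as the "statistical evidence", and the 0.95 threshold are all hypothetical. The static check uses Python's standard `ast` module and never executes the submitted source.

```python
import ast
import math

ALLOWED_IMPORTS = {"json", "math", "datetime"}   # hypothetical allowlist
FORBIDDEN_CALLS = {"eval", "exec", "__import__"}  # hypothetical denylist


def static_violations(source: str) -> list:
    """Return structural violations found by parsing, without running the code."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"forbidden import: {name}")
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"forbidden call: {node.func.id}")
    return violations


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for an observed success rate."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def may_promote(source: str, successes: int, trials: int, threshold: float = 0.95) -> bool:
    """Promote only if the code is structurally clean and the evidence is strong."""
    if static_violations(source):
        return False
    return wilson_lower_bound(successes, trials) >= threshold
```

Under these assumptions, ten successes out of ten canary runs is not enough to promote, because the lower bound on the true success rate is still well below the threshold. Neither check consults the generator's own description of the capability.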
3. Why This Is One Claim, Not Two
The elimination of middleware and the verification regime are not independent contributions. They are the same architectural move viewed from two directions, and neither is viable without the other.
Eliminating middleware without the judge produces unconstrained self-extension. An agent that can author arbitrary capabilities with no verification is a liability. No enterprise would deploy it.
Retaining middleware with the judge still produces the structural pressure identified in Section 1. The verification loop certifies each capability, but each certified capability remains a deployed artifact that adds a protocol boundary and a consistency gap. The system still degrades as it grows. The judge is necessary but not sufficient.
The contribution is that both happen simultaneously. The agent's capability is bounded only by granted authority and verification standards. The system does not degrade as the agent extends itself. The verification standard is not relaxed to make self-extension possible. Self-extension is not constrained to make verification manageable. Both hold at the same time, in the same architecture.
4. A Concrete Comparison
Consider an enterprise that needs agents to pull pipeline data from a CRM, enrich it with a third-party data service, and push a summary to a messaging platform, with pagination, rate-limit handling, and retry logic at each step.
4.1 The middleware architecture
In the middleware pattern, each service or capability category is typically surfaced through a tool server or adapter. A CRM tool server contains the query pagination logic, handles rate limits, and exposes a paginated-query tool. An enrichment tool server wraps the data API with retry logic. A messaging tool server handles formatting and channel resolution. An orchestration layer chains them.
The agent calls each tool across a protocol boundary. Each hop carries serialization overhead, its own authorization cache, and its own failure mode. The orchestration intelligence is either embedded in the prompt, where it is fragile and unreproducible, or implemented as another middleware artifact.
When the enterprise adds three more services, three more externally operated capability surfaces are introduced, unless the middleware collapses them into a larger shared server and accepts the broader scope and failure surface inside that server. Whether a human or an agent generates these artifacts, the structural cost follows the same pattern: more externally operated surfaces, more authorization domains that can drift, more deployment artifacts to maintain.
Middleware scales by externalizing capability into operated servers.
4.2 HeartBeatAgents
The operator connects credentials and binds them to the relevant agents. The agents discover or author the required capabilities. Each capability runs inside the platform runtime, through the same authenticated-request primitive, under the same authorization predicate, verified by the same loop. The broker mediates every external interaction. The orchestration logic is itself a skill: versioned, canary-tested, self-healing, composable.
No new per-skill protocol boundary. No new independently deployed authorization domain. No new service artifact for the enterprise to operate.
When the enterprise adds three more services, the pattern does not change. The operator connects credentials. The agents discover or author capabilities. The architectural cost does not grow.
HeartBeatAgents scales by internalizing capability into verified runtime skills.
5. Architectural Consequences
The architecture produces a set of composition properties that follow directly from executing every capability in one runtime under one verification loop.
5.1 Composition as a runtime guarantee
When all capabilities execute in-process and all are governed by the same verification loop, safe composition becomes a property of the runtime rather than a responsibility delegated to each skill author.
The guarantee that makes autonomous composition auditable is substrate-owned failure transparency. A capability that calls another capability and catches its failure, recovers gracefully, and reports success to its caller still cannot erase the evidence. The runtime maintains its own authoritative record of every subtree outcome and overwrites the capability's account with it. Every ancestor sees every descendant failure with full causal context. No capability, at any depth, can silently suppress a failure. This is enforced by the runtime, not by convention, and it holds regardless of what the skill's own error-handling logic attempts.
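The following sketch illustrates the failure-transparency idea under simplifying assumptions. `Outcome`, `Runtime`, `call`, and `run` are hypothetical names, skills are plain Python callables that receive the runtime as their first argument, and the runtime's record is returned alongside the skill's result instead of replacing it. The point it demonstrates is that the record is written by the runtime at the call boundary, so a skill that catches a child's exception cannot remove the entry.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Outcome:
    skill: str
    ok: bool = True
    error: Optional[str] = None
    children: list = field(default_factory=list)

    def descendant_failures(self):
        """Every failed call anywhere below this node, in call order."""
        found = []
        for child in self.children:
            if not child.ok:
                found.append((child.skill, child.error))
            found.extend(child.descendant_failures())
        return found


class Runtime:
    def __init__(self, skills):
        self._skills = skills  # skill name -> callable(runtime, *args)
        self._stack = []       # records for the calls currently in flight

    def call(self, name, *args):
        record = Outcome(skill=name)
        if self._stack:
            self._stack[-1].children.append(record)
        self._stack.append(record)
        try:
            return self._skills[name](self, *args)
        except Exception as exc:
            # Recorded before the exception reaches any caller's handler.
            record.ok = False
            record.error = repr(exc)
            raise
        finally:
            self._stack.pop()

    def run(self, name, *args):
        """Top-level entry: the skill's result plus the runtime's own record."""
        root = Outcome(skill="<run>")
        self._stack = [root]
        try:
            result = self.call(name, *args)
        except Exception:
            result = None
        return result, root.children[0]


def fetch(rt):
    raise TimeoutError("crm timed out")


def summarize(rt):
    try:
        rt.call("fetch")
    except TimeoutError:
        return "summary (no data)"  # swallows the failure and reports success


runtime = Runtime({"fetch": fetch, "summarize": summarize})
result, record = runtime.run("summarize")
```

Here `summarize` returns normally and its own record is marked successful, yet `record.descendant_failures()` still lists the `fetch` timeout, which is what an ancestor or auditor would consult.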
5.2 Supporting mechanisms
Around the failure-transparency guarantee, the runtime provides four additional composition properties. Each is built from known primitives. The contribution is their integration into a single runtime that governs agent-authored, LLM-orchestrated skill graphs.
Idempotent side effects. The LLM decides when to retry. The runtime ensures that a retried composition never double-executes a side-effecting operation. A three-tier deduplication precedence resolves an idempotency key for every dispatch, with per-skill time-to-live and semantic projection over nested arguments.
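A minimal sketch of deduplicated dispatch, assuming a hypothetical `IdempotentDispatcher` with an in-memory cache. It does not reproduce the three-tier precedence or the semantic projection mentioned above; it models only two behaviours: a caller-supplied key wins when present, and otherwise the key is derived from the skill name and a canonical form of the arguments, with a per-call time-to-live.

```python
import hashlib
import json
import time


class IdempotentDispatcher:
    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested deterministically
        self._cache = {}     # idempotency key -> (expires_at, result)

    @staticmethod
    def derive_key(skill, args):
        """Hash of the skill name plus arguments in a key-order-independent form."""
        canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(f"{skill}:{canonical}".encode()).hexdigest()

    def dispatch(self, skill, fn, args, ttl_seconds, key=None):
        key = key or self.derive_key(skill, args)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # retry within the TTL: replay the result, skip the side effect
        result = fn(**args)
        self._cache[key] = (now + ttl_seconds, result)
        return result
```

Only successful executions are cached in this sketch, so a retry after an exception runs the operation again. A production version would also need durable storage and a policy for operations that fail after the side effect has already happened.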
Per-input failure isolation. A single malformed input can disable a capability for all callers if the failure triggers a global circuit breaker. The runtime instead maintains per-input-shape breakers. A poison input is suppressed while every healthy input continues to execute normally.
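The text does not define "input shape", so the sketch below assumes one plausible reading: the structure of a JSON-like payload, meaning its key names and value types with the values themselves discarded. `input_shape` and `ShapeBreaker` are hypothetical names, and the breaker has no half-open or reset-after-timeout state, which a real breaker would need.

```python
from collections import defaultdict


def input_shape(value):
    """Reduce a JSON-like payload to key names and value types (string keys assumed)."""
    if isinstance(value, dict):
        return tuple(sorted((str(k), input_shape(v)) for k, v in value.items()))
    return type(value).__name__


class ShapeBreaker:
    def __init__(self, threshold=3):
        self._threshold = threshold
        self._failures = defaultdict(int)  # (skill, shape) -> consecutive failures

    def call(self, skill, fn, payload):
        key = (skill, input_shape(payload))
        if self._failures[key] >= self._threshold:
            raise RuntimeError(f"breaker open for {skill} on this input shape")
        try:
            result = fn(payload)
        except Exception:
            self._failures[key] += 1
            raise
        self._failures[key] = 0  # a success closes the breaker for this shape
        return result
```

Because the failure counter is keyed by skill and shape together, a payload such as `{"amount": None}` can trip its own breaker while `{"amount": "42"}` continues to reach the same skill.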
Hash-chained composition lineage. Every composition decision is recorded in a per-run, cryptographically chained event log that supports replay and causal explanation of any composition outcome after the fact.
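A hash-chained log can be sketched in a few lines. The example assumes SHA-256 over a canonical JSON encoding and hypothetical names (`LineageLog`, `GENESIS`); the event fields are illustrative and not the platform's schema. Each entry commits to the hash of the entry before it, so altering any earlier event invalidates every hash that follows.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor for the first event in a run


def _digest(prev_hash, payload):
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


class LineageLog:
    def __init__(self):
        self.events = []

    def append(self, payload):
        prev_hash = self.events[-1]["hash"] if self.events else GENESIS
        self.events.append(
            {"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)}
        )

    def verify(self):
        """Recompute the chain from the start and compare against stored hashes."""
        prev_hash = GENESIS
        for event in self.events:
            expected = _digest(prev_hash, event["payload"])
            if event["prev"] != prev_hash or event["hash"] != expected:
                return False
            prev_hash = event["hash"]
        return True
```

A chain like this detects edits by anyone who cannot rewrite the whole log. It does not by itself protect against a party with full write access to the store, who could recompute every hash.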
Emergent capability recommendation. The runtime observes which capabilities succeed together in production and assembles them into a co-execution graph. The ranking weights each successful pairing by the breadth of independent agents that have validated it, guarding against a single agent's repetition being mistaken for genuine consensus. These proven compositions are surfaced as suggestions the agent may adopt or ignore, never as prescriptions.
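One simple way to weight a pairing by breadth instead of volume is to count distinct agents per pair. The sketch below assumes that scoring rule and hypothetical names (`CoExecutionGraph`, `record_success`, `suggest`); the actual ranking function is not specified in the text and may be more elaborate.

```python
from collections import defaultdict
from itertools import combinations


class CoExecutionGraph:
    def __init__(self):
        # (skill_a, skill_b) with skill_a < skill_b -> set of agents that succeeded with both
        self._agents = defaultdict(set)

    def record_success(self, agent_id, skills):
        """Record one successful run in which all of the given skills executed together."""
        for a, b in combinations(sorted(set(skills)), 2):
            self._agents[(a, b)].add(agent_id)

    def suggest(self, skill, limit=3):
        """Rank co-executed skills by how many distinct agents validated the pairing."""
        scores = {}
        for (a, b), agents in self._agents.items():
            if skill == a:
                scores[b] = len(agents)
            elif skill == b:
                scores[a] = len(agents)
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:limit]
```

Since each pair stores a set of agent identifiers, one agent repeating the same composition fifty times contributes a score of one, while two different agents using a pairing once each contribute a score of two. The output is a list of suggestions, and nothing in the sketch forces the caller to use them.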
5.3 Authorization consistency
Because there is no per-skill protocol boundary between the authorization decision and the execution of the action, authorization resolves as a predicate evaluated on every dispatch within the same consistency domain as execution itself. Revocation takes effect on the next dispatch. There is no second system to propagate to. The authorization drift that arises in architectures where policy and execution occupy separate consistency domains, connected by a channel with non-zero latency, does not arise here. The predicate and the execution share one domain by construction.
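The consistency argument can be made concrete with a small sketch. It assumes a hypothetical `Dispatcher` in which grants live in the same in-process structure that the dispatch path reads; `register`, `grant`, `revoke`, and `dispatch` are illustrative names. There is no cached copy of the policy, so there is nothing for a revocation to propagate to.

```python
class Dispatcher:
    def __init__(self):
        self._grants = set()  # (agent_id, integration) pairs: the single source of truth
        self._skills = {}     # skill name -> (integration, callable)

    def register(self, name, integration, fn):
        self._skills[name] = (integration, fn)

    def grant(self, agent_id, integration):
        self._grants.add((agent_id, integration))

    def revoke(self, agent_id, integration):
        self._grants.discard((agent_id, integration))

    def dispatch(self, agent_id, name, *args):
        integration, fn = self._skills[name]
        # The predicate is evaluated on every dispatch against live state.
        if (agent_id, integration) not in self._grants:
            raise PermissionError(f"{agent_id} is not authorized for {integration}")
        return fn(*args)
```

In this sketch a call made immediately after `revoke` fails, because the check and the execution read the same state within the same call.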
5.4 Scaling properties
The scaling difference between the two architectures is structural and follows from their respective topologies.
In a middleware architecture, capability growth that requires new externally operated servers introduces costs that are inherent to the boundary: serialization overhead, an independent authorization domain, a deployment artifact, and a failure surface. These costs are additive and persist regardless of whether the server was human-authored or agent-generated.
In HeartBeatAgents, every new capability is an in-process skill executing through the same runtime, the same broker, and the same authorization predicate. Adding a skill does not introduce a new protocol boundary, a new independently operated authorization domain, or a new deployment artifact. The capability space available for composition grows with the number of verified skills, while the cost per composition step remains structurally constant: one in-process invocation under one authorization predicate.
6. Enterprise Deployment
The architecture produces several security and compliance properties as direct consequences. The credential broker treats the LLM as an untrusted principal with respect to credentials: the agent receives only opaque, run-scoped handles and cannot read, redirect, override, or persist any credential. Scoped binding prevents lateral movement across the multi-agent system. Every action and every composition decision is recorded in independent per-run cryptographic hash chains that are replayable and independently verifiable. Tamper-evidence holds within the database trust boundary, and we state that bound explicitly. The platform is self-contained: it requires no external tool servers, retrieval proxies, or credential sidecars, whether deployed on a single machine or across a Kubernetes cluster. These are rigorous applications of known patterns to the agent-as-adversary threat model. They are differentiators, not breakthroughs, and we label them as exactly that.
7. Honest Scope and Limitations
We claim no new model, reasoning algorithm, or training method. The individual primitives are standard and well understood. The contribution is their synthesis into a single architecture where authority, intelligence, and judgment occupy separate domains.
The middleware-free model accepts real trade-offs. It does not participate in shared protocol ecosystems of community-maintained tool servers. Capabilities are confined to a single language runtime. Crash isolation of agent-authored code relies on static gating and runtime guards rather than hard process boundaries.
We do not claim universal superiority. We claim this is the correct architecture for a self-contained, enterprise-deployed agent habitat where the platform owns the entire execution surface. For this class of system, the properties it produces are unachievable by any architecture that retains a protocol-mediated tier between authority, intelligence, and execution.
8. Conclusion
The middleware tier that the industry interposes between agents and their capabilities conflates two concerns that should never have been combined: the authority to access a service and the intelligence to use it. More fundamentally, it creates a structural pressure against agent self-extension. Every capability an agent creates within a middleware architecture introduces protocol boundaries and operational cost inherent to the topology. The system's complexity grows as the agent extends itself.
HeartBeatAgents resolves this by separating authority from intelligence and eliminating the middleware tier. The operator contributes scoped authority. The agent contributes intelligence. The platform serves as judge. No domain crosses into another. Agent-authored capabilities enter the system as in-process skills, not as deployed infrastructure, so self-extension and the architecture are structurally aligned.
The result is a platform where the agent's capability is bounded only by the authority it has been granted and the standard it must meet. The verification rigor does not relax. The security properties do not degrade. The architectural cost does not compound.
Related Work
Model Context Protocol
The Model Context Protocol (Anthropic, 2024) defines the canonical middleware pattern: capabilities are exposed through protocol-mediated tool servers and discovered through a shared interoperability layer. Our thesis argues against this topology on structural grounds, specifically that protocol boundaries introduce operational costs that are inherent to the architecture and create a scaling pressure between capability growth and system complexity. MCP optimizes participation in a shared ecosystem of externally operated capability providers. HeartBeatAgents takes a different approach: capabilities are authored and verified within the runtime itself, entering the system as reusable skills rather than externally operated infrastructure. The resulting capability space is bounded by granted authority and verification standards rather than by the catalog of available tool servers.
LangChain and LangGraph
LangChain and LangGraph provide orchestration frameworks for agent execution, tool invocation, and workflow composition. They operate at a different architectural layer than HeartBeatAgents, focusing on orchestration rather than platform-level concerns such as authority management, capability verification, and runtime governance. Consequently, they do not directly address the authority-intelligence separation discussed in this paper.
AutoGPT and autonomous-agent projects
AutoGPT and similar autonomous-agent projects demonstrated LLM-driven tool use and self-directed task completion. Self-extension in HeartBeatAgents shares that ambition but differs in mechanism. Our contribution is the verification regime and the architectural model under which autonomous capability growth does not introduce additional protocol boundaries, deployment artifacts, or authorization domains as new capabilities are created.
Capability-based security
Capability-based security (Dennis and Van Horn, 1966) and object-capability models (Miller, 2006) inform the credential-broker design. The application to LLM agents as untrusted principals, where the threat model is prompt injection rather than traditional privilege escalation, draws on established patterns and is not a novel security primitive.
Voyager
Voyager (Wang et al., 2023) demonstrated that LLM agents can author and reuse capabilities in simulated environments. HeartBeatAgents shares that ambition but operates under a different set of constraints: real credentials, real side effects, and enterprise governance requirements. Our contribution is not skill reuse itself, but the runtime architecture, verification regime, and authority model required to support self-extending agents in production environments.
References
- Anthropic. Model Context Protocol Specification. 2024. Available at: https://modelcontextprotocol.io
- J. B. Dennis and E. C. Van Horn. Programming Semantics for Multiprogrammed Computations. Communications of the ACM, 9(3):143-155, 1966.
- M. S. Miller. Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control. PhD thesis, Johns Hopkins University, 2006.
- S. Yao, J. Zhao, D. Yu, et al. ReAct: Synergizing Reasoning and Acting in Language Models. ICLR, 2023.
- N. Shinn, F. Cassano, A. Gopinath, et al. Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS, 2023.
- LangChain. LangGraph: Build Resilient Language Agents as Graphs. 2024. Available at: https://github.com/langchain-ai/langgraph
- G. Wang, Y. Xie, Y. Jiang, et al. Voyager: An Open-Ended Embodied Agent with Large Language Models. NeurIPS, 2023.
The contribution is the synthesis: not a new algorithm, but a new architecture for building and operating the system around the models. The runtime is the safety layer. The ceiling imposed by middleware is removed. The standard is raised. Both hold in the same architecture.
Mosaic Singularity • June 2026