Why AI agent systems get messy — and why HEINI is built around order, roles, and approval

Open any enterprise-AI conversation in 2026 and you will hear the same three words: agents, fleets, orchestrator. Major platforms keep rolling out their own agent stacks — a new logo every week, a new framework every week. But if you read what the public research community is writing, the tone is quieter. And it is not about more models. It is about a different problem: order.

The problem that fewer people say out loud

A much-cited study by Cemri et al. (Berkeley, 2025) examined more than a thousand traces from multi-agent systems and annotated where they went wrong. The result is unspectacular and clear at the same time: the more agents are meant to talk to one another, the more often they talk past one another. The more complex the handoffs, the more often ownership is lost. Specific categories of failure (misread tasks, unclear handovers, conflicting states) recur with regularity across frameworks.¹

A second public paper, this time from the systems-architecture community (SIGARCH, Yu & Zhao, January 2026), describes a related issue: shared operating state between agents is still an open research problem. The paper notes, bluntly, that there are hardly any established patterns for letting several agents work reliably on the same work item without overwriting one another.²

Put the two papers next to each other and a pattern emerges. The current wave is not mainly about unsolved model problems. It is about unsolved organisation problems. And you do not solve organisation problems with yet another model. You solve them with structure.

What happens when structure is missing

Picture a new wing in an office building. Three capable assistants, all qualified, all eager to help. No receptionist. No inbox. No roles. All of them writing into the same files. All of them answering the same calls. All of them making decisions because they “can”.

That — simplified — is what many agent setups look like in practice. If every instance is allowed to write wherever it happens to be, if there is no clear intake layer, if handovers happen verbally and spontaneously, you get exactly what the papers above measure: overwrites, misunderstandings, duplicate work, dropped cases.

That is not the fault of the models. It is a design problem.

How HEINI sidesteps it — in language a managing partner understands

HEINI is deliberately built differently. Not because we think we are smarter, but because we come out of a document-heavy mid-market world where order is not ornamental — it is the precondition for anything else.

Three design choices shape the architecture:

1. A receptionist at the front. HEINI has a first, deliberately narrow layer — Susi. Her job is not to answer cleverly. Her job is to confirm incoming items, classify them, and hand them in the right direction. She is not an orchestrator who runs everything; she is the first orderly door. It sounds unexciting — that is exactly why it works.

2. Specialist colleagues behind her, with clear ownership. Behind Susi sit specialised AI colleagues grouped by role. Accounting, sales, shipping, customs — each role has a defined scope. None writes into another’s space. None approves what does not fall into its own scope. This separation is not a policy decision; it is wired into the software.

3. Approval stays. HEINI prepares cases, flags open points, and queues them for approval. No case leaves the house without a human in a defined role signing off on it. That is the opposite of “fully automatic” logic, and the opposite of a black box. Approvals are not an annoying brake; they are the point where human responsibility and machine preparation meet.
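
For readers who think in code, the three choices above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not HEINI's actual implementation: every name in it (Role, Case, ROUTING_KEYWORDS, intake, prepare, approve, release) is hypothetical, and the keyword lookup merely stands in for whatever classification a real intake layer would use.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    ACCOUNTING = "accounting"
    SALES = "sales"
    SHIPPING = "shipping"
    CUSTOMS = "customs"


# Hypothetical keyword table, for illustration only.
ROUTING_KEYWORDS = {
    "invoice": Role.ACCOUNTING,
    "quote": Role.SALES,
    "delivery": Role.SHIPPING,
    "tariff": Role.CUSTOMS,
}


@dataclass
class Case:
    case_id: str
    text: str
    owner: Optional[Role] = None
    draft: Optional[str] = None
    approved_by: Optional[str] = None


def intake(case: Case) -> Case:
    """Narrow first layer: classify the item and hand it to exactly one role."""
    lowered = case.text.lower()
    for keyword, role in ROUTING_KEYWORDS.items():
        if keyword in lowered:
            case.owner = role
            return case
    # No guessing: an unmatched item is rejected here so a person can triage it.
    raise ValueError(f"case {case.case_id}: no role matched, manual triage needed")


def prepare(case: Case, acting_role: Role, draft: str) -> None:
    """Only the role that owns the case may write a draft into it."""
    if case.owner is None or case.owner is not acting_role:
        raise PermissionError(f"{acting_role.value} does not own case {case.case_id}")
    case.draft = draft


def approve(case: Case, approver: str, approver_role: Role) -> None:
    """A named human in the owning role signs off on a prepared draft."""
    if case.draft is None:
        raise ValueError(f"case {case.case_id}: nothing prepared yet")
    if case.owner is not approver_role:
        raise PermissionError(f"{approver_role.value} cannot approve case {case.case_id}")
    case.approved_by = approver


def release(case: Case) -> str:
    """Nothing leaves the house without an approval on record."""
    if case.approved_by is None or case.draft is None:
        raise PermissionError(f"case {case.case_id}: not approved")
    return case.draft
```

The point of the sketch is that the rules live in the functions, not in a policy document: a sales role calling prepare on an accounting case fails, and release fails until approve has recorded a named person.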

The result is a system that does not impress by “more AI”, but by less disorder.

What that means in daily operations

In a mid-market company running HEINI, you do not see a fleet of agents messaging each other. You see:

An intake layer that accepts cases calmly.
A visible path from request to preparation to approval.
An audit trail where every step can be read back — what HEINI suggested, why, from which source, which role decided.
Existing systems that stay in place: ERP, DMS, storage, inbox. HEINI works on top of them, not around them.
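
To make the audit-trail point concrete, here is a small sketch of what a readable trail could look like in code. It is an assumption-laden illustration, not HEINI's real data model: AuditEntry, AuditTrail, and all field names are hypothetical, chosen to mirror the four things named above (what was suggested, why, from which source, which role decided).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)  # frozen: an entry cannot be edited after it is written
class AuditEntry:
    case_id: str
    step: str             # e.g. "intake", "preparation", "approval"
    suggestion: str       # what was suggested
    reason: str           # why it was suggested
    source: str           # which existing system or document it came from
    decided_by_role: str  # which role decided
    recorded_at: datetime


class AuditTrail:
    """Append-only log: entries can be added and read back, nothing else."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, case_id: str, step: str, suggestion: str,
               reason: str, source: str, decided_by_role: str) -> AuditEntry:
        entry = AuditEntry(
            case_id=case_id,
            step=step,
            suggestion=suggestion,
            reason=reason,
            source=source,
            decided_by_role=decided_by_role,
            recorded_at=datetime.now(timezone.utc),
        )
        self._entries.append(entry)
        return entry

    def history(self, case_id: str) -> Tuple[AuditEntry, ...]:
        """Every step for one case, in the order it was recorded."""
        return tuple(e for e in self._entries if e.case_id == case_id)
```

Calling history for a case id returns the steps for that case in recording order, which is the "read back" described above. A production system would also need durable storage and access control, which this sketch leaves out.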

That is less spectacular than “Agent A calls Agent B, which in turn commissions Agent C”. It is also auditable. It is manageable. It is — in the full sense of the word — operationally usable.

Order is not a renunciation of intelligence

One misunderstanding comes up in almost every conversation. Some listeners assume that a setup with a receptionist, roles, and approvals is somehow less “modern” than a free-floating agent architecture. Our answer is calm: order does not reduce the intelligence of the underlying models. Order reduces the places where intelligence runs into a wall.

HEINI uses strong models. HEINI swaps them out when better ones appear. But HEINI does not run them loose. HEINI runs them in clearly defined roles, on clearly defined paths, with a clear approval at the end. That is not a renunciation of intelligence. That is the frame in which intelligence becomes usable in a European mid-market setting in the first place.

Why that matters to the people we talk to

The people we talk to are not Silicon Valley early adopters. They are managing directors of mid-market companies, authorised signatories (Prokurist:innen), workshop leads, office organisers. Their questions are rarely “which model do you use”. Their questions are:

“Who approves it?”
“Where does it say afterwards who did what?”
“What happens if your AI gets it wrong?”
“Do my data stay where they are now?”
“Do I have to rebuild anything?”

HEINI has a direct answer to each of those. Not because we perfected the slides, but because the architecture was built so that these questions stay answerable. Roles, approval, audit trail, data sovereignty, compatibility with existing systems: these are not marketing words. They are the answers to the five questions above.

What we explicitly do not claim

We do not claim to be the only ones thinking about these themes. We do not claim other systems are bound to fail. We only observe what the public research community currently describes as structurally critical, and we build in a way that addresses those points from the start, not as an afterthought.

Readers who want to go deeper can. In our security and enterprise pages we show how roles, approvals, and audit logging work in practice, where the data live, and how HEINI fits into existing system landscapes. For due-diligence conversations, the press and contact channels are the way in.

Takeaway

The wave of agent platforms is real, and many of them will do useful work in specific slots. The real difference starts where many agents turn into a team that can be led. That takes an orderly intake layer, clear roles, traceable handovers, and an approval that stays with the human. HEINI is built for that: not as a show, but as a calm, grown-up operating frame for mid-market businesses.

¹ Cemri et al., “Why Do Multi-Agent LLM Systems Fail?”, arXiv:2503.13657. Analysis across multiple frameworks and over a thousand annotated traces. Cited in the NeurIPS 2025 Spotlight context. arxiv.org/abs/2503.13657
² Yu & Zhao, “Multi-Agent Memory from a Computer-Architecture Perspective”, SIGARCH Computer Architecture Today, 20 January 2026. sigarch.org