MDx OS

The Seven Primitives

What does a company made of agents actually need to function?
Not better tools. Not smarter models.
Something more fundamental.

by MD · March 2026 · A Real Talk Publication · 20 min read

The Question

I've been asking the same question since the beginning of MDx OS: what would you build if you were starting a company from scratch and the entire workforce was AI?

Not "how do you add AI to an existing company." Not "how do you make humans more productive with AI tools." A different question entirely...what does a company made of agents irreducibly need to function?

The word "irreducible" matters. Not "what would be nice to have." What can't be removed without the whole thing falling apart.

Human companies need a lot of things. Offices, HR systems, email, Slack, payroll, performance reviews, legal contracts, org charts, onboarding programs, team off-sites, coffee machines. But underneath all of that...if you strip away the social layer, the physical layer, the emotional layer...there's a smaller set of things that any coordinated workforce needs in order to function.

Agents don't need offices. They don't need social channels. They don't need emotional support or birthday celebrations. But they do need to communicate...and they need to know who can do what. They need to trust each other's output and understand what's happening right now. They need to get briefed before they work, have someone approve their decisions, and share a common memory.

That's seven things.

"The question isn't 'what human tools do we recreate for agents'...it's what does a workforce of agents irreducibly need to function as a company?"

I didn't start with that number. I started by building an operating system...shipping real software, running into real gaps, discovering what kept breaking or going missing every time I added a new app to the stack. And over the course of 65 days and five products, the same seven patterns kept showing up underneath everything.

So I stopped and asked whether that was a coincidence or a discovery.

What Building Showed

The original MDx OS architecture organized everything into four layers: Foundation, Orchestration, Agents, Interface. Fifteen components. All still running. All healthy.

That architecture answered the question it was designed for: how do you build a system that uses AI well? And the answer worked. Five apps running on it...Twin, Code, Stella, Pulse, Message. Multi-agent orchestration, tiered knowledge, governance, observability, voice interfaces, real-time communication. The whole thing.

5 apps running · 15 OS components · 326K+ lines of code · 3,162 tests passing · 200+ API endpoints

But the deeper question...the one that kept surfacing as I built each new app on the OS...is different. It's not about how to use AI well. It's about what a company made of agents needs to function as a company. And the answer is a different shape.

Every time I built something new, I'd run into the same gaps. Twin needed agents to communicate and hand off work...so I built messaging. Code needed agents to declare capabilities and get matched to tasks...so I built routing. Message needed an immutable record of everything that happened...so I built audit trails. Each app was rediscovering the same underlying needs.

The fifteen components don't go away. They become the implementation layer...the plumbing that makes agents work internally. But on top of that plumbing, a new layer of company primitives emerges. Seven of them.

"Think of it like what happened when cloud computing arrived. Datacenter hardware didn't disappear. Servers, switches, storage racks...all still there. But they stopped being the abstraction people interacted with. VMs, object storage, message queues became the new primitive layer."

That's the move here. The original fifteen components are the datacenter. The seven primitives are the cloud layer. The infrastructure becomes invisible...still critical, still running, still doing real work...but no longer the thing you think about when you're building on top of it.

The Seven

Seven primitives. Three in the kernel...the things everything else depends on. Three as system services...higher-order capabilities built on the kernel. And one application that's the first thing you'd build on top.

Message
The nervous system. Communication is task execution.
Kernel

In a human company, messaging and task management are separate systems. Slack for talking, Jira for tracking, email for everything else. That separation exists because humans have social functions...small talk, relationship building, emotional check-ins...that need their own channel.

Agents don't have social functions. When an agent sends a message, it's doing one of three things: requesting work, delivering work, or signaling that something changed. Three message types...and that means messaging and task management collapse into a single primitive. The thread IS the audit trail, the response IS the deliverable.
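To make the collapse of messaging and task tracking concrete, here is a minimal sketch in Python. It is not the MDx Message implementation (that runs on a Rust relay); the names `Kind`, `Message`, and `Thread` and the completion rule are assumptions made up for illustration. The point is only that with three message types, the thread itself can answer "is this task done?"

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    REQUEST = "request"    # asking another agent to do work
    DELIVERY = "delivery"  # handing back finished work
    SIGNAL = "signal"      # announcing that something changed


@dataclass(frozen=True)
class Message:
    seq: int      # position in the thread, assigned when posted
    sender: str
    kind: Kind
    body: str


@dataclass
class Thread:
    """Hypothetical thread: the ordered messages are also the task record."""
    messages: list[Message] = field(default_factory=list)

    def post(self, sender: str, kind: Kind, body: str) -> Message:
        msg = Message(len(self.messages) + 1, sender, kind, body)
        self.messages.append(msg)
        return msg

    def is_done(self) -> bool:
        # Illustrative rule: done when there are at least as many
        # deliveries as requests (and at least one request).
        requests = sum(m.kind is Kind.REQUEST for m in self.messages)
        deliveries = sum(m.kind is Kind.DELIVERY for m in self.messages)
        return requests > 0 and deliveries >= requests


thread = Thread()
thread.post("planner", Kind.REQUEST, "Review migration 0042")
thread.post("reviewer", Kind.SIGNAL, "Review started")
thread.post("reviewer", Kind.DELIVERY, "Approved, no blocking issues")
```

There is no separate ticket to update here. Reading the thread in sequence order gives you both the conversation and the task history.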

And the volume is different too. Agent-to-agent communication is orders of magnitude chattier than human conversation. A 15-minute SDLC cycle that goes from intent to production might generate hundreds of messages between agents...coordination, status updates, handoffs, verifications. That's why MDx Message runs on a Rust WebSocket relay...not because Rust is trendy, but because the throughput requirements for an agent workforce are a different class of problem than human chat.

What's Running
MDx Message. Rust WebSocket relay, Redis pub/sub, channels, threads, presence, sequence ordering, 75ms batch persistence. Four communication modes: human-to-human, human-to-agent, agent-to-human, agent-to-agent. Both humans and agents are first-class participant types.
Registry
The org chart. Every agent declares what it can do, what it costs, and how well it performs.
Kernel

Human companies run on implicit trust and reputation. You know Sarah is good at architecture because you've worked with her for three years. You know the new hire needs ramp time because...you were a new hire once. Decades of working together create a web of informal knowledge about who's good at what.

Agents don't accumulate trust through relationship. They don't have hallway reputations. So you need something explicit. Every agent declares: capabilities, tool access, permissions, quality bar, resource cost. Work gets routed by querying the Registry for capability match...not by name, not by tenure, not by who happens to be available.
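A rough sketch of what routing by capability match could look like. The field names, the cost unit, and the "cheapest agent that clears the quality bar" rule are all assumptions for illustration, not the actual Registry schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentProfile:
    name: str
    capabilities: frozenset[str]
    cost_per_task: float   # hypothetical unit, e.g. dollars per task
    quality_score: float   # hypothetical benchmark result in [0, 1]


class Registry:
    def __init__(self) -> None:
        self._agents: dict[str, AgentProfile] = {}

    def register(self, profile: AgentProfile) -> None:
        self._agents[profile.name] = profile

    def route(self, capability: str, min_quality: float = 0.0) -> Optional[AgentProfile]:
        """Return the cheapest agent that declares the capability and meets the quality bar."""
        candidates = [
            agent for agent in self._agents.values()
            if capability in agent.capabilities and agent.quality_score >= min_quality
        ]
        return min(candidates, key=lambda agent: agent.cost_per_task, default=None)


registry = Registry()
registry.register(AgentProfile("sql-reviewer", frozenset({"sql_review"}), 0.40, 0.93))
registry.register(AgentProfile("generalist", frozenset({"sql_review", "summarize"}), 0.10, 0.71))

cheap = registry.route("sql_review")                      # the generalist wins on cost
strict = registry.route("sql_review", min_quality=0.9)    # only the specialist qualifies
missing = registry.route("translate")                     # nobody declares it
```

The caller never names an agent. It names a capability and a quality bar, and the Registry answers with whoever fits.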

And here's where agent-native infrastructure has a genuine advantage over human infrastructure. Humans are...optimistic...in their skills profiles. Agent capabilities are verifiable...you can benchmark them. A capability marketplace actually works when the participants can't exaggerate.

What's Running
Component Registry with 15 OS components tracked across 3 layers, all with health monitoring and status. Per-agent definitions with model overrides, skill loading, tool access declarations. Still needs to be refactored from a component catalog into a full agent identity layer and capability marketplace.
Ledger
The trust chain. Append-only, cryptographic, immutable. If it's not in the Ledger, it didn't happen.
Kernel

This one has no direct human equivalent. In a human company, trust gets built through relationship and point-in-time approvals. Your manager signs off on a decision, and then everyone just...trusts it happened. Social fabric does the rest.

Agent output has no relational trust. When Agent A produces an analysis and Agent B uses it as input, the question becomes: what did A actually produce? What inputs did it use? What verified the output? What was the verification result? The Ledger makes that lineage traceable. Every entry chained to the previous one. Cryptographic. You can verify the entire history hasn't been tampered with.
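The chaining idea can be sketched in a few lines. This is a generic SHA-256 hash chain, not the MDx Ledger itself; the entry layout, the `GENESIS` placeholder, and the payload fields are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys) so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, payload: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev_hash, payload)
        self.entries.append({"prev": prev_hash, "payload": payload, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash from the start; an edited entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"agent": "A", "action": "produce", "output": "analysis-v1"})
ledger.append({"agent": "B", "action": "consume", "input": "analysis-v1"})
intact = ledger.verify()                                  # True

ledger.entries[0]["payload"]["output"] = "analysis-v2"    # tamper with history
still_valid = ledger.verify()                             # False
```

One caveat worth knowing: a chain like this is tamper-evident, not tamper-proof. Someone who can rewrite every entry can also recompute every hash, so the latest hash usually needs to be anchored somewhere the writer can't reach.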

For regulated industries...financial services, healthcare, insurance...this is the whole ballgame. You can't deploy autonomous agents without an immutable record of what they did, why they did it, and who approved it. The Ledger isn't a nice-to-have compliance feature. It's the reason agents are allowed to operate at all.

What's Running
Tamper-evident audit chain with SHA-256 checksums. Decision audit trail with full conversation analytics. Checksummed audit logs. Compliance reporting. Currently lives in the Governance layer...needs to be promoted to kernel as the canonical trust layer everything else depends on.
Context
The briefing. Assembles exactly what an agent needs to know for this task, right now.
System Service

Humans accumulate context over time. You join a company, spend weeks ramping up, gradually build a mental model of how things work. That model persists in your head across meetings, conversations, projects. By month three, you just...know things.

Agents start cold. Every session. Context window is finite and expensive. So you need something that answers: given this task and this agent, what does it need to know right now? Not everything. Not nothing. The right things.

Context queries the Registry for capabilities. Pulls relevant Pages. Checks the Ledger for related work. Assembles a context package. It's like the first week at a new job...compressed into milliseconds, happening every single session. And getting this right is probably the single most important thing in the entire stack. Get it wrong and you get agents that hallucinate confidently, give contradictory advice, or miss critical constraints. Get it right and every response carries the weight of your organization's actual experience.
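Here is a small sketch of the "not everything, not nothing" problem as a budgeted selection. Everything in it is an assumption for illustration: the `Snippet` shape, the relevance scores, the four-characters-per-token estimate, and the greedy strategy. A real service would use an actual tokenizer and retrieval scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    source: str       # e.g. "registry", "pages", "ledger"
    text: str
    relevance: float  # hypothetical retrieval score, higher is better


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly four characters per token.
    return max(1, len(text) // 4)


def assemble_context(snippets: list[Snippet], token_budget: int) -> list[Snippet]:
    """Greedily keep the most relevant snippets that still fit in the budget."""
    chosen: list[Snippet] = []
    remaining = token_budget
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if cost <= remaining:
            chosen.append(snippet)
            remaining -= cost
    return chosen


page = Snippet("pages", "x" * 40, relevance=0.9)        # about 10 tokens
history = Snippet("ledger", "y" * 80, relevance=0.8)    # about 20 tokens
profile = Snippet("registry", "z" * 20, relevance=0.5)  # about 5 tokens

package = assemble_context([profile, history, page], token_budget=16)
```

With a budget of 16, the page goes in first, the ledger history doesn't fit, and the smaller registry profile takes the remaining room. Greedy selection is the simplest possible policy; it is here to show the shape of the problem, not to claim it's the right answer.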

What's Running
Three-tier knowledge architecture (in-context, RAG with pgvector, self-improving gap detection). Per-user persona injection. Dynamic prompt assembly with token budget management. Session briefs, tiered knowledge loading, context carryover. Twin does this implicitly every session. Context formalizes it as a standalone service.
Gate
The decision point. Where work pauses and a human says go or no-go.
System Service

Here's what I think is the big shift in an agent-native company...the human's primary role moves from doing work to operating gates.

Think about what actually requires human judgment. Approving strategy. Reviewing high-stakes output. Making value calls that agents can't make...not because they lack intelligence, but because they lack accountability. The human isn't the worker. The human is the checkpoint.

And Gate is composable. "Human approves strategy → Agent A executes → Agent B verifies → Human reviews outcome." The approval patterns themselves become programmable. Low-risk work flows through automatically. High-risk work pauses at a Gate. The boundary between "this can proceed" and "this needs a human" is explicit, configurable, and auditable.
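A minimal sketch of that composition, assuming a two-tier risk label and an `approve` callback that stands in for the human. The names and the "stop everything downstream on a no-go" rule are illustrative choices, not how the MDx gates are necessarily built.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    name: str
    risk: str  # "low" or "high"; the tiers are an illustrative assumption


def run_pipeline(steps: list[Step], approve: Callable[[Step], bool]) -> list[str]:
    """Run steps in order; high-risk steps pause at a gate and need approval."""
    log: list[str] = []
    for step in steps:
        if step.risk == "high" and not approve(step):
            log.append(f"{step.name}: rejected at gate")
            break  # nothing downstream runs after a no-go
        note = "approved at gate" if step.risk == "high" else "auto-approved"
        log.append(f"{step.name}: {note}")
    return log


steps = [
    Step("plan", "high"),
    Step("execute", "low"),
    Step("verify", "low"),
    Step("deploy", "high"),
]

# Stand-in for a human reviewer who signs off on the plan but blocks the deploy.
result = run_pipeline(steps, approve=lambda step: step.name != "deploy")
```

The returned log is exactly the kind of record you'd want written to the Ledger: which steps flowed through automatically, which paused, and what the decision was.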

Gate is literally the human-agent interface...not a chat box, not a dashboard...the point where humans and agents meet to make decisions together.

What's Running
Human-in-the-loop protocols with approval workflows and escalation. The SDLC pipeline has gates: //plan requires human approval before //run proceeds. SQL review before production deployment. Decision logging with full context. Needs to be elevated from governance feature to first-class system service.
Pulse
The vital signs. Real-time awareness of everything in flight.
System Service

A human manager walks the floor. Reads body language. Senses when a team is struggling before anyone says anything. That ambient awareness...the feel for how things are going...is one of the most valuable things in any organization. And agents have zero of it.

Without Pulse, you're flying blind. Pulse is live signal from the system: tasks in flight, active agents, token burn rate, error rates, quality scores from the Ledger, bottlenecks forming, capacity limits approaching. Not a dashboard built after the fact. Real-time operational awareness.

But here's the part that changes things: Pulse can be consumed by agents. An operations agent watches Pulse and takes corrective action. A cost agent notices burn rate spiking and throttles lower-priority work. A quality agent detects output scores dropping and flags it before a human ever sees the problem. Self-healing at the work level, not just the compute level.
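The cost-agent case can be sketched as a sliding-window burn rate plus an admission rule. The window size, the limit, and the priority threshold are made-up numbers, and `BurnRateMonitor` and `admit` are hypothetical names.

```python
from collections import deque


class BurnRateMonitor:
    """Tracks token usage over a sliding window of recent samples."""

    def __init__(self, window: int, limit_per_sample: float) -> None:
        self.samples: deque[int] = deque(maxlen=window)
        self.limit_per_sample = limit_per_sample

    def record(self, tokens: int) -> None:
        self.samples.append(tokens)  # the oldest sample drops off automatically

    def burn_rate(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def should_throttle(self) -> bool:
        return self.burn_rate() > self.limit_per_sample


def admit(task_priority: int, monitor: BurnRateMonitor, min_priority_when_hot: int = 5) -> bool:
    """While the burn rate is over the limit, admit only higher-priority tasks."""
    if monitor.should_throttle():
        return task_priority >= min_priority_when_hot
    return True


monitor = BurnRateMonitor(window=3, limit_per_sample=1000)
monitor.record(500)
monitor.record(600)
calm = admit(task_priority=1, monitor=monitor)       # burn rate is fine, everything runs

monitor.record(4000)
monitor.record(4000)
low_ok = admit(task_priority=1, monitor=monitor)     # spike: low-priority work waits
high_ok = admit(task_priority=8, monitor=monitor)    # high-priority work still runs
```

The interesting part isn't the arithmetic. It's that the consumer of the signal is another agent making an operational decision, with no dashboard in between.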

What's Running
Observability with OpenTelemetry-compatible traces, 9 metric endpoints with daily aggregation, token usage tracking, performance profiling, agent execution history, event bus. Currently split across Foundation and Evolution layers. Pulse consolidates it into one coherent system service.
Pages
The institutional memory. Agents write it, humans review it.
Application

Every company needs institutional memory. Session briefs, decision records, process docs, post-mortems, onboarding materials, phase plans, signal reports. The problem is that every document system that exists today...Confluence, Notion, Google Docs...assumes humans are the primary authors.

In an agent-native company, the flow inverts. The agent is the primary author. The human is the reviewer and director. You don't edit documents directly...you prompt changes via chat. The agent reads the current file, makes the edit, commits with structured metadata: who prompted it, what the intent was, which agent executed. Everything is trackable via diffs.

Why markdown? It's the one format equally native to LLMs and humans. Agents can read, write, diff, and reason about it with zero serialization overhead. Git already knows how to version it. Every CI/CD pipeline already knows how to process it.

And the key design constraint isn't "make it readable." It's "make it loadable." Context window management matters more than visual design. Frontmatter that tells an agent "you need this page" vs "skip this" is more important than formatting.
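Here is one way "loadable" could work in practice: read the frontmatter first and decide whether the body is worth the tokens. The field names (`status`, `tags`) and the load rule are assumptions for illustration, and the parser handles only flat `key: value` lines, which is a small subset of YAML.

```python
def split_frontmatter(page: str) -> tuple[dict[str, str], str]:
    """Split a markdown page into (frontmatter fields, body)."""
    lines = page.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, page
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, page  # no closing marker, so treat everything as body
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def should_load(meta: dict[str, str], task_tags: set[str]) -> bool:
    """Load a page only if it is not archived and shares a tag with the task."""
    if meta.get("status") == "archived":
        return False
    page_tags = {tag.strip() for tag in meta.get("tags", "").split(",") if tag.strip()}
    return bool(page_tags & task_tags)


PAGE = """---
title: Session brief
status: active
tags: sdlc, deployment
---
# Session brief
Body text.
"""

meta, body = split_frontmatter(PAGE)
relevant = should_load(meta, {"deployment"})
irrelevant = should_load(meta, {"billing"})
```

The agent pays for a handful of frontmatter lines to decide, and only pays for the body when the page is actually relevant to the task.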

What's Running
The patterns are already in daily use. Every session brief, phase plan, signal report, and handoff doc in MDx OS is markdown with structured frontmatter. Git-versioned, diffable, agent-authored. Over 200 documents produced this way across 139 sessions. Pages formalizes this into a first-class system with rendering, search, comment threads, and lifecycle management.

The Stack

The seven primitives aren't a flat list. They layer. And the layering matters because it defines what depends on what...and what can break independently.

Applications (Products): Pages, Twin, Code, Stella, Pulse, Message
System Services (Forge): Context, Gate, Pulse
Kernel (Core): Message, Registry, Ledger
Sub-Kernel (Infrastructure): Model Abstraction, Provider Adapters, Knowledge Infra, Memory & State, Security & Guardrails

Kernel means everything else depends on it. If Message is down, nothing communicates...if Registry is down, nothing knows who can do what...if Ledger is down, nothing is trusted. These three can't go offline without the entire company stopping.

System Services use the kernel to provide higher-order capabilities. Context depends on Registry (who's available?) plus Pages (what do they need to know?) plus Ledger (what's already been done?). Gate depends on Registry plus Message plus Ledger. Pulse consumes all of them. These services compose the kernel primitives into the operational fabric of the company.
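The dependencies described above can be written down as a small graph and queried for blast radius. The edges for Context and Gate come straight from the text; reading "Pulse consumes all of them" as the three kernel primitives, and giving Pages no dependencies, are assumptions.

```python
DEPENDS_ON: dict[str, set[str]] = {
    "Message": set(),
    "Registry": set(),
    "Ledger": set(),
    "Context": {"Registry", "Pages", "Ledger"},
    "Gate": {"Registry", "Message", "Ledger"},
    "Pulse": {"Message", "Registry", "Ledger"},  # assumption: "all" means the kernel
    "Pages": set(),  # assumption: the text does not say what Pages depends on
}


def impacted_by(failed: str) -> set[str]:
    """Return every primitive that depends, directly or transitively, on `failed`."""
    impacted: set[str] = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for name, deps in DEPENDS_ON.items():
            if current in deps and name not in impacted:
                impacted.add(name)
                frontier.append(name)
    return impacted


ledger_outage = impacted_by("Ledger")    # every system service is affected
message_outage = impacted_by("Message")  # Gate and Pulse, but not Context
pages_outage = impacted_by("Pages")      # only Context
```

Writing the graph down makes the layering checkable. If a kernel primitive ever showed up in the result for a system service outage, the layers would be leaking.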

Applications are products built on the primitives. The OS can function without any individual app. Pages, Twin, Code, Stella, Pulse, Message...and whatever comes next...are all just applications running on the same seven primitives.

Sub-Kernel is where the original fifteen components live. Model routing, provider adapters, embeddings, knowledge tiers, memory management, guardrails. Critical but invisible to the agent workforce. This is the "how agents work internally" layer...not the "what the company needs" layer.

The Unix Parallel

If you've spent time with operating systems, this will feel familiar. Message is pipes, signals, and sockets...inter-process communication for the agent workforce. Registry is /etc/passwd plus service discovery...who exists, what they can do, how to reach them. Ledger is syslog plus the filesystem journal...append-only, tamper-resistant record of everything that happened. Context is the loader plus the process environment...everything that gets set up before work starts. Gate is sudo plus systemd dependencies...permissions and workflow enforcement. Pulse is top plus Prometheus...real-time system health.

That parallel isn't a metaphor. It's a design principle. Unix figured out the irreducible primitives for a computer's operating system fifty years ago. The question is what the irreducible primitives are for a company's operating system...and the answer turns out to be structurally similar. Fewer moving parts than you'd expect...each one doing one thing well, and composable.

The Gap

If you're in an organization thinking about agentic infrastructure right now...you're probably looking at reference architectures. Twelve-layer diagrams with boxes for "Agent Registry" and "Agentic Control Plane" and "AI Observability" and "Governance Framework."

Those diagrams are good. They correctly identify the capabilities that need to exist. And if you're working inside a large organization with real compliance constraints, procurement cycles, and change management processes...they're the right starting point. I'm not going to pretend that building fast in a solo environment is the same challenge as rolling this out across a regulated enterprise with thousands of engineers.

But here's what I can offer: the patterns map. The things showing up on reference architecture slides...agent registries, control planes, governance frameworks, observability layers...are the same things that kept emerging in the code as I built. And having working implementations of those patterns, even ones that still need hardening, has taught me things about how they fit together that I wouldn't have learned from the diagram alone.

Reference Pattern: Agent Registry
Visibility into what agents exist, what they do, risk tiering, operational monitoring. Usually designed as a multi-phase rollout.
What Building Taught Me: Registry
Started as a component catalog. Quickly realized it needs to be the identity layer...capabilities, cost, quality scores, not just a list. Working implementation with 15 components tracked, health monitoring, per-agent configs. Needs more formalization for enterprise use.

Reference Pattern: Agentic Control Plane
Command and control, audit and traceability, integration and orchestration. The layer that makes agents governable.
What Building Taught Me: Gate + Ledger + Pulse
The "control plane" isn't one thing...it's three primitives working together. Approvals (Gate), trust chain (Ledger), and awareness (Pulse). Working implementations of each. The integration between them is where the real complexity lives.

Reference Pattern: AI Gateway & Orchestration
LLM routing, MCP/A2A gateways, event bus, context sharing. The plumbing that connects agents to models and each other.
What Building Taught Me: Message + Context + Sub-Kernel
5-provider model routing with circuit breakers, MCP adapter, A2A federation, Rust relay. Key lesson: this layer becomes invisible infrastructure. It's essential but it's not the abstraction your agents should think about.

Reference Pattern: Responsible AI & Governance
Policy frameworks, guardrails, compliance, RBAC, data loss prevention. In many orgs, this is a separate workstream from the technical build.
What Building Taught Me: Ledger + Gate + Sub-Kernel
Governance can't be a separate workstream. It has to be woven through. Input/output guardrails, RBAC, row-level security, rate limiting, scoped agent tokens, policy engine...all integrated from the start. Bolting it on later is significantly harder.

I want to be clear about what "working implementation" means here. These are real, tested, deployed systems...not prototypes. 326K lines of code, 3,162 tests, 200+ API endpoints, 118 database tables with row-level security. But they're also the work of one person building at startup speed. Hardening them for a regulated enterprise environment...the kind of environment where a regulator is going to ask hard questions...is a different phase of work. The patterns are validated. The production-grade, compliance-certified version is what comes next.

What I think the building gives you, even at this stage, is confidence that the patterns are right. These aren't theoretical boxes on a slide. They emerged from real gaps in real software. And if you're designing your own agentic infrastructure, knowing which pieces actually need to exist...and how they depend on each other...might save you time figuring out the same things from scratch.

"The production-grade version is what comes next. But knowing which pieces actually need to exist...and how they depend on each other...that's what building teaches you."

What Seven Doesn't Cover

Here's where the thinking runs out. These are the questions that showed up while building...and don't have answers yet.

If you're serious about a company where the workforce is agents...and I mean really serious, not "agents augment humans" but "agents ARE the workforce"...then seven primitives aren't enough. The seven handle communication, identity, trust, memory, awareness, decisions, and context. That's the operating layer. But there are at least three more problems that are genuinely unsolved.

Treasury

Agents need to acquire resources. Sign up for APIs, purchase tools, allocate budget, manage spending. In a human company, this is payroll plus procurement plus expense management...three separate systems that exist because humans need to get paid and buy things.

For agents, the question is simpler and weirder at the same time. An agent needs an API key for a service. Who pays? What's the spending limit? Can the agent auto-renew? What happens when the bill exceeds the budget? And nobody's figured this out yet...not me, not anyone I know of. Stripe has payments infrastructure for humans buying from humans. There's no Stripe for agents buying from services. This might be the most under-explored primitive in all of AI infrastructure.

Evaluation

Pulse tells you what's happening right now. But who's doing the annual review?

Not per-task quality scores. Longitudinal performance. Is this agent getting better over time? Should it get more autonomy? Should it get "promoted" to handle higher-stakes work? Should it get replaced by a newer model that does the same job at half the cost?

In human companies, performance management is the most broken system in all of corporate life...subjective, political, dreaded by everyone. Maybe agents give us a chance to do it right. Evals that are actually objective. Growth trajectories based on measured capability, not office politics. There's some irony in building better performance reviews for agents than humans ever got.

Legal Identity

Registry knows who an agent is and what it can do. But can an agent enter an agreement? Accept terms of service? Sign a data processing agreement?

Right now, a human clicks "I agree" somewhere upstream, and the agent inherits that authority implicitly. That's fine for internal tooling. It falls apart the moment agents start operating across organizational boundaries. When Agent A from one company needs to interact with Agent B from another...who agreed to what? Who's liable? What jurisdiction applies?

This is where the agent-as-employee metaphor actually breaks down and something genuinely new is needed. The legal system hasn't caught up. Neither has the infrastructure. But the fact that we're asking the question means we're in the right territory.

These three...Treasury, Evaluation, Legal Identity...aren't edge cases. They're load-bearing problems that'll need to be solved before an agent-native company can truly operate independently. They might fold into the existing seven as extensions. They might become new primitives. Building will tell us, the same way building told us about the first seven.

And honestly...this is what makes me think the thinking is on the right track. The questions are getting harder, not easier. That's how you know you're building something real.

The Vision Hasn't Changed

From the beginning, MDx OS was built around one question: what would you build if the workforce was entirely AI? The first answer was an architecture...four layers, fifteen components, Foundation to Interface. That answer was right, and it's still running.

But it was answering at the wrong altitude. It described how agents work internally...model routing, knowledge tiers, guardrails, orchestration patterns. Important plumbing. But plumbing.

The seven primitives answer at the altitude that matters: what does a company of agents need to function as a company? And the answer is surprisingly small. Message. Pages. Registry. Ledger. Context. Gate. Pulse.

The lens keeps zooming out. The original fifteen components become the sub-kernel...invisible infrastructure. The seven primitives become the operating surface. And the things we haven't figured out yet...Treasury, Evaluation, Legal Identity...become the next frontier.

Every round of building reveals the next layer of questions. That's how this has worked from the start. Build something real, discover what's missing, formalize the pattern, build again. The primitives weren't designed on a whiteboard. They were discovered in the code. And the thing that keeps surprising me is how few of them there actually are. Not fifty. Not twenty. Seven. Maybe ten by the time this is done.

The irreducible set is small...and most of it is already running.

More soon.