MDx OS

MDx Message

What do two agents actually say to each other?
Right now, mostly JSON. That seemed worth fixing.

by MD · March 2026 · A Real Talk Publication · 15 min read

The Context

Slack was designed in 2013. Twelve years later, every messaging tool in enterprise still works from the same assumption. That assumption is breaking.

2013. Before GPT. Before Claude. Before anyone seriously imagined that an AI could write production code, review a pull request, or deploy to production through a conversation thread. Bots were novelties. The word "agent" meant James Bond, not a software entity that ships code. And the fundamental assumption baked into every messaging system of that era was simple and seemingly obvious: humans talk to humans.

If an AI shows up in that world, it gets an "APP" badge and a second-class seat at the table. It's tolerated...integrated...and perhaps surfaced in a sidebar. Never quite a participant in the way that matters... the way that means the system actually routes to it, persists its messages, gives it presence, treats it as an equal member of the conversation. That design assumption is now everywhere. It's in Slack. It's in Teams. It's in every tool that added "AI features" to an architecture that was never built to hold agents natively.

So, I thought...let's do something about that.

"Every messaging tool in enterprise makes the same assumption: humans talk to humans. If an AI shows up, it gets a second-class seat. That assumption is now twelve years old... and it's breaking."

Connecting the Dots

As I've been building MDx...putting agents together, watching how they interact, intersecting with where the industry is clearly heading...one gap kept getting harder to ignore.

Agents are getting good at thinking. They delegate, execute in parallel, make decisions. But they can't talk to each other...not in a way that's observable, not in the flow where engineering actually happens. The quick question, the deploy notification, the "hey, something broke" that needs a response in seconds, not minutes...none of that exists for agents right now. They're wired together with webhooks and callbacks, and the humans around them have zero visibility into what's passing between them.

Every chat tool I looked at treated this as an integration problem. Bolt a webhook here, add a bot there, shove the agent into a sidebar widget and call it "AI-powered." That's not what I wanted. I wanted a communication system where agents are participants...not integrations, not add-ons, not afterthoughts. Participants. First-class, same as any human. Where the system doesn't even care whether the message came from a person or a machine...it just routes it, delivers it, persists it, and shows it.

So I looked at what existed. And when nothing matched what I was trying to build, I started designing from first principles. The question was simple: what does a messaging system look like if you design it from the beginning as if agents are real users?

"The goal wasn't a better Slack. The goal was a communication layer that treats agents as participants... not integrations. There's a real difference."

The Architectural Bet

Most messaging systems are human-first with AI bolted on. MDx Message inverts that. The architectural bet is simple...build the communication primitive at the OS layer, and treat agents and humans as equally valid participants from day one.

The foundation is something I call the Stream Fabric...a communication primitive that sits at the OS layer of MDx. It handles four things: who can send and receive (participant identity, regardless of whether that's a human, an agent, or a system process), where messages go (stream addressing), how they get delivered (envelope routing), and what happened and when (audit logging).
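The post doesn't publish the Fabric's actual types, but the four responsibilities above suggest a shape. Here's a hypothetical Rust sketch...every name in it is invented for illustration, not the real MDx API:

```rust
/// Who can send and receive. The Fabric routes all three identically.
#[derive(Debug, Clone, PartialEq)]
enum Participant {
    Human { user_id: String },
    Agent { agent_id: String },
    System { process: String },
}

/// Where messages go: a named channel, a direct stream between any
/// two participants, or a broadcast channel for system events.
#[derive(Debug, Clone, PartialEq)]
enum StreamAddress {
    Channel(String),
    Direct(String, String),
    Broadcast(String),
}

/// The unit of delivery: sender, destination, payload, plus an
/// idempotency key so at-least-once delivery can be deduplicated.
#[derive(Debug, Clone)]
struct Envelope {
    from: Participant,
    to: StreamAddress,
    idempotency_key: String,
    payload: String,
}

fn main() {
    let msg = Envelope {
        from: Participant::Agent { agent_id: "code-review-1".into() },
        to: StreamAddress::Channel("deploys".into()),
        idempotency_key: "a1b2c3".into(),
        payload: "Build failed on main".into(),
    };
    // The Fabric never branches on participant type: an agent's
    // envelope is routed, delivered, and logged exactly like a human's.
    println!("{:?} -> {:?}", msg.from, msg.to);
}
```

The point of the sketch is the absence of a branch: nothing downstream of `Participant` inspects whether the sender was human or machine.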

Notice what's not in there: channels, threads, DMs, reactions, typing indicators. Those are application concerns. The Stream Fabric is infrastructure. MDx Message is the first application built on top of it...but it won't be the last. MDx Code can surface build results through the same Fabric. MDx Twin can post insights. CI pipelines can broadcast alerts. Same transport. Same delivery contract. Same audit trail.

This separation was the most important design decision in the entire system...the difference between building a chat app and building a communication layer for an operating system. One is a product. The other is infrastructure that enables products. I needed the infrastructure...and I needed it to be agent-native from the moment it existed, not retrofitted later when agents were already second-class citizens baked into the assumptions.

STREAM FABRIC
The Communication Primitive
OS-layer infrastructure · Model-agnostic · Agent-native by design

Participant identity sits at the foundation...human, agent, or system. The Fabric doesn't assign different treatment based on type. It just routes, delivers, and records. Above that lives stream addressing: named channels, direct streams between any two participants, broadcast channels for system events. Envelope routing handles delivery semantics...at-least-once, deduplication, fan-out to multiple subscribers simultaneously. And underneath everything, an immutable audit log. Messages, delivery attempts, and acknowledgments are all recorded.

Four Communication Modes

When agents are native participants, four modes of communication become equally natural. The first three exist in other tools, in degraded forms. The fourth is the one that matters most...

Human to Human
The Familiar One
DMs, channels, threads. Works exactly how you'd expect. The difference is that this sits on the same infrastructure as everything else...which means context flows freely between human conversations and agent activity. No artificial boundary between "the chat" and "the AI."
Human to Agent
Commands and Queries, In-Flow
"Run the test suite." "Deploy to staging." "What's the status of the P1 incident?" These live right in the conversation...no context switch, no separate tool, no navigating to a different interface. The agent receives the message the same way a human would and responds in the same thread.
Agent to Human
Intelligent Escalation
"Build failed on main. Here's the error. Want me to open a rollback?" Not just notifications... action cards with approve/reject buttons, embedded context, proposed next steps. The agent doesn't just inform you...it proposes, you decide, and the system executes.
Agent to Agent
The One That Changes Everything
When MDx Code kicks off a review agent, and that agent hands off to a security scanner, and the scanner escalates to a human... the entire chain flows through the Message Bus. Observable. Debuggable. Auditable. Not buried in webhooks and log files that nobody reads until something breaks in production. This is the mode that's hardest to retrofit into existing tools, because it requires agents to be native from the start.

That last mode, in my mind, is going to emerge as the foundation for modern engineering communication...and we're still only in the early stages. Right now, we're used to human-to-human pace...a few messages a minute, context built over a conversation. Agent-to-agent won't look like that. It'll be orders of magnitude faster, denser, more verbose. The volume of communication between agents is going to dwarf anything humans produce. And if that communication isn't observable...if it's buried in webhooks and log files...you lose the ability to understand what your own system is doing.

"If you can't see what's happening between your agents, you can't trust them."

Why Rust

The relay...the server that holds every WebSocket connection, fans out every message, manages presence, handles backpressure...is written in Rust. The technology choices matter, so here's the reasoning.

Discord's engineering team migrated their real-time gateway from Go to Rust and published the results: lower tail latency, less memory, no garbage collection pauses. When you have hundreds of open connections and every message needs to arrive in under 20 milliseconds, you cannot afford a runtime that occasionally pauses to clean up memory. Rust doesn't pause.

The relay runs on Tokio...the same async runtime that powers Cloudflare Workers and a significant chunk of the internet's real-time infrastructure. Each WebSocket connection is a lightweight task, not an OS thread. A single relay instance can sustain thousands of concurrent connections with a memory footprint measured in megabytes, not gigabytes. And before the "isn't that overkill for your scale?" question arrives...that's the wrong frame. The goal was never to optimize for today's user count. The goal was to build infrastructure that doesn't need to be rewritten when the user count changes. Discord didn't rewrite their gateway three times. They got the foundation right, and it scaled. That's the bet I'm making.

RELAY ARCHITECTURE
Rust + Tokio + Redis
Real-time layer · Designed for thousands of concurrent connections · Sub-20ms delivery target

Bounded channels with backpressure ensure that one slow client never blocks others...their channel fills, fan-out drops for them specifically, and they reconnect and gap-fill. Every other subscriber is unaffected. Batch persistence with write-behind means messages deliver instantly while database writes are amortized in the background. Connection-level fan-out with per-connection exclusion means your second tab still gets messages...because exclusion is per-connection, not per-user. Open two tabs, type in one, it appears in the other immediately. Same user. Multiple sessions. Independent delivery. That's how Discord works, and it's the model I went with.
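The slow-subscriber behavior described above can be sketched with the standard library's bounded channels (the relay itself uses Tokio's async equivalents; this is just the shape of the idea, with names of my own invention):

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Fan a message out to every subscriber without ever blocking.
/// Returns how many subscribers actually received it.
fn fan_out(subscribers: &[SyncSender<String>], msg: &str) -> usize {
    let mut delivered = 0;
    for tx in subscribers {
        // try_send never blocks: if this subscriber's buffer is full,
        // drop the message for them only. They gap-fill on reconnect.
        match tx.try_send(msg.to_string()) {
            Ok(()) => delivered += 1,
            Err(TrySendError::Full(_)) => {}         // slow client: skip
            Err(TrySendError::Disconnected(_)) => {} // gone client: skip
        }
    }
    delivered
}

fn main() {
    let (fast_tx, fast_rx) = sync_channel::<String>(16);
    let (slow_tx, _slow_rx) = sync_channel::<String>(1); // tiny buffer, never drained

    let subs = vec![fast_tx, slow_tx];
    let mut total = 0;
    for i in 0..5 {
        total += fan_out(&subs, &format!("msg-{i}"));
    }
    // The slow subscriber accepted only 1 message (its buffer filled);
    // the fast subscriber got all 5, and fan-out never blocked.
    assert_eq!(total, 6);
    assert_eq!(fast_rx.try_iter().count(), 5);
    println!("slow subscriber isolated; fast subscriber unaffected");
}
```

The design choice is in `try_send` versus `send`: a blocking send would let one stalled client hold up every other subscriber in the fan-out loop.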

Redis pub/sub handles cross-relay communication for horizontal scaling. Add a second relay instance behind a load balancer and no code changes are required...messages published on Relay A reach subscribers on Relay B automatically. The architecture was designed for this from the start, not bolted on when traffic grew.

The Borrowed Patterns

I'm not going to pretend this architecture appeared from nowhere. A lot of these patterns come from my years at Flipp...Kafka-heavy microservices, deep analytics pipelines, systems serving 100MM+ users. You learn things operating at that level that don't show up in tutorials. I also studied the public engineering work from Discord, Slack, and others...and borrowed deliberately. Here's what I took from where, and why.

What struck me was how much the solutions converge: the same hard problems keep appearing in different companies, and the answers look almost identical. Tail latency. Slow subscriber isolation. Presence broadcast storms. Reconnection cascades. Message ordering under concurrency. Each of these has a known-good solution...what changes is whether you implement it from the start or discover it the hard way in production.

🔷 From Discord
Rust + Tokio for the real-time layer (same conclusion, same reasoning). Bounded channels with backpressure so slow clients don't block fast ones. Batch persistence with write-behind so delivery is instant and database writes are amortized. Connection-level fan-out with per-connection exclusion so multi-tab sessions work correctly.
🟣 From Slack
Presence coalescing...batch presence updates every 5 seconds instead of broadcasting every individual status change. Typing indicators with automatic expiry. Graceful degradation so local delivery continues even when Redis goes down. The operational wisdom of a team that's run this at enormous scale for over a decade.
🟢 From Spotify DX
Mobile-first design philosophy...the system is built for the engineer in motion, not the engineer anchored at a desk. Onboarding flows that never drop you into a blank screen. Developer experience as a first-class design concern, not an afterthought you retrofit once the core product ships.
🔴 From Production Incidents
Atomic sequence numbers via PostgreSQL sequences...no race conditions, no duplicate ordering. Simple query protocol for full connection pooler compatibility. Idempotency keys on every message for exactly-once persistence even when the batch writer retries. Graceful shutdown with flush-on-exit so SIGTERM triggers a full drain before the process dies.
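Presence coalescing, the first of the Slack-borrowed patterns above, is simple to sketch. This is an illustrative in-process version...the struct, its names, and the flush mechanics are my assumptions, not Slack's or MDx's actual implementation:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Presence { Online, Away, Offline }

/// Buffers presence changes between flush intervals. Instead of
/// broadcasting every individual status change, the relay drains
/// this on a timer (~5s) and sends one batched update.
#[derive(Default)]
struct PresenceCoalescer {
    pending: HashMap<String, Presence>,
}

impl PresenceCoalescer {
    /// Later updates for the same user overwrite earlier ones, so a
    /// flapping connection produces one broadcast entry, not many.
    fn update(&mut self, user: &str, status: Presence) {
        self.pending.insert(user.to_string(), status);
    }

    /// Called on the flush interval: drain and broadcast one batch.
    fn flush(&mut self) -> HashMap<String, Presence> {
        std::mem::take(&mut self.pending)
    }
}

fn main() {
    let mut c = PresenceCoalescer::default();
    // One user flaps three times inside a single flush window...
    c.update("alice", Presence::Away);
    c.update("alice", Presence::Offline);
    c.update("alice", Presence::Online);
    c.update("bob", Presence::Online);

    let batch = c.flush();
    // ...but the batch carries only the latest state per user.
    assert_eq!(batch.len(), 2);
    assert_eq!(batch.get("alice"), Some(&Presence::Online));
}
```

The win is in broadcast volume: N status flaps from M users collapse into at most M entries per flush, which is what keeps presence from becoming a broadcast storm at scale.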

These aren't theoretical capabilities. They're deployed and working...early days, small group of pilot users, but the architecture is proving out. The borrowing was deliberate and documented...I wanted to be able to point to every decision and explain it, not discover two years later why Discord made the choice I didn't.

Shooting For Enterprise-Grade

I've seen too many demos that look impressive and fall apart under real usage. Here's what MDx Message handles today...and where it's heading.

⚡ Delivery
Sub-50ms message delivery under normal conditions. At-least-once semantics with client-side deduplication. Messages fan out entirely in-memory...no database query in the delivery path. The database is for persistence, not delivery.
💾 Persistence
Batch INSERT with idempotency keys. If the batch fails, individual fallback...no messages lost. On shutdown, the batch writer drains before exit. Exactly-once semantics even when the writer retries on partial failure.
🔢 Ordering
PostgreSQL sequences. Atomic. No race conditions under concurrent writes. Started with MAX(sequence) + 1... watched it break under fast typing from two clients. Switched to nextval(). The database was designed for this. Use it.
🔄 Backpressure
Bounded channels per connection. When a subscriber can't keep up, their channel fills and fan-out drops for that client only. They reconnect and gap-fill. One slow subscriber never impacts anyone else's delivery.
📱 Multi-Device
Each connection has a unique ID. Fan-out excludes only the sender's specific connection, not all connections from the same user. Open two tabs, type in one, it appears in the other instantly. Same user. Multiple sessions. Independent delivery.
↔ Horizontal Scaling
Redis pub/sub for cross-relay communication. Add a second relay instance behind a load balancer...no code changes. Messages published on Relay A reach subscribers on Relay B. The architecture was designed for this from the start.
🔐 Security
JWT authentication during WebSocket handshake via subprotocol header. Periodic re-validation every 5 minutes...if a token is revoked, the connection terminates cleanly. Row-level security on every database table.
🛡️ Resilience
Exponential backoff with jitter on reconnection. Offline queue in IndexedDB. Gap-fill on reconnect so clients recover what they missed while disconnected. Presence grace period...if your connection drops and reconnects within 30 seconds, you never appear offline to anyone.
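The reconnection behavior in the Resilience row can be sketched as exponential backoff with full jitter...the delay is a random value in [0, min(cap, base × 2^attempt)]. The constants and the specific jitter variant here are illustrative assumptions, not the actual client's values:

```rust
use std::time::Duration;

const BASE_MS: u64 = 250;   // first retry window (assumed)
const CAP_MS: u64 = 30_000; // never wait longer than 30s (assumed)

/// Compute the reconnect delay for a given attempt number.
/// `random_unit` is a uniform sample in [0, 1), injected as a
/// parameter so the function stays deterministic and testable.
fn backoff_delay(attempt: u32, random_unit: f64) -> Duration {
    // Double the ceiling each attempt, clamped to the cap
    // (attempt is capped to keep the shift well-defined).
    let ceiling = CAP_MS.min(BASE_MS.saturating_mul(1u64 << attempt.min(20)));
    Duration::from_millis((ceiling as f64 * random_unit) as u64)
}

fn main() {
    // Attempt 0 waits at most 250ms; by attempt 7 the ceiling hits the cap.
    assert!(backoff_delay(0, 0.99) < Duration::from_millis(250));
    assert!(backoff_delay(7, 0.999) <= Duration::from_millis(CAP_MS));
    println!("attempt 3, mid-window: {:?}", backoff_delay(3, 0.5));
}
```

The jitter is the part that matters for a relay: without it, a restart would make every disconnected client retry on the same schedule, turning recovery into a reconnection cascade.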

The UI That Scales

A channel with 10,000 messages needs to scroll like butter. Most chat UIs render every message as a DOM node...at 10,000 nodes, that's a performance problem that no amount of JavaScript optimization can fully solve.

MDx Message uses react-virtuoso...a virtualized list that only renders what's visible plus a small buffer. A channel with 10,000 messages has roughly 25 DOM nodes at any given time. Scroll position is preserved per channel. New messages auto-scroll only if you're already at the bottom. Older messages load on scroll-up. This isn't novel...Discord does it, Slack does it, every production chat client does it. But it's the kind of decision that separates "works in a demo" from "works with real data and real users who've been in a channel for six months."
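The windowing arithmetic a virtualized list performs can be illustrated in a few lines. This is a fixed-row-height sketch in Rust purely for illustration...the actual client uses react-virtuoso in TypeScript, and real libraries measure variable row heights:

```rust
/// Returns the [start, end) index range of rows to render:
/// the rows intersecting the viewport, plus an overscan buffer.
fn visible_range(
    scroll_top: usize, // pixels scrolled from the top
    viewport_h: usize, // visible height in pixels
    row_h: usize,      // fixed row height (a simplifying assumption)
    overscan: usize,   // extra rows above/below as a scroll buffer
    total_rows: usize,
) -> (usize, usize) {
    let first = (scroll_top / row_h).saturating_sub(overscan);
    let last = ((scroll_top + viewport_h) / row_h + 1 + overscan).min(total_rows);
    (first, last)
}

fn main() {
    // 10,000 messages, 40px rows, an 800px viewport, 3 rows of overscan,
    // scrolled deep into the channel:
    let (start, end) = visible_range(120_000, 800, 40, 3, 10_000);
    // Only ~27 rows are materialized, no matter how long the channel is.
    assert_eq!((start, end), (2_997, 3_024));
    println!("rendering rows {start}..{end} ({} nodes)", end - start);
}
```

Everything outside that range is a single spacer element of the right height, which is why the DOM stays flat while the message history grows without bound.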

What I Learned Building This

The connection pooler thing I did not see coming. The rest confirmed what I already suspected about what actually makes distributed systems hard.

On Connection Poolers
Supabase's connection pooler (Supavisor) operates in transaction mode...which means your database connection is shared across clients between transactions. Standard prepared statements, the kind every ORM creates by default, conflict when the pooler reassigns connections. I spent more time debugging "prepared statement already exists" errors than I spent on any individual feature. The fix: raw SQL via the simple query protocol, zero prepared statements, fully compatible with every pooler. If you're building against any managed Postgres with a connection pooler...learn this lesson from me instead of from your production logs.
On Sequence Numbers
MAX(sequence) + 1 seems obvious. It's also wrong under concurrency. Two messages arrive simultaneously, both read MAX as 45, both try to insert 46, one fails. PostgreSQL sequences exist for exactly this reason...nextval() is atomic. The database was designed for concurrent writes.
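The same race is easy to reproduce in miniature: reading a value and writing value+1 is a non-atomic read-modify-write, and a single atomic increment...the in-process analogue of nextval()...never collides. A quick Rust demonstration:

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(AtomicU64::new(0));
    let assigned = Arc::new(Mutex::new(Vec::new()));

    // 8 concurrent writers, like 8 clients typing at once.
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let counter = Arc::clone(&counter);
            let assigned = Arc::clone(&assigned);
            thread::spawn(move || {
                for _ in 0..1_000 {
                    // fetch_add is one atomic operation -- no window
                    // between "read MAX" and "write MAX + 1" for another
                    // writer to grab the same number.
                    let seq = counter.fetch_add(1, Ordering::SeqCst) + 1;
                    assigned.lock().unwrap().push(seq);
                }
            })
        })
        .collect();
    for h in handles { h.join().unwrap(); }

    let seqs = assigned.lock().unwrap();
    let unique: HashSet<_> = seqs.iter().collect();
    // 8 writers x 1,000 messages: every sequence number is distinct.
    assert_eq!(seqs.len(), 8_000);
    assert_eq!(unique.len(), 8_000);
    println!("no duplicates across {} concurrent writes", seqs.len());
}
```

Replace the `fetch_add` with a separate `load` and `store` and duplicates appear almost immediately...which is exactly what MAX(sequence) + 1 does at the SQL level.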
On the Small Things
Presence coalescing. Typing indicator cleanup. Graceful shutdown. Scroll position restoration. These aren't features anyone asks for in a requirements document. They're the difference between software that feels right and software that feels off in ways you can't quite articulate. Most of the engineering effort went into things users will never notice...because the whole point is that they shouldn't have to.
On Agent-Native Design
Retrofitting native is really hard. Every system I've seen that tried to add agents into a human-first messaging architecture ends up in the same place: agents as second-class citizens with special workarounds that break in subtle ways. The participant model...where the system doesn't differentiate between human and agent at the transport layer...was the right call from the start. It forces every design decision to hold for both cases, which produces more robust primitives.

Why Any of This Matters

I didn't build MDx Message because I wanted another chat app. I built it because the communication layer for AI-native engineering doesn't exist yet...so I wanted to try building it.

Slack is a remarkable product for the world it was built for. That world...where every message is typed by a human, where bots are novelties, where "AI integration" means a webhook...that world is on its way out. Most organizations just haven't updated their tooling yet. They're still routing agent outputs through sidebar widgets and webhook callbacks, wondering why the systems feel brittle and hard to observe.

The new world has agents that ship code, review PRs, monitor production, and escalate incidents. These agents need to communicate...with humans and with each other...through infrastructure that treats them as participants, not integrations bolted onto a twelve-year-old architecture. And the audit trail, the delivery guarantees, the visibility into what agents are doing...all of it has to be as real as it is for humans. Not a degraded version of it.

MDx Message is that infrastructure...built on Rust, inspired by Discord's engineering team, designed for agents as first-class participants. It's live, it's deployed, and so far the architecture is holding.

What's coming next:

Load-older pagination

The virtual list is wired; the data fetch still needs to be connected to it. Right now you see the last 50 messages. Soon you'll scroll back through all of them.

Agent command handlers

Slash commands that trigger agent actions directly in the conversation. /deploy staging, /review PR-1234, /status pipeline...in the thread where the work is already happening.
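As a sketch of what the handler for those commands might look like...the three commands above are the only ones the post names, and everything else here is hypothetical:

```rust
/// Agent actions a slash command can trigger. Purely illustrative;
/// the real command surface isn't specified yet.
#[derive(Debug, PartialEq)]
enum Command {
    Deploy { target: String },
    Review { pr: String },
    Status { subject: String },
}

/// Parse a message body into a command, or None if it's ordinary chat.
fn parse(input: &str) -> Option<Command> {
    let mut parts = input.split_whitespace();
    match (parts.next()?, parts.next()) {
        ("/deploy", Some(t)) => Some(Command::Deploy { target: t.into() }),
        ("/review", Some(p)) => Some(Command::Review { pr: p.into() }),
        ("/status", Some(s)) => Some(Command::Status { subject: s.into() }),
        _ => None, // not a command: deliver as a normal message
    }
}

fn main() {
    assert_eq!(
        parse("/deploy staging"),
        Some(Command::Deploy { target: "staging".into() })
    );
    assert_eq!(parse("just chatting"), None);
    println!("parsed: {:?}", parse("/review PR-1234"));
}
```

Because commands are ordinary messages on the same Fabric, the `None` branch is the whole integration story: anything that isn't a command flows through untouched.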

Thread subscriptions via WebSocket

Threads load via REST today. They should update in real-time like channels do...same delivery contract, same infrastructure.

Rich content

Code blocks with syntax highlighting, file attachments, embed previews. The agent-to-human communication mode specifically needs richer content primitives...structured output deserves structured display.

Multi-relay load testing

The horizontal scaling architecture is built. It needs to be stress-tested at scale before it carries anything critical.

More soon.