Software's Final Frontier

As an engineer, your production code is your interface to the outside world.

Your impact is measured by how well that code solves someone’s problem, how big that problem is, and how quickly you can adapt as the problem space changes.

Once a business reaches sufficient scale, the information about what’s actually happening in production becomes distributed across teams and people and systems.

I’m not just talking about the “tacit knowledge in an engineer’s head” thing. Yes, that too, but even what lives in systems is fragmented.

That means source code, logs and traces, data, metrics, cloud infra, deployment config, documentation, incident history, email and chat messages, and all the idiosyncratic business context that forms the connective tissue.

This is one of the most important unsolved problems in software.

How did we get here?

For one, making the structured and unstructured data in these disparate systems accessible, keeping it in sync, and knowing where to look for the right thing at the right time is hard.

It’s hard because the surface area is massive and businesses are nuanced. The right place to look and the right thing to do when there’s a production outage depend on a lot of other context.

Historically, the solution has been to defer to the experienced engineer. After all, she has years of on-call scar tissue and probably built half the damn system.

We’re approaching an inflection point.

As approximately all code is produced by agents and the gross volume of software shipped to production explodes, the intimacy between the human engineer and their systems will decrease proportionally.

This is a perfectly good tradeoff, but humans are still liable for the quality and reliability of what’s running in production.

And the more we use agents to ship software, the more we’ll need to depend on agents to manage, support, and maintain that software.

LLMs don’t have the tacit knowledge of a senior engineer embedded in their weights. We’re now forced to solve the problem of writing it down and making it accessible.

Production needs an authoritative source of system and business context. It needs its S***** of R*****.

This problem is now tractable.

Agents are great at retrieving and reasoning over unstructured data. They're also useful for exploring distributed sources of production information and synthesizing a coherent, reusable map of how things work.

We need connections to everything (yes, everything) that exists in systems and tools for ingesting or creating what doesn’t. And this all needs to be securely exposed in a way that’s legible to agents.
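One way to picture that layer is as a single registry of read-only connectors, each carrying a plain-language description and a permission scope. The sketch below is purely illustrative: `ContextSource`, `ContextRegistry`, the scope strings, and the in-memory data are all hypothetical names and placeholders, not any product's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class ContextSource:
    """One connected system (hypothetical shape, for illustration only)."""
    name: str                           # e.g. "incident_history"
    description: str                    # plain-language summary an agent can read
    required_scope: str                 # permission needed to query this source
    fetch: Callable[[str], List[str]]   # read-only lookup: query text -> snippets


@dataclass
class ContextRegistry:
    """Single entry point that lists and queries connected sources."""
    sources: Dict[str, ContextSource] = field(default_factory=dict)

    def register(self, source: ContextSource) -> None:
        self.sources[source.name] = source

    def describe(self, scopes: Set[str]) -> List[dict]:
        """Return a catalog of only the sources the caller is allowed to see."""
        return [
            {"name": s.name, "description": s.description}
            for s in self.sources.values()
            if s.required_scope in scopes
        ]

    def query(self, name: str, text: str, scopes: Set[str]) -> List[str]:
        """Query one source, refusing access outside the caller's scopes."""
        source = self.sources[name]
        if source.required_scope not in scopes:
            raise PermissionError(
                f"scope {source.required_scope!r} required for {name!r}"
            )
        return source.fetch(text)


# Placeholder strings standing in for real systems; not real incidents or config.
INCIDENTS = ["checkout latency spike, rolled back deploy", "search index rebuild stalled"]
DEPLOY_CONFIG = ["checkout-service replicas=3"]


def substring_search(lines: List[str]) -> Callable[[str], List[str]]:
    """Build a trivial case-insensitive substring lookup over in-memory lines."""
    return lambda q: [line for line in lines if q.lower() in line.lower()]


registry = ContextRegistry()
registry.register(ContextSource(
    "incident_history", "Past incidents and how they were resolved",
    "incidents:read", substring_search(INCIDENTS),
))
registry.register(ContextSource(
    "deploy_config", "Current deployment configuration",
    "deploy:read", substring_search(DEPLOY_CONFIG),
))

# An agent holding only the incidents scope sees and queries only that source.
agent_scopes = {"incidents:read"}
print(registry.describe(agent_scopes))
print(registry.query("incident_history", "checkout", agent_scopes))
```

The descriptions are what make the catalog legible to an agent, and the scope check is the smallest possible stand-in for "securely exposed"; a real system would need authentication, auditing, and far richer retrieval than substring matching.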

Current observability tools won’t do this, and we’re already seeing the effects of this missing layer at scale (see: the AWS outage reportedly caused by an agent making an infra change). There will be many more such cases as we ride the exponential of coding agent adoption in the enterprise.

The software engineering loop won’t be closed until this is solved.
