Manifesto

Production incidents are weird.

They never happen at a good time.

They never come with enough context.

And somehow the fix is never just “restart the pod.”

Real incidents live inside messy, living systems: feature flags, deploys, dashboards, tribal knowledge, runbooks, half-forgotten Slack threads, weird edge cases, unwritten rules, access controls, and that one engineer who “just knows” how this service behaves under load.

That's the problem.

The tools around on-call were built as if the work were clean and linear.
It isn't.

Fixing production issues is not just about detecting an alert or summarizing logs.
It is about navigating the organization around the incident.

You need to know:

  • what changed
  • what can be rolled back
  • which flag should be disabled
  • what “normal” looked like before the deploy
  • who owns this system
  • who needs to approve a risky action
  • which runbook is current vs. outdated
  • what happened the last three times this broke
  • who to ask when the agent gets stuck
  • how this team actually debugs in practice, not in theory

That is why Contain exists.

We do not believe the future of incident response is another separate platform your team has to learn, maintain, and babysit.

We believe the right teammate should work where the team already works.

In Slack.
Inside the real workflows.
With the real context.
Connected to the real tools.
Aware of the real people.

Contain is not trying to replace your team with a magic black box.
It is trying to become the teammate that joins the incident, understands the environment, takes safe action, asks smart questions, and gets better every time.

Because the real moat in on-call is not just automation.
It is organizational memory.

Every incident teaches a company something: which fixes are safe, which dashboards matter, which services are brittle, which approvals are needed, which people are helpful, which runbooks can be trusted, and which “temporary” workarounds are actually permanent infrastructure.

Most of that learning disappears.
Contain is built to keep it.

As it helps resolve more incidents, it should become more useful, more context-aware, and more aligned with how your organization actually operates. Not some generic best-practice company. Your company.

That means:

  • Contain should know the shape of your stack.
  • Contain should understand pre- and post-deploy monitoring.
  • Contain should help with rollbacks, feature flags, and controlled actions.
  • Contain should respect permissions and access boundaries.
  • Contain should know when to act, when to ask, and when to escalate.
  • Contain should learn from every incident without turning your team into prompt engineers.

We are building for the reality that production is socio-technical.

The fix is rarely just technical.
It is technical + organizational + historical.

And that is exactly why this can work.

Yes, the infra is hard.
Yes, the integrations are deep.
Yes, access control matters.
Yes, trust has to be earned.

But that is also why this matters.

If you can build a system that understands not just the codebase, but the company around the codebase, you do not get a toy.
You get leverage.

And if that system can safely reduce MTTR, prevent repeat incidents, preserve on-call knowledge, and make every engineer less alone at 2:13 AM, that is not just another devtool.

That is a real teammate.
That is Contain.