Go-live without the downtime

The go-live is the moment the project either earns its fee or doesn’t. We treat it like a controlled experiment, not a deployment. The discipline is in the rehearsal, not the night-of execution.

What “go-live” means in our vocabulary

We use “go-live” deliberately — not “deploy”, not “release”, not “ship”. A go-live is a coordinated switch from one operating state to another, with a known rollback path & a defined verification window. Deploys happen all the time without coordination. Go-lives are scheduled, planned & observed.

A go-live might be pointing your domain at a new application, migrating customer records into a new CRM, or promoting a new checkout from staging to production for 100% of traffic. Each is a go-live because it’s a state change with consequences if it goes wrong.

The three rules we don’t break

We’ve delivered enough go-lives to have a short list of rules that stay constant across every project. The three rules below are the ones we don’t compromise on.

Rule one — Go live at low traffic for the operator’s actual customer base

Not a blanket “off-hours” rule. Each operator has a different traffic shape. For an AU B2B SaaS we go live at 14:30 Sydney time on a Tuesday: mid-afternoon, mid-week, when customers are at their desks but no critical reports are due. For an e-commerce site we go live at 03:00 Sunday Australia time: the middle of the customer’s night & off-peak globally. For a public-sector operator we go live on Saturday morning, outside the constituent-facing window.

We profile traffic before scheduling. The “off-hours” rule that goes live at 02:00 every time is convenient for the agency & bad for the customer. A 02:00 go-live means any rollback falls to a tired engineer. A daylight go-live means alert humans, fast triage & faster rollback.
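As an illustration of what profiling before scheduling could look like, here is a minimal sketch that ranks weekday-and-hour slots by average request count in the customer's local time. The function name `rank_slots`, the `staffed_hours` filter and the sample figures are all assumptions made for the example, not the tooling described in this playbook.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def rank_slots(samples, local_tz, staffed_hours=range(8, 18)):
    """Rank (weekday, hour) slots by average request count, quietest first.

    samples: iterable of (timezone-aware datetime, request count) pairs.
    local_tz: tzinfo for the customer base being profiled.
    staffed_hours: hypothetical filter that keeps only hours when the
        go-live team is awake and alert.
    """
    totals = defaultdict(lambda: [0, 0])  # slot -> [request sum, sample count]
    for timestamp, requests in samples:
        local = timestamp.astimezone(local_tz)
        if local.hour not in staffed_hours:
            continue
        slot = (DAYS[local.weekday()], local.hour)
        totals[slot][0] += requests
        totals[slot][1] += 1
    averages = {slot: total / count for slot, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Fixed UTC+10 offset as a stand-in for Sydney standard time (no daylight saving).
SYDNEY = timezone(timedelta(hours=10))

# Invented sample data: hourly request counts recorded in UTC.
samples = [
    (datetime(2024, 7, 2, 3, tzinfo=timezone.utc), 50),   # Tuesday 13:00 local
    (datetime(2024, 7, 2, 4, tzinfo=timezone.utc), 20),   # Tuesday 14:00 local
    (datetime(2024, 7, 2, 16, tzinfo=timezone.utc), 1),   # Wednesday 02:00 local, filtered out
    (datetime(2024, 7, 9, 4, tzinfo=timezone.utc), 40),   # Tuesday 14:00 local, a week later
]

print(rank_slots(samples, SYDNEY)[0])  # (('Tuesday', 14), 30.0)
```

The 02:00 sample is the quietest hour in the data, but the staffed-hours filter drops it, which mirrors the preference for a daylight window over a blanket off-hours one. Real profiling would use several weeks of data and a proper time zone database rather than a fixed offset.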

Rule two — Rollback rehearsed on a copy of your live site

Not “we have a rollback plan” — actually executed, end-to-end, against a copy of the live site the day before the go-live. We provision the copy, simulate the go-live state, then practice rolling back. We time it. If rollback takes more than four minutes, the go-live plan gets revised before we proceed. Sometimes that means changing how the switch is staged. Sometimes it means changing how the data moves across.
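The timing part of the rehearsal can be sketched as a small harness that runs each rollback step, records how long it took, and compares the total against the four-minute limit. The step names, the `rehearse_rollback` function and the injectable `clock` parameter are hypothetical; the real steps would be whatever the rollback procedure in the go-live plan lists.

```python
import time

ROLLBACK_BUDGET_SECONDS = 4 * 60  # the four-minute limit from the rule above


def rehearse_rollback(steps, budget_seconds=ROLLBACK_BUDGET_SECONDS, clock=time.monotonic):
    """Run each rollback step in order and time it.

    steps: list of (name, callable) pairs, executed against the copy of the site.
    clock: injectable time source so the harness can be tested without waiting.
    """
    timings = []
    start = clock()
    for name, action in steps:
        step_start = clock()
        action()
        timings.append((name, clock() - step_start))
    total = clock() - start
    return {
        "total_seconds": total,
        "within_budget": total <= budget_seconds,
        "timings": timings,
    }


# Hypothetical placeholder steps; real ones would call the actual rollback tooling.
steps = [
    ("repoint domain to old application", lambda: None),
    ("restore previous configuration", lambda: None),
]

report = rehearse_rollback(steps)
if not report["within_budget"]:
    slowest = max(report["timings"], key=lambda item: item[1])
    print(f"Over budget; slowest step: {slowest[0]}")
```

Recording per-step timings, not just the total, shows which step to restage when the rehearsal runs long.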

The rehearsal catches things you don’t think to plan for. The security certificate that doesn’t update fast enough. The domain setting that takes too long to refresh. The old connection that doesn’t close cleanly on the rollback. We’ve seen each of these break a “we’ll just roll back” plan in a real go-live where they hadn’t been rehearsed.

Rule three — Phone-call escalation path active during the go-live window

Not Slack. Phone. The named operator running the go-live answers within 30 seconds. If they don’t, the named backup does. Both numbers are in the project agreement; both phones are on & unsilenced for the go-live window.

Why phone over Slack? Because go-live decisions are time-critical & need a live answer. “Should we proceed?” needs an answer in seconds, not in the time it takes someone to switch from email to Slack to read the message. Phone is faster than every typed channel.

The go-live document

For every go-live we write a go-live document. Two pages, structured the same way each time.

Page one — Plan

Go-live window with timezone. Pre-flight checklist. Step-by-step sequence with the exact actions to take. Verification steps with explicit pass/fail criteria. Rollback trigger criteria (what conditions abort the go-live). Rollback procedure step-by-step.

Page two — Roles & channels

Named operator running the go-live. Named backup. Operator-side approval gate (one human on the operator team who has to give a “go” before the go-live proceeds). Phone numbers for both sides. The Slack channel for non-urgent updates during the window. The post-go-live check plan with specific URLs & expected responses.

The document goes to the operator 48 hours before the go-live window for sign-off. Any changes flagged in sign-off get re-rehearsed against the copy.
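The post-go-live check plan on page two (specific URLs & expected responses) lends itself to a small script. The sketch below is one possible shape, assuming each check is a URL, an expected HTTP status and an optional text fragment expected in the body. The URLs, the `CHECKS` list and the function names are placeholders, not real endpoints.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return (status code, body text); status is None if the request never completed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        return error.code, ""  # the server answered, but with an error status
    except OSError:
        return None, ""  # DNS failure, refused connection, timeout and similar


def run_checks(checks, fetch=fetch_status):
    """Evaluate each check and return a list of (url, passed) pairs."""
    results = []
    for check in checks:
        status, body = fetch(check["url"])
        passed = status == check["expect_status"] and check.get("expect_text", "") in body
        results.append((check["url"], passed))
    return results


# Hypothetical check plan; the real URLs come from the go-live document.
CHECKS = [
    {"url": "https://example.com/", "expect_status": 200, "expect_text": "Welcome"},
    {"url": "https://example.com/checkout", "expect_status": 200},
]
```

Calling `run_checks(CHECKS)` during the verification window gives an explicit pass or fail per URL. The `fetch` parameter is there so the check logic can be exercised against canned responses during the rehearsal.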

What’s not negotiable

A few constraints we don’t relax for any reason.

We don’t go live without rehearsal. If the operator wants to skip rehearsal because “we’re under time pressure”, we postpone the go-live. The pressure doesn’t change the math — an unrehearsed go-live that fails costs more than the delay would have.

We don’t go live without an operator-side approval gate. Even the named operator on our side shouldn’t unilaterally trigger a state change in the operator’s production environment. There’s always a human on the operator’s team who says “go” before we proceed.

We don’t change the go-live window in the last 24 hours. If something comes up that would change the window, we postpone — we don’t compress.

After the go-live

Forty-eight hours of heightened observation. The named operator stays close to the channels & dashboards. We monitor the project metric against the starting number we set for it, plus the system-level signals (error rates, response times & backlog, depending on the project).

If the metric moves the way we predicted in the go-live plan, we close the go-live. If it doesn’t, the rollback gets considered against a documented decision tree — sometimes the right answer is to hold the new state & triage forward, sometimes the right answer is to roll back & re-run. Either way, the decision is documented.
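To make the shape of such a decision tree concrete, here is a deliberately simple sketch. The thresholds (`max_error_rate`, `min_share_of_prediction`) and the three outcomes are illustrative assumptions; a real tree would be written per project & agreed in the go-live document.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    metric_change: float     # observed change in the project metric since go-live
    predicted_change: float  # change predicted in the go-live plan
    error_rate: float        # fraction of requests failing, from 0.0 to 1.0


def decide(obs, max_error_rate=0.02, min_share_of_prediction=0.5):
    """Return (decision, reason) so the outcome can be written into the record."""
    # Hypothetical rule: an unhealthy system outweighs any metric movement.
    if obs.error_rate > max_error_rate:
        return "roll back", f"error rate {obs.error_rate:.1%} exceeds {max_error_rate:.1%}"
    same_direction = obs.metric_change * obs.predicted_change > 0
    large_enough = abs(obs.metric_change) >= min_share_of_prediction * abs(obs.predicted_change)
    if same_direction and large_enough:
        return "close", "metric moved as predicted"
    return "hold and triage", "system healthy but metric has not moved as predicted"


decision, reason = decide(Observation(metric_change=0.08, predicted_change=0.10, error_rate=0.001))
print(decision, "-", reason)  # close - metric moved as predicted
```

Returning the reason alongside the decision keeps the documentation step from depending on anyone's memory of the night.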

The takeaway

The go-live playbook is unglamorous because the discipline is in the rehearsal & the documentation, not in the heroics on the night. We’ve never had to do a heroics-on-the-night go-live, because the rehearsal catches what would have been heroics. That’s the whole point.
