Lightstep UI and Data Ingest experienced an outage triggered by a planned database change on July 21st. The database change caused a service failure, which led to cascading failures in dependent services. The failed service was restored in a degraded mode, which recovered the Lightstep UI. A subsequent rollback of the database change restored all systems and services.
12:22 PM: A database change was run, causing a service to become unavailable.
12:25 PM: Cascading failures impact Lightstep UI and ingestion. Ingested data loss begins.
12:35 PM: Incident declared on the status page.
12:49 PM: Root cause identified, service brought back in a degraded mode, recovery begins.
1:01 PM: Several systems stabilize, Lightstep UI recovers.
1:17 PM: Database change rolled back. Remaining affected services begin recovering.
1:30 PM: Ingestion recovers. Incident resolved, status page updated to “mitigated and monitoring”.
2:02 PM: Status page updated to fully operational.