Summary
A database migration held a metadata lock for longer than anticipated, blocking other connections to the production database and causing a partial outage of the Tight API. The incident began at approximately 2:19 PM ET on May 19, 2026, and ended by 2:32 PM ET, roughly 13 minutes later, when the migration completed without manual intervention.
Contributors
A scheduled schema migration acquired a metadata lock on a production table. The lock was held longer than expected, causing subsequent connections that needed access to the same table to queue and eventually time out. Because the affected table is in the path of common API operations, this surfaced as a partial outage for a subset of users.
Mitigators
Tight's monitoring detected the issue within minutes of impact and alerted the on-call team. The migration then completed on its own and released the metadata lock, after which queued database connections drained and API traffic returned to normal levels. No manual intervention was needed to restore service.
Learnings and risks
We have taken or are taking the following actions to prevent this and similar issues from happening in the future:
Lock-wait timeouts on production migrations: Migrations will be configured with shorter lock-wait timeouts so that any migration unable to acquire its required locks promptly will fail fast rather than block live database traffic.
Migration review for high-traffic tables: Schema changes targeting tables on critical API paths will require additional review and pre-flight validation against production-representative data to identify potential lock contention before deployment. We are also expanding the context provided to our AI-assisted code review tooling with information about our database structure and which tables are high-traffic, so that risky schema changes can be flagged automatically at review time.
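As an illustration of the two actions above, here is a minimal sketch and not our actual migration tooling. It assumes a MySQL-compatible database, where the session variable lock_wait_timeout sets how many seconds a statement waits for a metadata lock, and any Python DB-API 2.0 cursor. The table names, the 5-second default, and the function names are hypothetical.

```python
import re

# Hypothetical list of tables on critical API paths (not taken from this report).
HIGH_TRAFFIC_TABLES = {"accounts", "transactions"}

# Assumed default: seconds a migration may wait for a metadata lock.
DEFAULT_LOCK_WAIT_SECONDS = 5

# Matches the table name in a simple "ALTER TABLE name ..." statement.
_ALTER_TABLE = re.compile(r"^\s*ALTER\s+TABLE\s+`?(\w+)`?", re.IGNORECASE)


def tables_needing_review(statements, high_traffic=HIGH_TRAFFIC_TABLES):
    """Return the high-traffic tables targeted by ALTER TABLE statements."""
    flagged = []
    for sql in statements:
        match = _ALTER_TABLE.match(sql)
        if not match:
            continue
        table = match.group(1).lower()
        if table in high_traffic and table not in flagged:
            flagged.append(table)
    return flagged


def run_migration(cursor, statements, lock_wait_seconds=DEFAULT_LOCK_WAIT_SECONDS):
    """Run DDL statements with a short metadata lock wait.

    `cursor` is any DB-API 2.0 cursor on a MySQL-compatible connection.
    If a statement cannot get its lock within the timeout, the server
    returns an error, the driver raises, and the remaining statements
    are not executed.
    """
    seconds = int(lock_wait_seconds)
    if seconds < 1:
        raise ValueError("lock_wait_seconds must be at least 1")
    # Session scope: only this migration's connection is affected.
    cursor.execute(f"SET SESSION lock_wait_timeout = {seconds}")
    for sql in statements:
        cursor.execute(sql)
```

In this sketch, a review step would call tables_needing_review on the migration's statements and require extra sign-off whenever the result is non-empty, and the deploy step would call run_migration. The timeout bounds how long a migration waits to acquire a lock, not how long it holds one, so it complements pre-flight validation and does not replace it.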