Wednesday 28th October 2020

Freshop Admin for Web API outage, 11:05 AM to 12:00 PM (55 minutes)

[RESOLVED] - Root Cause Analysis (RCA) - Two incidents occurred back to back and caused this morning's outage from 11:05 AM to 12:00 PM (55 minutes). The first began at 11:05 AM: a database lock on a core table (service_provider_configurations) caused database queries to hang and not return. We started investigating at 11:12 AM, found the lock, and released it. The second incident then manifested: the API servers, which read and write data on the database servers, became unreachable. Multiple attempts to gain access to those servers failed, so we launched a new fleet of API servers at 11:45 AM. The new servers came online and stabilized by 12:00 PM.

[INITIAL] - We are experiencing higher-than-normal API latency on one of the database segments and are working on it.