Application Outage due to database lag Saturday 31st December 2022 00:05:33


We have identified an issue with the way our database server takes backups, which is causing connections to the database to hang. During these periods the website will be unresponsive.

Service is fully operational.

A fix has now been rolled out to move the intensive read/write operations off the server hosting our primary database and onto some of our other infrastructure. We're confident this should resolve the poor performance that has been observed, and we will be monitoring over the coming 24 hours.

Apologies for the late update; we incorrectly posted the last comment on a previous incident. We will make one final attempt tonight to restore full performance to the environment. Should this not work, we will roll back the production-impacting changes and get some sleep.

We are actively trying to push out a new infrastructure component to take the load off our primary database node. However, data replication has been failing, and we're working to identify the root cause of this new issue.

We are currently working to deploy a database replica on a new physical host, which will allow us to move the workload causing the heavy I/O usage onto that node and away from our current primary database node. Once complete, we should start to see the site stabilize.