Real-time service status from across the ATLAS Media Group portfolio
No incidents reported
No incidents reported
We will be performing essential maintenance at our Redditch Data Center location, which will result in an outage of MastodonAppUK and Universeodon. We expect the outage to last from 13:35 UTC to 14:35 UTC. This notice will be updated as appropriate.
All services now restored.
We're running into a number of issues getting our storage server and our second app server to reconnect to the network. Until both are fully online we won't be able to restore service. We're working as fast as we can to bring everything back online. Apologies for the delay.
The changes should now be complete. We're working to get the servers reconnected and operational, as the second server is having some network connectivity issues. Once this is resolved the site should be restored. We're working as fast as we can to get everything operational.
Getting the new gateway operational is taking a bit longer than we expected. We're now hoping to get everything restored by 15:00 UTC. Apologies for the delay.
The upgrades to our storage server are complete and it is fully operational again. We're monitoring it to ensure it remains stable, and will shortly be powering down our gateway to replace it with the new one.
We also took the opportunity to complete the cabling adjustments while the server was upgrading, so once the gateway swap is complete we hope to have all systems operational with relative ease.
We're currently performing a software update on our existing gateway device to ensure both the old and new devices are running the latest firmware and that there are no issues with the swap.
Maintenance is now under way and we've shut down our application servers.
Our first step is to apply the relevant updates to the underlying storage server. We can't easily do this while the applications are running without risking issues with the VMs' access to that storage, so we're working through this now. Once that is complete we need to swap out the gateway device, which has been causing us significant issues over the last couple of months.
We will also need to make some cabling changes later. These should all be on our APP-2 server, and we should be able to get the site operational before we make them; they will impact our ability to run content processing but shouldn't cause further site outages.
No incidents reported
No incidents reported
No incidents reported
No incidents reported
No incidents reported
We lost connectivity to our gateway device overnight. We are waiting for our on-site team to physically restart the device. Apologies.
Service health was restored.
The gateway has finally become responsive again. It looks like the CPU and load averages were maxed out, causing its processes to crash. We're monitoring the gateway at this time and have restored access to both sites and re-enabled content processing.
I've managed to get some access into the network despite the router itself being almost entirely unresponsive. I'm hoping to remotely reboot the device as soon as I can gain access via the cluster, which should start to restore functionality.
We have once again lost network access to our gateway device. We are working to see whether there is anything we can do to restore access at this time.
We've had a lot of connectivity issues throughout the day between our new DC site and our legacy sites, which has left our content processing struggling to keep up. MastodonAppUK is now mostly back on track, though earlier almost all of its jobs failed and needed to be retried, which skewed our data. Universeodon.com still has around 75k ingest jobs queued; these are slowly being processed but remain affected by our network hardware problems and the connectivity issues with our legacy environment.
Our network gateway has been restarted by the on-site team. We did run into some issues when the site came back online due to the volume of traffic generated by content federation and federation updates, so we've scaled back some of the content processing for the time being to keep the site running stably.
We have agreed with the hardware vendor to return this model and will be ordering a more powerful replacement, which we hope will operate with greater stability.
No incidents reported
No incidents reported
No incidents reported
No incidents reported