Some Microsoft Azure customers with workloads running in its South Central US data center are having big problems coming back from the holiday weekend Tuesday, after shutdown procedures were initiated following a spike in temperature inside one of its facilities.
Around 2:30am Pacific Time, Microsoft identified problems with the cooling systems in one part of its Texas data center complex, which caused a spike in temperature and forced it to shut down equipment to prevent a more catastrophic failure, according to the Azure status page. These issues have also caused cascading effects for some Microsoft Office 365 users as well as those who rely on Microsoft Active Directory to log in to their accounts.
Engineers are in the process of restoring power to affected data center devices. Resources in South Central and potentially other regions may experience impact. Please refer to your portal, https://t.co/Dw19fIGsXf and/or Twitter for updates.
— Azure Support (@AzureSupport) September 4, 2018
The cooling system is the most critical part of a modern data center, given the intense heat produced by thousands of servers cranking away in an enclosed area. Most cloud companies have automatic shutdown procedures that are triggered by a sharp rise in temperature, and while that’s a good idea, it requires admins to reboot everything, and that takes time.
Microsoft said it would provide an update on its progress by 10am PT on its Azure status page, which, in accordance with Murphy’s Law, has itself been down at several points this morning. At the moment the main issues seem confined to Texas, but the problems with Active Directory and Visual Studio Team Services — Microsoft’s hosted developer environment — could affect customers in multiple regions.
Update 10:01am: Microsoft extended its deadline for providing an update on the status of the service until 1pm PT. It is still reporting that multiple services are affected by these issues, and Visual Studio Team Services appears to be down across multiple regions, giving developers around the world a snow day.
Update 10:39am: Microsoft’s own services appear to be the most widely used apps and sites affected by the outage, with Xbox Live and OneDrive also experiencing problems at the moment.