A Microsoft data center in Cheyenne, Wyo. (Microsoft Photo)
In an increasingly competitive market for cloud computing, reliability matters, and Microsoft has some work to do.
Data compiled by Gartner and Krystallize Technologies shows a noticeable gap between Microsoft Azure and the other two big cloud providers when looking at cloud uptime in North America during 2018. According to Gartner, last year Amazon Web Services and Google had nearly identical uptime statistics for the virtual machines at the heart of cloud services (99.9987 percent and 99.9982 percent, respectively), while Azure trailed by a small but significant amount, at 99.9792 percent.
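To put those percentages in more familiar terms, the short Python sketch below converts an uptime percentage into minutes of downtime per year. It is only an illustration: the function name `downtime_minutes_per_year` is made up for this example, and it assumes a 365-day year with the percentage applied evenly across the whole period, which may not match how Gartner actually measured virtual machine uptime.

```python
# Illustrative only: converts an uptime percentage into annual downtime.
# Assumes a 365-day year; the analysts' real methodology is not described here.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Return the minutes of downtime implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


# Uptime figures as reported in the article for North America in 2018.
reported_uptime = {
    "Amazon Web Services": 99.9987,
    "Google": 99.9982,
    "Microsoft Azure": 99.9792,
}

for provider, uptime in reported_uptime.items():
    minutes = downtime_minutes_per_year(uptime)
    print(f"{provider}: about {minutes:.1f} minutes of downtime per year")
```

Under those assumptions, the AWS and Google figures work out to roughly 7 and 9 minutes of downtime over the year, while the Azure figure works out to roughly 109 minutes, or a little under two hours.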
"Azure has had significant downtime, not just in 2018, but even the first three months of 2019 have been not good for Microsoft," said Raj Bala, an analyst with Gartner who compiled the data.
As Microsoft courts developers this week at Build with an array of new services, it has also been making changes behind the scenes to improve Azure reliability, said Mark Russinovich, Microsoft Azure CTO, in an interview this week with GeekWire. He plans to showcase a few of those improvements during his annual Azure architecture keynote on Wednesday, but also defended the company's track record when dealing with planned and unplanned disruptions to cloud service.
"We've invested a ton in capabilities that allow us to do maintenance with little to zero impact on customers," Russinovich said.
However, that didn't help last week when a routine DNS migration went haywire, disconnecting Azure services from customers and causing a major outage that lasted several hours and took out essential Microsoft services like Office 365 and Xbox Live, as well as websites such as the one you're currently visiting.
According to a root-cause analysis released by Microsoft earlier this week, that problem was caused by two separate errors, and had either one of those errors happened by itself, "we're not having this discussion." As a result, Microsoft is putting additional procedures and safeguards into place in hopes of preventing this from happening again in the future, Russinovich said.
"When you do thousands of these and everything goes off fine, you're like, the process works," he said. "Obviously something like this shows us that there's a gap, and we're closing that gap."
There were two major unplanned events that rocked Microsoft's cloud services in North America during 2018.
The discovery of the Meltdown and Spectre chip bugs in 2017 forced all cloud providers to update their services in January 2018 with software mitigations that isolated cloud customers from those bugs, but Microsoft had to reboot everyone's servers to put those changes into effect, and that takes time. And in September 2018, a lightning strike at a data center in its South Central U.S. region caused some cooling systems to fail, damaging servers and knocking out some services for more than 24 hours as engineers worked to preserve customer data and replace the damaged systems.
In the months following the Spectre reboot cycle, Microsoft began rolling out new live migration capabilities that allow it to update servers running customer workloads with little to no disruption. Earlier this year it began rolling those features out across its network of data centers, and they're now operating nearly everywhere, Russinovich said.
But AWS and Google also needed to update their servers to add the patches for Spectre and Meltdown, and it didn't appear to have as much of an impact on their service uptime. Google likes to tout its live migration capabilities that can update servers with no disruption to customer workloads, while AWS talks far less about the technologies it uses to run its cloud service, which is very on brand for the market-share leader.
Microsoft is also using machine-learning technology to do predictive analytics on its data center hardware, Russinovich said, in hopes of flagging components that are about to fail or underperform based on historical performance data.
On Wednesday Russinovich plans to show off Project Tardigrade, a new Azure service named after the nearly indestructible microscopic animals also known as water bears. This effort will detect hardware failures or memory leaks that can lead to operating system crashes just before they occur and freeze virtual machines for a few seconds so the workloads can be moved to a fresh server.
The company is also continuing to roll out availability zones in its cloud computing regions around the world. Microsoft cloud executives rarely miss an opportunity to point out that they have the most regions around the world of any cloud provider, but only within the last year has Microsoft started building availability zones: separate facilities within a region, with independent power and cooling supplies, that help ensure availability in the event of a problem at one building in a region.
Microsoft launched its first availability zones in March 2018 in its Iowa and Paris data centers, and has since rolled them out to several other regions in the U.S., Europe, and Asia. Cloud providers refer to regions and zones a little differently, but AWS and Google Cloud have had far more availability zones up and running for several years.
Operating cloud computing services at scale is really one of the more amazing things human beings have accomplished; the complexity involved is hard to appreciate without a fair amount of knowledge about how these systems work. And even if Microsoft lags AWS and Google in reliability scoring, unless your company is blessed with world-class operations talent, Microsoft is likely still better at operating data centers than most companies managing their own servers.
But turning over control of your most critical business applications to a third-party provider still requires a leap of faith. As cloud companies fight tooth and nail for the next generation of large enterprise customers considering a move to the cloud, uptime numbers will be more and more important.