Category Archives: Cloud Servers
A rack of servers is now being used for production loads in what looks like a liquid bath.
While immersion has existed in the industry for a few years now, Vole claims it's "the first cloud provider that is running two-phase immersion cooling in a production environment".
The cooling works by completely submerging server racks in a specially designed, non-conductive fluorocarbon-based fluid. The liquid removes heat directly from the components as it boils; because the fluid has a low boiling point (122 degrees Fahrenheit, or 50 degrees Celsius), the vapour rises to a condenser and falls back into the bath as a raining liquid.
This creates a closed-loop cooling system, reducing costs as no energy is needed to move the liquid around the tank, and no chiller is needed for the condenser either.
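The physics of the closed loop is easy to estimate: every kilogram of fluid that boils off carries away its latent heat of vapourisation before condensing and returning. A minimal back-of-the-envelope sketch, in Python; the latent-heat figure is an assumed typical value for a fluorocarbon engineered fluid, not a published spec for the fluid Microsoft uses.

```python
# Rough estimate of two-phase immersion cooling capacity.
# ASSUMPTION: ~88 kJ/kg latent heat, a typical value for
# fluorocarbon engineered fluids (illustrative only).
LATENT_HEAT_J_PER_KG = 88_000

def boil_off_rate(heat_load_watts: float) -> float:
    """Mass of fluid (kg) vaporised per second to absorb a given heat load."""
    return heat_load_watts / LATENT_HEAT_J_PER_KG

# A hypothetical 20 kW rack boils roughly 0.23 kg of fluid per second;
# in the closed loop, all of it condenses and rains back into the bath.
print(f"{boil_off_rate(20_000):.2f} kg/s")
```

No pumps or chillers appear in this loop: the boiling itself moves the heat, which is where the energy savings come from.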
Vole's data centre advanced development group vice-president Christian Belady told The Verge: "The rack will lie down inside that bath tub, and what you'll see is boiling just like you'd see boiling in your pot. The boiling in your pot is at 100 degrees Celsius, and in this case, it's at 50 degrees Celsius."
Just so long as the server does not try to put its feet up by the taps, it should be OK.
See more here:
Microsoft submerges cloud servers in liquid - Fudzilla
The OVHCloud fire: Assessing the after-effects on datacentre operators and cloud users – ComputerWeekly.com
The OVHCloud datacentre campus fire in Strasbourg, France, sent shockwaves through the hyperscale cloud community when it happened in early March 2021, but the industry-wide after-effects of the event could be transformational, both in addressing shortcomings in enterprise attitudes towards cloud backups and disaster recovery, and in changing the way that datacentre operators worldwide approach fire suppression.
The fire occurred in the early hours of Wednesday 10 March 2021, with the firm's five-storey SBG2 datacentre destroyed outright during the blaze, while another facility, dubbed SBG1, incurred some damage. Two other datacentres at the site, known as SBG3 and SBG4, were switched off as a post-fire precaution and were reportedly undamaged by the incident.
Even so, OVHCloud customers across Europe were affected by service interruptions and downtime caused by the incident, and in the weeks that have followed, the firm has been racing to bring their applications and workloads back online again.
These efforts have included embarking on a widescale clean-up of the datacentre campus, but simultaneously the firm has been drawing on the fact it builds all its own servers in-house to rapidly replace the server capacity destroyed during the fire.
The company operates 15 datacentres in Europe, and also moved to make any spare capacity within these sites available to affected customers as well. At the time of writing, OVHCloud's service status page for the Strasbourg facility stated that it is still in the throes of rolling out replacement server capacity at alternative datacentre locations for customers who had workloads housed in SBG2 and the partially destroyed parts of SBG1.
Both facilities housed a mix of public cloud, bare metal and virtual private server (VPS) services, with the company confirming that 80% of the public cloud-hosted virtual machines these datacentres hosted were back online as of Tuesday 6 April 2021. Meanwhile, 25% of its bare metal services have been restored, and 34% of its bare metal-based VPS services are also back online.
In SBG1 specifically, 35% of the bare metal cloud servers were back online as of Tuesday 6 April 2021, the company's service status site confirmed, with OVHCloud stating its hope to have 95% of services back in action by the end of this week.
The update further confirmed that SBG4 and SBG3 are operating at 99% availability for customers.
In a video update posted on 22 March 2021, OVHCloud founder and chairman Octave Klaba shared details of how efforts to restore services for affected customers were progressing, but also confirmed that the root cause of the fire is still the subject of an ongoing investigation that is set to run for a while yet.
The investigation is ongoing, he said, and involves law enforcement, insurance personnel and other assorted financial experts. "It will take a few months to have the conclusion of this investigation, and once we have it all, we'll share it with you."
Initial reports in the wake of the event, however, have suggested the onset of the blaze may have been linked to work carried out on an Uninterruptible Power Supply (UPS) at the site on the day leading up to the fire.
"Early indicators point to the failure of a UPS, causing a fire that spread quickly," said Andy Lawrence, executive director of research at datacentre resiliency think tank the Uptime Institute, in a March 2021 blog post. "At least one of the UPSs had been extensively worked on earlier in the day, suggesting maintenance issues may have been a main contributor."
Although there is no way of knowing for sure at this point, it is possible the UPS in question may have been deployed next to a battery cabinet that may have overheated and caused a fire, offered Lawrence.
"Although it is not best practice, battery cabinets (when using valve-regulated lead acid or VRLA batteries) are often installed next to the UPS units themselves," he wrote. "This may not have been the case at SBG2, [but] this type of configuration can create a situation where a UPS fire heats up batteries until they start to burn and can cause fire to spread rapidly."
While the investigation into the cause of the fire continues, Klaba said during the video update that the company is committed to using the incident to develop new industry standards, setting out how best to tackle fires within datacentres.
Presently, best practice techniques and standards for fire detection, suppression and extinguishment within datacentres vary according to the location of the datacentre itself, but also according to what type of equipment is deployed in each room, he said.
"[There are] different kinds of fire [extinguishment techniques] for an electrical fire, and a different kind for a fire coming from the servers. Whatever the standard is, we [have] decided to over-secure all our datacentres," said Klaba.
In addition to this, he continued, OVHCloud has set itself a goal of creating a fire testing laboratory, within which the firm will test how fires progress within different datacentre settings, and has committed to sharing the findings from that work with the wider industry.
"We decided to create a lab where I want to test. I want to see how the fire is going in the different kinds of rooms, and to find the best way to extinguish the fire in all kinds of these situations. I want also to share the conclusions that we will have in this lab with all the industry," he said.
"Because we don't want to have this kind of incident in our datacentre, but also nobody wants to have this kind of incident in [their] datacentre at all, and the industry has to evolve, and to evolve its standards."
Datacentre fires are a mercifully rare occurrence in the datacentre industry, but that does not stop them being anything less than a constant concern for operators, stated the Uptime Institute's Lawrence in an April 2021 blog post about the frequency of such incidents.
"Uptime Institute's database of abnormal incidents, which documents over 8,000 incidents shared by members since its inception in 1994, records 11 fires in datacentres (less than 0.5 per year)," wrote Lawrence. "All of these were successfully contained, causing minimal damage and disruption."
Lawrence goes on to observe in the post that the systems put in place to suppress fires tend to do more damage than actual fires in datacentres.
"In recent years, accidental discharge of fire suppression systems, especially high-pressure clean agent gas systems, has actually caused significantly more serious disruption than fires, with some banking and financial trading datacentres affected by this issue," wrote Lawrence.
He also offers operators some fire prevention advice, in terms of the steps they should take to ensure the relatively low incidence of fires reported in the sector continues.
"Responsibility for fire regulation is covered by the local authority having jurisdiction, and requirements are usually strict, but rules may be stricter for newer facilities, so good operational management is critical for older datacentres," he said.
Uptime Institute advises that all datacentres use very early smoke detection apparatus systems and maintain appropriate fire barriers and separation of systems. Well-maintained water sprinkler or low-pressure clean agent fire suppression systems are preferred. Risk assessments primarily aimed at reducing the likelihood of outages will also pick up obvious issues with these systems.
While the OVHCloud datacentre fire can serve as a cautionary tale for other operators about how to avoid their facilities suffering a similar fate, what about the firm's customers, who have experienced a prolonged period of service disruption as a result of the incident? What lessons can they learn from all this?
According to Christophe Bertrand, senior analyst at TechTarget-owned Enterprise Strategy Group, the number one lesson that enterprises need to learn from this incident, regardless of whether they are an OVHCloud customer or not, is the importance of backing up their data.
"Whatever you do as a business, you are always responsible for your data. From a compliance and governance standpoint, you as a business are responsible for securing the ability to recover your own data," he told Computer Weekly.
"Just because you have placed data with a third-party software-as-a-service (SaaS) or cloud infrastructure provider, you're still responsible for your data," said Bertrand. "If something happens, and anything could happen, on your premises or with the cloud service you use, you should always be in a position to recover your data."
"What we have [with OVHCloud] is possibly a situation where maybe people thought, because it was with a third-party provider, it was automatically protected and backed up," he said. "[So] tough luck, because the data is your data, and it's on you as a business if you don't have a backup somewhere else."
For some of the firms affected by the fire, the lack of backup could be fatal, said Bertrand. "I really feel for the small companies that were affected by it, because [the fire] is certainly not their fault, but if they didn't have a backup that was strategically thought through and placed somewhere where they could recover their data, then they made a mistake. And it may be a fatal one. I think some businesses will close based on that."
They may also now incur some additional issues as well, he said. "They have a liability to their end users, or maybe some business partners, and maybe some compliance exposures too? Compliance exposures, for sure, because you're not really supposed to lose data."
A common misconception that IT buyers often have about cloud is that they mistake the fact their data is accessible from anywhere as proof that it is backed up and will always be available in the event of an outage, said Bertrand.
"My research shows this big disconnect in terms of protection of data that's in cloud environments, because somehow people conflate availability with protection," he said.
OVHCloud's Klaba made a similar observation during one of his post-fire video updates, where he made a public commitment to provide the firm's customers with free data backups as standard in future, rather than as a paid-for add-on.
"It seems globally, the customers understand what we are delivering, but some customers don't understand exactly what they have bought, so we don't want to jump into this discussion by saying we will explain better what we are delivering. What we are doing is we will increase security, and we will deliver the higher security of backups for all customers in different datacentres," he said.
And, in Klaba's view, this could lead other cloud firms to follow suit in due course. "This incident will change our way of delivering the services, but I believe it will also change the standards of the industry and the market," he said, in a video update to customers dated 16 March 2021.
Jon Healy, operations director at datacentre management services provider Keysource, said the entire incident serves to reinforce why disaster recovery is something neither datacentre operators nor cloud users can afford to overlook.
"One hundred percent service availability is an expected standard today, but putting this in place for some requires comprehensive planning and can have both technical and commercial implications which need to be considered in order for it to be effective," he said.
Given the average lifespan of a datacentre, there is every chance that, while fires might be scarce now, that could change in the future.
Given the exponential increase in facilities built in the early noughties, the core infrastructure reaching end of life in 10 to 20 years, and the capital investment to replace or upgrade remaining high, will we see more events like this, and what will this mean for the industry?
One area where ESG's Bertrand and others have commended OVHCloud is the transparency and openness of its communications with customers in the wake of the fire, which have included regular video updates from Klaba, as well as daily despatches on the situation via his Twitter feed, and service status updates published directly on the company's web pages.
"They seem to have been very transparent, communications-wise, which is a real sign of maturity," he said. "There is probably only so much they can share, and they have to be cautious because of the process in place to figure out what happened, but you don't get the sense that they're hiding anything."
Fresh from Intel's launch of the company's latest third-generation Xeon Scalable Ice Lake processors on April 6 (Tuesday), Intel server partners Cisco, Dell EMC, HPE and Lenovo simultaneously unveiled their first server models built around the latest chips.
And though arch-rival AMD may have won the first round of the latest global chip fight by unveiling its latest next-generation Epyc server chips three weeks before Intel's products, back on March 15, Intel is apparently not giving up any ground in the fight.
Intel is touting the new Xeon Scalable chips as having performance that is up to 46 percent better than the company's previous generation of chips, along with major improvements in security, flexibility and more.
The Ice Lake processors are 10nm chips that include up to 40 cores per processor, up from 28 cores in the previous-generation Cascade Lake chips. Supporting up to 6 terabytes of system memory per socket, the chips provide eight channels of DDR4-3200 memory and up to 64 lanes of PCIe Gen4 per socket, compared to six channels of DDR4-2933 and up to 48 lanes of PCIe Gen3 per socket for the previous chips. The new chips also include features such as Intel Software Guard Extensions (SGX), Intel Total Memory Encryption (TME) and Intel Speed Select Technology (SST), and are compatible with the latest version of Intel Optane persistent memory modules (PMem). The PCIe Gen4 architecture provides throughput at twice the speed of the earlier PCIe Gen3 specification.
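The memory uplift is easy to quantify from those platform specs. A minimal sketch, multiplying channels by transfer rate by the 8-byte width of a 64-bit DDR4 channel; it assumes Cascade Lake-SP's six-channel DDR4-2933 configuration, and the results are theoretical peaks, not measured throughput.

```python
# Peak theoretical DRAM bandwidth per socket:
# channels x transfer rate (MT/s) x 8 bytes per transfer (64-bit DDR4 bus).

def peak_bandwidth_gbs(channels: int, megatransfers_per_sec: int) -> float:
    """Peak memory bandwidth in GB/s for a given channel count and speed."""
    return channels * megatransfers_per_sec * 8 / 1000

ice_lake = peak_bandwidth_gbs(8, 3200)      # 204.8 GB/s
cascade_lake = peak_bandwidth_gbs(6, 2933)  # ~140.8 GB/s
print(f"{ice_lake:.1f} GB/s vs {cascade_lake:.1f} GB/s "
      f"(~{ice_lake / cascade_lake:.2f}x uplift)")
```

The roughly 1.45x jump in peak bandwidth, on top of the extra cores, is why memory-bound workloads tend to benefit most from the platform change.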
Rob Enderle, principal analyst with Enderle Group, told EnterpriseAI that the launch of Ice Lake is arguably one of the most critical launches this decade for Intel because of previous delays in getting these chips to market. "According to OEMs, Intel's inability to advance their process technology and remain competitive put them arguably two years behind AMD," said Enderle.
The Ice Lake family includes 56 SKUs, grouped across 10 segments: 13 are optimized for highest per-core scalable performance (8 to 40 cores, 140-270 watts), 10 for scalable performance (8 to 32 cores, 105-205 watts), 15 target four- and eight-socket systems (18 to 28 cores, 150-250 watts), and there are three single-socket optimized parts (24 to 36 cores, 185-225 watts). There are also SKUs optimized for cloud, networking, media and other workloads. All but four SKUs support Intel Optane Persistent Memory 200 series technology.
In comparison, AMD's recently announced Epyc Milan CPUs will be available in 19 SKUs, from a flagship 64-core version to 8-core versions built for a myriad of server workloads. For AI users, the big AMD news was that the latest generation of AMD server chips shows promise in improving performance for many AI processes, according to the company.
For Intel's partners, the bolstered specifications and capabilities of the Ice Lake chips are what they needed to offer fresh, more robust and powerful servers to customers with ever-increasing compute workloads.
While Cisco, Dell EMC, HPE and Lenovo are the first Intel server partners to announce their new hardware at the launch of the Ice Lake chips, other server partners are expected to announce their own boosted server line-ups soon as well.
Heres a rundown of the first Ice Lake-equipped server products:
Cisco Unveils Three Server Models
Cisco begins its Ice Lake transformation with three new Unified Computing System (UCS) server models that incorporate the new CPUs: the Cisco UCS B200 M6, C220 M6 and C240 M6 servers, built for today's hybrid and diverse computing environments.
Highlighting the latest UCS servers is native integration with the Cisco Intersight hybrid cloud operations platform, which aims to make it easier for customers to manage their infrastructure wherever it is located through a policy-based system.
With the new capabilities built into the latest Intel chips, the new Cisco servers will be able to fill a wide range of workloads for customers, including Virtual Desktop Infrastructure (VDI), databases, AI and machine learning, big data and more, according to Cisco. The new servers are expected to be generally available within about 90 days.
"For over twelve years, Cisco and Intel have been committed to pushing the boundaries in the server market, together delivering many industry-leading innovations," DD Dasgupta, vice president of product management for the Cisco cloud and compute business unit, said in a statement. "Today's announcement continues this tradition, and it could not come at a more crucial time. As customers' hybrid cloud journeys accelerate, the need for simple yet powerful solutions increases. Cisco and Intel are proud to deliver solutions that not only meet the demands of today's workloads, but provide the foundations necessary to embrace new and emerging technologies."
Dell EMC Reiterates Its Ice Lake Plans
Although Intel is officially debuting its new chips today, Dell actually got a leg up on the competition by announcing its plans for its first Ice Lake-equipped servers back on March 17, right after the latest AMD Epyc chips were unveiled.
That's when Dell unveiled its PowerEdge R750 server, as well as its PowerEdge R750xa, which the company said is purpose-built to boost acceleration performance for machine learning training, inferencing and AI. The PowerEdge R750xa is a dual-socket, 2U server that supports up to four double-wide GPUs and six single-wide GPUs. Other Dell server models using the new Intel chips are the C6520, the MX750 and the R750, according to the company. The servers are expected to be available globally in May 2021. Several other models, including the Dell EMC PowerEdge R750xs, the R650xs, the R550, the R450 and the ruggedized PowerEdge XR11 and XR12, are expected to be available in the second quarter of 2021.
"Dell Technologies is focused on helping businesses benefit from emerging technologies and innovations that will help them reach their goals faster," Rajesh Pohani, the company's vice president of server product management, said in a statement. "Through our close collaboration with Intel, Dell EMC PowerEdge servers deliver better performance and security than ever before, putting customers on a path to autonomous infrastructure that will make IT simpler, more powerful and serve as the innovation engine for moving businesses forward."
HPE Unveils Eight Ice Lake Server Models
At HPE, Intel's latest Gen 3 chips are being integrated across eight server lines. This includes the HPE ProLiant DL360 Gen10 Plus, HPE ProLiant DL380 Gen10 Plus and HPE ProLiant DL110 Gen10 Plus standard servers, as well as the HPE Synergy 480 Gen10 Plus server line, which is built for composable, software-defined infrastructure for hybrid cloud environments.
Also getting the new chips are the HPE Edgeline EL8000 Converged Edge systems and the HPE Edgeline EL8000T Converged Edge systems, which are ruggedized systems that are built for extreme edge use cases.
In HPE's high performance computing (HPC) and AI server lines, the HPE Apollo 2000 Gen10 Plus systems (built for HPC workloads like modeling, simulations and deep learning, as well as AI modeling and training) and the HPE Cray EX supercomputer lineup are also getting Ice Lake CPUs.
Four New Lenovo ThinkSystem Servers with Ice Lake CPUs
Built for customer workloads in HPC, AI, modeling and simulation, cloud, VDI, advanced analytics and more, Lenovo debuted four new ThinkSystem Server models that incorporate many of the advancements of the latest Intel Ice Lake chips.
The four new server models are the ThinkSystem SR650 V2, the SR630 V2, the ST650 V2 and the SN550 V2, which can be configured in a myriad of ways to meet business demands:
The ThinkSystem SR650 V2 is a 2U, two-socket server aimed at customers from SMBs to large enterprises and managed cloud service providers, providing speed and expansion along with flexible storage and I/O for business-critical workloads. The systems use Intel Optane persistent memory 200 series and include support for faster PCIe Gen4 networking.
The ThinkSystem SR630 V2 is a 1U, two-socket server that includes optimized performance and density for hybrid data center workloads such as cloud, virtualization, analytics, computing and gaming.
The ThinkSystem ST650 V2 is a new two-socket mainstream tower server that uses a slimmer 4U chassis to make it easier and more flexible to deploy in remote offices or branch offices (ROBO), technology or retail locations.
The ThinkSystem SN550 V2 is part of the Lenovo Flex System family. Designed for enterprise performance and flexibility in a compact footprint, the SN550 V2 is a blade server node that is optimized for performance, efficiency and security for a wide range of business-critical workloads, including cloud, server virtualization, databases and VDI.
Later in 2021, Lenovo expects to bring Intel's latest Ice Lake processors to its edge computing server line with the introduction of a new, highly ruggedized edge server designed to handle the extreme performance and environmental conditions needed for telecommunications, manufacturing and smarter cities use cases. More details will be announced later in the year.
Intel Again Takes Charge: Analysts
Despite Intel's earlier delays getting these Ice Lake chips to market and to its partners, the company remains the dominant vendor in the world of CPUs, said analyst Enderle.
"Unlike AMD, which needs a sizeable competitive edge to displace Intel, all Intel needs is to be good enough to hold on to its base," he said. "Ice Lake is a forklift upgrade, meaning you can't just replace an older Intel processor with it; you'll likely need to replace the server. OEMs generally prefer a complete product replacement over a parts upgrade because they are far more lucrative."
For customers, that's not as appreciated, because of disruptions due to server replacements as well as higher related costs, said Enderle. As a result, this is unlikely to force a competitive replacement of newer AMD servers, he said. "Those companies preferring performance over all else may still prefer AMD Epyc over Intel's latest. But shops wanting to remain homogeneous with aging servers will appreciate the extra performance Ice Lake brought, and were unlikely to embrace AMD anyway."
Where Intel really gains over AMD is in Intel's stronger control over manufacturing, which should also help the company during raw materials shortages, further offsetting its disadvantages, said Enderle. "While I doubt Ice Lake is strong enough to reverse the erosion of Intel's base to AMD, it should slow it and give Intel time to bring out their next generation, which should be far more competitive."
Karl Freund, founder and principal HPC, AI and machine learning analyst with Cambrian AI Research, agrees.
"Intel has demonstrated the company's broad spectrum of technology prowess and leadership in this announcement, from CPUs to memory to encryption and networking," said Freund. "AMD still enjoys hard-earned leadership in many CPU metrics, including performance per core and per socket, but on most other features, such as AI performance, Intel clearly has the lead."
This article originally appeared on sister website EnterpriseAI.news.
Cybersecurity And Physical Security – Your Organization Needs To Focus On Both | Security News – SecurityInformed
Human beings have a long-standing relationship with privacy and security. For centuries, we've locked our doors, held close our most precious possessions, and been wary of the threats posed by thieves. As time has gone on, our relationship with security has become more complicated, as we now have much more to be protective of. As technological advancements in security have become smarter and stronger, so have those looking to compromise it.
Cybersecurity, however, is still incredibly new to humans when we look at the long relationship that we have with security in general. As much as we understand the basics, such as keeping our passwords secure and storing data in safe places, our understanding of cybersecurity as a whole is complicated and so is our understanding of the threats that it protects against.
However, physical security and cybersecurity are often interlinked. Business leaders may find themselves weighing up the different risks to the physical security of their business. As a result, they install CCTV in the office space, and place alarms on doors to help repel intruders.
But what happens when the data collected from such security devices is also at risk of being stolen, and you don't have to break through the front door of an office to get it? The answer is that your physical security can lose its power to keep your business safe if your cybersecurity is weak.
As a result, strong cybersecurity is incredibly important to underpin your physical security. We've seen the risks posed by cybersecurity breaches in recent news. Video security company Verkada recently suffered a security breach in which malicious attackers obtained access to the contents of many of its live camera feeds, and a recent report by the UK government says two in five UK firms experienced cyberattacks in 2020.
Cloud computing offers a solution. The cloud stores your information in data centres located anywhere in the world and is maintained by a third party, such as Claranet. As the data sits on hosted servers, its easily accessible while not being at risk of being stolen through your physical device.
Heres why cloud computing can help to ensure that your physical security and the data it holds arent compromised.
It's completely normal to wonder whether your data is safe when it's stored within a cloud infrastructure. As we are effectively outsourcing our security by storing our important files on servers we have no control over (and, in some cases, limited understanding of), it's natural to worry about how vulnerable this is to cyber-attacks.
The reality is that the data you save in the cloud is likely to be a lot safer than that which you store on your device. Cyber-criminals can try to trick you into clicking on links that deploy malware, or pose as a help desk trying to fix your machine. As a result, they can access your device, and if this is where you're storing important security data, then it is vulnerable.
Cloud service providers offer security that is a lot stronger than the software that is likely in place on your personal computer. Hyperscalers such as Microsoft and Amazon Web Services (AWS) are able to hire countless more security experts than any individual company, save the corporate behemoths, could afford.
These major platform owners have responsibility for thousands of customers on their cloud and are constantly working to enhance the security of their platforms. The security provided by cloud service providers such as Claranet is an extension of these capabilities.
Cloud servers are located in remote locations that workers don't have access to. The data on them is also encrypted, which is the process of converting information or data into code to prevent unauthorized access.
Additionally, cloud infrastructure providers like ourselves look to regularly update your security to protect against viruses and malware, leaving you free to get on with your work without any niggling worries about your data being at risk from hackers.
Additionally, cloud providers are able to offer sophisticated security measures and solutions in the form of firewalls and artificial intelligence, as well as data redundancy, where the same piece of data is held within several separate data centres.
This is effectively super-strong backup and recovery, meaning that if a server goes down, you can access your files from a backup server.
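The redundancy-and-failover idea described above can be sketched in a few lines: write every object to several replicas, and on a read, fall back to the next replica if one is down. Everything here is hypothetical (the `Replica` class and data-centre names are invented for illustration); real cloud platforms replicate and fail over transparently behind their storage APIs.

```python
# Toy model of data redundancy across data centres with read failover.
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.store = {}      # the replica's copy of the data
        self.online = True   # flip to False to simulate an outage

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is down")
        return self.store[key]

def redundant_put(replicas, key, value):
    """Write the same object to every data centre."""
    for r in replicas:
        r.put(key, value)

def redundant_get(replicas, key):
    """Read from the first reachable replica, failing over as needed."""
    for r in replicas:
        try:
            return r.get(key)
        except ConnectionError:
            continue  # fail over to the next data centre
    raise RuntimeError("all replicas unavailable")

dcs = [Replica("dc-eu-1"), Replica("dc-eu-2"), Replica("dc-us-1")]
redundant_put(dcs, "camera-feed-001", "footage")
dcs[0].online = False  # simulate one data centre going dark
print(redundant_get(dcs, "camera-feed-001"))  # still readable from dc-eu-2
```

The design choice to try replicas in order keeps reads cheap in the common case; the backup copies only cost a network hop when the primary is unreachable.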
By storing the data gathered by your physical security in the cloud, you're not just significantly reducing the risk of cyber-attacks, but also protecting it from physical threats such as damage in the event of a fire or flood.
Rather than viewing your physical and cybersecurity as two different entities, treat them as part of one system: if one is compromised, the other is also at risk. They should work in tandem to keep your whole organization secure.
A dedicated server has many advantages for your company and can be a perfect way to increase your confidentiality. An unmetered dedicated server is a form of outsourcing in which a client rents a server that is not shared with any other customers. This ensures they have full control over the server and do not have to worry about their data being hacked or their server slowing down due to multiple users. Aside from the above, they can also select which operating system and hardware to use.
The following are the reasons to use a dedicated server:
A dedicated server, as compared to a cloud server, is one that is built to serve you and only you. You do not have to worry about other users consuming the server's resources, because the server cannot be accessed by anyone you do not grant access to. This means that your server will perform better and your programs will run much faster than if they were hosted on a cloud or SaaS server.
Cloud servers are intended for exchanging information and have more storage capacity than dedicated servers. Cloud servers do not have powerful or fast reserves; rather, they are primarily concerned with storage. If you need storage, a cloud server is the way to go, but if you need quality and efficiency, a dedicated server is the way to go.
Dedicated servers are undeniably more versatile than other options, and it is that versatility that distinguishes them. A dedicated server, unlike other servers, can run different applications simultaneously. Dedicated servers are most commonly used for web hosting and e-commerce, but they can also be used for VPNs, email servers, custom virtual setups and data storage. Dedicated servers are extremely flexible and should be preferred over some of the other options available. Because of the simplicity of a dedicated server, you can use it for almost anything, making it a much better choice when combined with its other advantages.
Dedicated servers are by far the most cost-effective servers available, giving you even more value for money. Cloud servers appear inexpensive at first glance, but the cost rises as soon as you need more memory, whereas a dedicated server offers more advantages and can support more clients at once than a cloud server. A dedicated server provider will keep things running smoothly, from the network to the hardware. Dedicated servers are rented, which means that if anything goes wrong, the vendor is held liable and you are not penalized.
When purchasing a dedicated server, you have the option of customizing the hardware to your specifications. All hardware is configurable, whether it is more RAM, more hard disk space, or a faster CPU. Furthermore, if you need additional resources, you can approach your server provider and request an upgrade to accommodate your evolving business needs.
China is pushing forward an internet society where economic and public activities increasingly take place online. In the process, troves of citizen and government data get transferred to cloud servers, raising concerns over information security. One startup called ThreatBook sees an opportunity in this revolution and pledges to protect corporations and bureaucracies against malicious cyberattacks.
Antivirus and security software has been around in China for several decades, but until recently, enterprises were procuring it simply to meet compliance requirements, Xue Feng, founder and CEO of six-year-old ThreatBook, told TechCrunch in an interview.
Starting around 2014, internet accessibility began to expand rapidly in China, ushering in an explosion of data. Information previously stored in physical servers was moving to the cloud. Companies realized that a cyberattack could result in a substantial financial loss and started to pay serious attention to security solutions.
In the meantime, cyberspace is emerging as a battlefield where competition between states plays out. Malicious actors may target a country's critical digital infrastructure or steal key research from a university database.
"The amount of cyberattacks between countries is reflective of their geopolitical relationships," observed Xue, who oversaw information security at Amazon China before founding ThreatBook. Previously, he was the director of internet security at Microsoft in China.
"If two countries are allies, they are less likely to attack one another. China has a very special position in geopolitics. Besides its tensions with the other superpowers, cyberattacks from smaller, nearby countries are also common."
Like other emerging SaaS companies, ThreatBook sells software and charges a subscription fee for annual services. More than 80% of its current customers are big corporations in finance, energy, the internet industry and manufacturing. Government contracts make up a smaller slice. With its Series E funding round that closed 500 million yuan ($76 million) in March, ThreatBook boosted its total capital raised to over 1 billion yuan from investors, including Hillhouse Capital.
Xue declined to disclose the company's revenues or valuation but said 95% of the firm's customers have chosen to renew their annual subscriptions. He added that the company has met the preliminary requirements of the Shanghai Exchange's STAR board, China's equivalent to Nasdaq, and will go public when the conditions are ripe.
"It takes our peers 7-10 years to go public," said Xue.
ThreatBook compares itself to Silicon Valley's CrowdStrike, which filed to go public in 2019 and detects threats by monitoring a company's endpoints, such as an employee's laptop and mobile devices that connect to the internal network from outside the corporate firewall.
ThreatBook similarly has a suite of software that goes onto the devices of a companys employees, automatically detects threats and comes up with a list of solutions.
"It's like installing a lot of security cameras inside a company," said Xue. "But the thing that matters is what we tell customers after we capture issues."
SaaS providers in China are still in the phase of educating the market and lobbying enterprises to pay. Of the 3,000 companies that ThreatBook serves, only 300 are paying, so there is plenty of room for monetization. Willingness to spend also differs across sectors, with financial institutions happy to shell out several million yuan ($1 = 6.54 yuan) a year while a tech startup may only want to pay a fraction of that.
Xue's vision is to take ThreatBook global. The company had plans to expand overseas last year but was held back by the COVID-19 pandemic.
"We've had a handful of inquiries from companies in Southeast Asia and the Middle East. There may even be room for us in markets with mature [cybersecurity companies] like Europe and North America," said Xue. "As long as we are able to offer differentiation, a customer may still consider us even if it has an existing security solution."
Amazon Web Services, the juggernaut of cloud computing, may be forging its own path with Arm-based CPUs and associated DPUs thanks to its 2015 acquisition of Annapurna Labs for $350 million. But for the foreseeable future, it will have to offer X86 processors, probably from both Intel and AMD, because these are the chips that most IT shops in the world have most of their applications running upon.
We talked about that, and how AWS will be able to charge a premium for that X86 compute at some point in the future, in a recent analysis of its Graviton2 instances and how they compare to its X86 instances. Other cloud providers will follow suit. We already know that in China, Tencent and Alibaba are keen on Arm-based servers, and so is Microsoft, which has a huge cloud presence in North America and Europe.
There is no such explicit need to support a particular switch or routing ASIC for the sake of cloud customers as there is for CPUs. And that is why we believe that AWS might actually be considering making its own switch ASICs, as has been rumored. As we detailed way back when The Next Platform was established, AWS has been building custom servers and switches for a very long time, and it has been concerned about its supply chain of parts as well as vertical integration of its stack for the past decade. And we said six years ago we would not be surprised if all of the hyperscalers eventually took absolute control of those parts of their semiconductor usage that they could for internal use. Any semiconductor that ends up being part of back-end infrastructure that cloud users never see, or part of a platform service or software subscription that customers never touch, can be done with homegrown ASICs. And we fully expect this to happen at AWS, Microsoft, Google, and Facebook. And Alibaba, Tencent, and Baidu, too. And other cloud suppliers that are big enough elsewhere in the world.
This is certainly true of switch and router chippery. Network silicon is largely invisible to those who buy infrastructure services (and indeed anyone who buys any platform services that ride above the infrastructure services), and in fact, the network itself is largely invisible to them. Here is an example of how invisible it is. A few years back, when we were visiting the Microsoft region in Quincy, Washington, we asked Corey Sanders, the corporate vice president in charge of Azure compute, about the aggregate bandwidth of the Microsoft network underpinning Azure. "You know, I honestly don't know and I don't care," Sanders told us. "It just appears infinite."
The point is, whatever pushing and shoving is going on with AWS and Broadcom, it will never manifest itself as something that customers see or care about. This is really about two hard-nosed companies butting heads, and whatever engineering decisions have been already made and will be made in the future will have as much to do with ego as feeds and speeds.
There is a lot of chatter about the hyperscalers, so let's start with the obvious. All of these companies have always hated any closed-box appliance that they cannot tear the covers off, rip apart, and massively customize for their own unique needs and scale. This is absolutely correct behavior. The hyperscalers and largest public clouds hit performance and scale barriers that most companies on Earth (as well as those orbiting Rigel and Sirius) will never, ever hit. That's their need, not just their pride. The hyperscalers and biggest cloud builders have problems that the silicon suppliers and their OEMs and ODMs haven't thought about, much less solved. Moreover, they can't move at Cisco Systems' speed, which is to find a problem and take 18 to 24 months to get a feature into the next-generation ASIC. This is why software-defined networking and programmable switches matter to them.
Ultimately, these companies fought for disaggregated switching and routing to drive down the price of hardware and to allow them to move their own network switching and routing software stacks onto a wider variety of hardware. That way, they can grind ASIC suppliers and OEMs and now ODMs against each other. The reason is simple. Network costs were exploding. James Hamilton, the distinguished engineer at AWS who helps fashion much of its homegrown infrastructure, explained this all back in late 2014 at the re:Invent conference, which was five years after the cloud giant had started designing its own switches and routers and building its own global backbone, something that Hamilton talked about back in 2010 as this effort was just getting under way.
"Networking is a red alert situation for us right now," Hamilton explained in his keynote address at re:Invent 2014. "The cost of networking is escalating relative to the cost of all other equipment. It is anti-Moore. All of our gear is going down in cost, and we are dropping prices, and networking is going the wrong way. That is a super-big problem, and I like to look out a few years, and I am seeing that the size of the networking problem is getting worse constantly. At the same time that networking is going anti-Moore, the ratio of networking to compute is going up."
The timing is interesting. That was after AWS had embraced the merchant silicon for switch and routing ASICs from Broadcom, and it was six months before Avago, a semiconductor conglomerate run by Hock Tan, one of the richest people in the IT sector, shelled out a whopping $37 billion to buy semiconductor maker Broadcom and to take its name.
You don't build the world's largest e-commerce company out of the world's largest online bookseller, and then create an IT division spinout that becomes the world's largest IT infrastructure supplier, by being a wimp, and Jeff Bezos is certainly not that. And neither is Tan, by all indications. And that's why we think, looking at this from outside of a black box, AWS and the new Broadcom have been pushing and shoving for quite some time. And this is probably equally true of all of the hyperscalers and big cloud builders. Which is why we saw the rise of Fulcrum Microsystems and Mellanox Technologies from 2009 forward (Fulcrum was eaten by Intel in 2011 and Mellanox by Nvidia in 2020), and then the next wave of merchant chip suppliers such as Barefoot Networks (bought by Intel in 2019), XPliant (bought by Cavium in 2014, which was bought by Marvell in 2018), Innovium (founded by people from Broadcom and Cavium), Xsight Labs, and Nephos. And of course, now Cisco Systems is trying to make up to them all by having its Silicon One ASICs available as merchant silicon.
Tan buys companies to extract profits, and he did not hesitate to sell off the Vulcan Arm server processors that Broadcom had under development to Cavium, which was eaten by Marvell and which last year shut down its own Triton ThunderX3 chip because the hyperscaler and cloud builder customers it was counting on are going to build their own Arm server chips. And with the old Broadcom having basically created the modern switch ASIC merchant silicon market with its Trident and Tomahawk ASICs, the new Broadcom, we speculate, wanted to price its ASICs more aggressively than the smaller old Broadcom would have felt comfortable doing. The new Broadcom has a bigger share of wallet at these hyperscalers and cloud builders, many of whom have other devices they build that need lots of silicon. So there is a kind of détente between buyer and seller.
"We're not going to hurt each other, are we?" Something like that.
We also have to believe all of this competition has directly or indirectly hurt the Broadcom switch and router ASIC business. And hence we also believe Tan has asked the hyperscalers and cloud builders to pay more for their ASICs than they would like. And they have more options than they have had in the past, but change is always difficult and risky.
We don't know which switch ASICs the hyperscalers and cloud vendors use, but we have to assume that all of these companies have tried out their homegrown network operating systems on each and every one of them as they tape out and get to first silicon. They pick and choose what to roll out where in their networks, but the safe bet in recent years has been Broadcom Tomahawk ASICs for switching and Jericho ASICs for routing, with maybe Mellanox or Innovium or Barefoot as a testbed and negotiating tactic.
This tactic may have run its course at AWS, and if it has, the cause will be not only hard-headedness and pride, but also the success of the $350 million acquisition of Annapurna Labs back in 2015 (just when AWS was hitting a financial wall with networking, at the same time as Avago was buying Broadcom and the Tomahawk family was coming into being specifically for hyperscalers and cloud builders) in demonstrating that homegrown chips can break the hegemony of Intel in server CPUs.
So that's the landscape within which AWS may have decided to make its own network ASICs. Let's look at this from a few angles. First, economics.
What we have heard is that AWS is only spending around $200 million a year for Broadcom switch and routing ASICs. We believe the number is larger than that, and if it isn't today, it surely will be as AWS grows and its networking needs within each datacenter grow.
Let's play with some numbers. Take a typical hyperscale datacenter with 100,000 servers. We don't care if they are compute servers or storage servers; by and large, on average, there is something on the order of 200,000 CPUs in those machines. From the people we talk to who do server CPUs for a living, you need to consume somewhere between 400,000 and 500,000 servers a year (meaning 800,000 to 1 million CPUs a year) to justify the cost and trouble of designing your own chips, which will run somewhere between $50 million and $100 million per generation. This does not include the cost of fabbing these chips, packaging them up, and sending them to ODMs to build systems. AWS clearly consumes enough servers in its 25 regions and 80 availability zones (each of which has multiple datacenters at this scale).
Now, depending on the network topology, those 100,000 servers with 200,000 server chips will require somewhere between 4,000 and 6,000 switch ASICs to make a leaf/spine Clos network interlinking all of those machines. Assuming an average of two datacenters per availability zone (a reasonable guess) across those 25 regions, and an average of around 75,000 machines per datacenter (not all of the datacenters are full at any given time), that's 12 million servers and 24 million server CPUs. Depending on the topology, we are now talking about somewhere between 480,000 and 720,000 switch ASICs in the entire AWS fleet. Servers get replaced every three years, on average, but switches tend to hang on for as long as five years, sometimes longer. So that is really more like 100,000 to 144,000 switch ASICs a year. Even if it is growing at 20 percent per year, it is nothing like the server CPU volumes.
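The back-of-the-envelope math above can be reproduced in a few lines. Everything here uses the article's own assumptions (two datacenters per AZ, 75,000 machines per datacenter, 4,000 to 6,000 switch ASICs per 100,000 servers), not AWS-published figures:

```python
# Fleet-wide switch ASIC estimate, using the article's assumptions throughout.
availability_zones = 80
datacenters = availability_zones * 2     # assume ~2 datacenters per AZ
servers_per_dc = 75_000                  # not every datacenter is full
servers = datacenters * servers_per_dc   # 12,000,000 servers
cpus = servers * 2                       # ~2 CPUs per server -> 24,000,000

# Leaf/spine Clos: roughly 4,000-6,000 switch ASICs per 100,000 servers
asics_low = servers // 100_000 * 4_000   # 480,000
asics_high = servers // 100_000 * 6_000  # 720,000

# Switches live ~5 years, so annual replacement volume is comparatively modest:
per_year_low, per_year_high = asics_low // 5, asics_high // 5
print(per_year_low, per_year_high)       # roughly 96,000 to 144,000 a year
```

Even at the high end, that annual ASIC volume is two orders of magnitude below the server CPU volumes AWS consumes, which is the article's point.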
But that is only counting datacenter switching. Those numbers do not include all of the switching AWS needs elsewhere, which will be part of its Amazon Go stores and its Amazon warehouses, themselves massive operations. If the server fleet keeps growing, and these other businesses do, too, then Amazon's overall datacenter, campus, and edge switching needs could easily justify the cost and hassle of making networking chips. Add in routing, with a homegrown ASIC set whose architecture spans both switching and routing as Cisco is doing with its own Silicon One (which Cisco no doubt would love to sell to AWS, but good luck with that), and you can pretty easily justify an investment of around $100 million per generation of ASIC. (Barefoot Networks raised $225.4 million to do two generations of its Tofino ASICs, and Innovium raised $402.3 million to get three Teralynx ASICs out the door and have money to sell the stuff and work on the fourth.)
Now, let's add some technical angles. What has made Annapurna Labs so successful inside of AWS is the initial Nitro Arm processor announced in 2016, which was used to create a SmartNIC (what many in the industry are now calling a Data Processing Unit or a Data Plane Unit, depending on whom you ask, but still a DPU either way) for virtualizing storage and networking and getting these off the hypervisors on the servers. The new Nitros get damned near all of the hypervisor off the CPU now, and are more powerful. These have spawned the Graviton and Graviton2 CPUs used for raw computing, the Inferentia accelerators for machine learning inference, and the Trainium accelerators for machine learning training. We would not be surprised to see an HPC variant with big, fat vectors come out of AWS and also do double duty as an inference engine on hybrid HPC/AI workloads.
Homegrown CPUs started out in a niche and quickly spread all around the compute inside of AWS. The same could happen for networking silicon.
AWS controls its own network operating system stack for datacenter compute (we don't know its name) and can port that stack to any ASIC it feels like. It has the open source Dent network operating system in its edge and Amazon Go locations.
Importantly, AWS may look at what Nvidia has done with its Volta and Ampere GPUs and decide it needs to create a switch that speaks memory protocols to create NUMA-like clusters of its Trainium chips to run ever-larger machine learning training models. It could start embedding switches in Nitro cards, or do composable infrastructure using Ethernet switching within racks and across racks. What if every CPU that AWS made had a cheap-as-chips Ethernet switch instead of an Ethernet port?
Here is the important thing to remember. The people from Annapurna Labs who made the move over to AWS have a deep history in networking, and some of their closest colleagues are now at Xsight Labs. So maybe this talk about homegrown network ASICs is all a feint as AWS tests out ASICs from Xsight Labs to see how they compete with Broadcom's chips. Or maybe it is just a dance before AWS acquires Xsight Labs, as it did Annapurna Labs after choosing it to be its Nitro chip designer and manufacturer ahead of that acquisition. Last December, Xsight Labs announced it was sampling two switch ASICs in its X1 family: one with 25.6 Tb/sec of aggregate bandwidth that could push 32 ports at 800 Gb/sec, and a 12.8 Tb/sec one that could push 32 ports at 400 Gb/sec, using 100 Gb/sec SerDes with PAM4 encoding.
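Those headline figures are internally consistent; aggregate switch bandwidth is just port count times per-port rate, which a quick sketch confirms:

```python
def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Aggregate switch bandwidth in Tb/sec from port count and per-port rate."""
    return ports * gbps_per_port / 1_000

# Xsight Labs X1 family, as announced
print(aggregate_tbps(32, 800))  # 25.6 Tb/sec
print(aggregate_tbps(32, 400))  # 12.8 Tb/sec
```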
It would be difficult, but not impossible, to put together a network ASIC team of the caliber that AWS needs. But as we pointed out, the Annapurna Labs people are a good place to start. And we fully realize that it takes a whole different set of skills to design a packet processing engine wrapped by SerDes than it takes to design an I/O and memory hub wrapped by a bunch of cores. (But when you say it that way. . .)
A little history is in order, we think. It all starts with Galileo Technology, which was founded in 1993 by Avigdor Willenz to focus on (wait for it) developing a high-performance MIPS RISC CPU for the embedded market. The chip Galileo created ended up being used mostly in data communications gear, and was eventually augmented with designs based on PowerPC cores, which came to rule the embedded market before Arm chips booted them out. In 1996, Galileo saw an opportunity and pivoted to create the GalNet line of Ethernet switch ASICs for LANs (launched in 1997), eventually extending that with the Horizon ASICs for WANs. At the height of the dot-com boom in early 2000, Willenz cashed out and sold Galileo to Marvell for $2.7 billion.
Among the many companies that Willenz has invested in with that money and helped propel up and to the right are Habana Labs, the AI accelerator company that Intel bought for $2 billion in 2019; the above-mentioned Ethernet switch ASIC maker Xsight Labs; and Annapurna Labs, which ended up inside of AWS. Guy Koren, Erez Sheizaf, and Gal Malach, who all worked at EZchip, a DPU maker that was eaten by Mellanox to create its SmartNICs and that is now at the heart of Nvidia's DPU strategy, founded Xsight Labs. (Everybody knows everybody in the Israeli chip business.) Willenz is the link between them all, and he has a vested interest in flipping Xsight Labs just as he did Galileo Technology and Annapurna Labs (and no doubt hopes to do with distributed flash block storage maker Lightbits Labs, where Willenz is chairman and investor).
Provided the price is not too high, it seems just as likely to us that AWS will buy the Xsight Labs team as build its own team from scratch. And if not, then maybe AWS has considered buying Innovium, which is also putting 400 Gb/sec Ethernet ASICs into the field. With its last round of funding, Innovium reached unicorn status, so its $1.2 billion valuation might be a little rich for AWS's blood. A lot depends on how much traction Innovium can get selling Teralynx ASICs outside of whatever business we suspect it is already doing with AWS. Oddly enough, that last round of money may make Innovium too expensive for AWS to buy.
If you put a gun to our heads, we think AWS is definitely going to do its own network ASICs. It is just a matter of time, for economic reasons that include the company's desire to vertically integrate core elements of its stack. This may or may not be the time, despite all the rumors going around. Then again, everything just gets more expensive with time and scale. Whatever is going on, we suspect we will hear about custom network ASICs at some point at re:Invent, perhaps even this fall.
IFI Techsolutions Has Earned the Windows Server and SQL Server Migration to Microsoft Azure Advanced Specialization – PR Newswire India
MUMBAI, India, April 8, 2021 /PRNewswire/ -- IFI Techsolutions (www.ifi.tech), today announced it has earned the Windows Server and SQL Server Migration to Microsoft Azure advanced specialization, a validation of a solution partner's deep knowledge, extensive experience and expertise in migrating Windows Server and SQL Server-based workloads to Azure.
Only partners that meet stringent criteria around customer success and staff skilling, as well as pass a third-party audit of their migration practices, are able to earn the Windows Server and SQL Server Migration to Azure advanced specialization.
As companies look to modernize their applications and take full advantage of the benefits that cloud computing can deliver, and with the recent end-of-support for Windows Server 2008 R2 and SQL Server 2008 R2, they are looking for a partner with advanced skills to assess, plan, and migrate their existing workloads to the cloud.
Speaking about this recent achievement, Puneet Bajaj, Partner - IFI Techsolutions, said, "At IFI Tech we have developed deep expertise in datacentre migration & transformation to Microsoft Azure cloud by providing solutions to some large enterprises like L&T Group, Hiranandani Financial Services and Reliance Communications. Our experience in assisting 200+ customers globally to migrate more than 3000 Microsoft Windows Server & SQL Servers to Azure has led to our achievement of this advanced specialisation. We would like to thank Microsoft for this program that will not only help partners like us showcase our capabilities, but also help customers recognise IFI Tech as an experienced and trusted partner."
Gavriella Schuster, Corporate Vice President, One Commercial Partner (OCP) at Microsoft Corp., added, "The Windows Server and SQL Server Migration to Microsoft Azure advanced specialization highlights the partners who can be viewed as most capable when it comes to migrating Windows-based workloads over to Azure. IFI Techsolutions clearly demonstrated that they have both the skills and the experience to offer clients a path to successful migration so that they can start enjoying the benefits of being in the cloud."
About IFI Techsolutions:
IFI Techsolutions (www.ifi.tech) is a cloud consulting and managed services firm, and a 2020 Microsoft Partner of the Year finalist, founded by ex-Microsoft employees Ankur Garg and Puneet Bajaj. IFI Techsolutions has delivered over 370 projects and 45,000 consulting hours, migrated 4,100-plus servers for more than 260 global customers, and has a presence in India, the US, the UK, Australia, and the UAE.
For More Details Contact: Shravani Bhalerao, [emailprotected], +91-8898857355, Digital Marketing Manager, IFI Tech
SOURCE IFI Techsolutions Private Limited
Europe High Speed Connector Market Forecast to 2027 – COVID-19 Impact and Regional Analysis By Product and Application – Yahoo Finance
The Europe high speed connector market is expected to grow from US$543.47 million in 2019 to US$910.30 million by 2027; it is estimated to grow at a CAGR of 6.8% from 2020 to 2027. Growth in the networking and communication sector is expected to accelerate the growth of the Europe high speed connector market.
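As a sanity check on those forecast numbers, the implied compound annual growth rate can be recomputed from the endpoints. Taking the 2019 base over eight years gives roughly 6.7%, consistent with the report's quoted 6.8% (which is stated from a 2020 base):

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, as a fraction."""
    return (end / start) ** (1 / years) - 1

implied = cagr(543.47, 910.30, 2027 - 2019)
print(f"{implied:.1%}")  # ~6.7%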
New York, April 07, 2021 (GLOBE NEWSWIRE) -- Reportlinker.com announces the release of the report "Europe High Speed Connector Market Forecast to 2027 - COVID-19 Impact and Regional Analysis By Product and Application" - https://www.reportlinker.com/p06027795/?utm_source=GNW The high speed connector has an increasing demand in the networking and communication sector to offer enhanced connectivity, reliability, and high speed transfer.
Advancements in connector technology are improving device performance as well as offering better space-utilization solutions. Market players across Europe are developing high speed end connectors for SerDes (Serializer/Deserializer) applications such as supercomputing and high speed networking.
For instance, in April 2019, Fairview Microwave Inc., a brand of Infinite Electronics, introduced a new series of high speed PCB (printed circuit board) connectors supporting high data rates and VSWR as low as 1.10:1. The newly introduced high-performance end launch connectors are ideal for supercomputing, cloud servers, and high speed networking. Rising demand for high speed PCB connections in supercomputers and networking applications, and increasing consumption of high speed networking devices and computers, are propelling the growth of the Europe high speed connector market. The increasing adoption of consumer electronics such as digital cameras, printers, TVs, and gaming consoles, and of connected devices such as laptops, tablets, PCs, and smartphones, requires high speed connectivity solutions. To meet these requirements, advanced high speed connectors are being introduced in the market to facilitate faster, more efficient, and more compact interfaces. The demand for high speed connector devices across Europe is increasing with the rising consumption of computers, networking devices, cloud servers, and advanced electronics, which would propel the Europe high speed connector market in the coming years. Countries in Europe, especially the UK and Russia, are adversely affected by the COVID-19 pandemic. European electronic component manufacturers are unable to supply their products in adequate quantities to their customers.
This is majorly due to the significant disruption in the supply chain of raw materials. The electronic component and high speed connector manufacturers are witnessing two or more weeks of delay in receiving raw materials, which is pressuring them to slow down the production pace.
This factor is hindering revenue generation. The UK and France have numerous electronic component manufacturers, while Russia has a significantly large number of equipment manufacturing sectors, ranging from communication equipment manufacturers and aerospace & defense contractors to automotive OEMs.
The emergence of the COVID-19 outbreak is hampering the businesses of European equipment manufacturers as well as the high speed connector market. Additionally, the UK, France, Germany, and Italy are experiencing the second wave of the outbreak, which is again compelling manufacturers to restrict their production unit workforces, resulting in a deceleration in high speed connector demand. Based on application, the aerospace & defense segment led the Europe high speed connector market in 2019. Communication is of the utmost importance in the aerospace & defense sector, forming a seamless integration of systems on the plane and on the ground.
The connectors are developed to meet various standards, such as VITA and VPX protocols, among others, to ensure connectivity even in critical situations and harsh environments. Increasing advancements in the aerospace industry, such as lightweight aircraft, long-range connecting devices, and unmanned aerial vehicles (UAVs), are pushing the boundaries of connectivity techniques and creating demand for advanced high speed connector systems.
Digitalization, remote control, and other advanced technologies are taking control of maneuvering operations, which is creating demand for high speed data transfer solutions. Hence, the aerospace & defense segment leads the market owing to continuous demand for high speed connectivity solutions, which would drive the Europe high speed connector market in the coming years. The overall Europe high speed connector market size has been derived using both primary and secondary sources. To begin the research process, exhaustive secondary research was conducted using internal and external sources to obtain qualitative and quantitative information related to the market.
The process also serves the purpose of obtaining an overview and forecast for the Europe high speed connector market with respect to all the segments pertaining to the region. Also, multiple primary interviews have been conducted with industry participants and commentators to validate the data, as well as to gain more analytical insights into the topic.
The participants of this process include industry experts such as VPs, business development managers, high speed connector market intelligence managers, and national sales managers, along with external consultants such as valuation experts, research analysts, and key opinion leaders specializing in the Europe high speed connector market. Fujitsu Limited; Hirose Electric Co., Ltd.; IMS Connector Systems GmbH; Molex, LLC; OMRON Corporation; SAMTEC, Inc.; TE Connectivity Ltd.; and Yamaichi Electronics Co., Ltd. are among the players operating in the market. Read the full report: https://www.reportlinker.com/p06027795/?utm_source=GNW
About Reportlinker: ReportLinker is an award-winning market research solution. Reportlinker finds and organizes the latest industry data so you get all the market research you need, instantly, in one place.
Bruce Moxon, Sr. HPC Architect, Microsoft Azure Global
Dynamic infrastructure, DevOps, and Infrastructure-as-code (IaC), terms historically at odds with High Performance Computing, are finding life in today's Cloud. And the implications can be transformational for organizations and for applications long thought to be firmly embedded in on-premises infrastructure.
Efforts that, in the past, required a long process of collective requirements gathering, architectural compromises, and five-year capital budget commitments to procure a multi-departmental compute cluster can now be addressed more effectively and quickly and in a manner better suited to the needs of individual departments or projects.
Some of the recent Cloud trends contributing to this more agile, DevOps-oriented approach to HPC include:
Timely access to state-of-the-art server infrastructure. High core counts, large high-bandwidth memory configurations, and the latest GPU and FPGA accelerator architectures are now available to fuel the most aggressive scientific, technical, and AI applications. Microsoft's recent announcement of its Azure HBv3 virtual machines, based on the AMD EPYC 7003 series CPU, ushers in a new era of first-day availability of such hardware.
Integrated High Performance Networking. These high performance VMs are configurable with high-performance, low-latency networking such as NVIDIA Mellanox HDR 200 Gb/s InfiniBand within proximity placement groups that ensure consistent, minimal latency for tightly coupled (MPI) parallel HPC applications, typically used in large-scale sensor processing or physical simulation, including seismic processing, structural analysis, and computational fluid dynamics.
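To see why per-message latency, not just bandwidth, dominates tightly coupled MPI codes, the toy sketch below measures round-trip time between two local processes over a pipe. It uses only the Python standard library and has nothing to do with InfiniBand or Azure; real MPI-over-InfiniBand round trips are orders of magnitude faster, which is precisely what co-locating nodes in a proximity placement group preserves.

```python
# Toy round-trip latency measurement between two local processes.
# Tightly coupled MPI codes exchange many small messages per timestep,
# so their runtime is dominated by this kind of per-message latency.
import time
from multiprocessing import Process, Pipe

def echo(conn, n):
    # Child process: echo n messages back to the sender.
    for _ in range(n):
        conn.send(conn.recv())

def mean_rtt_us(n=1000):
    """Mean round-trip time, in microseconds, over n ping-pong exchanges."""
    parent, child = Pipe()
    p = Process(target=echo, args=(child, n))
    p.start()
    t0 = time.perf_counter()
    for _ in range(n):
        parent.send(b"x")
        parent.recv()
    elapsed = time.perf_counter() - t0
    p.join()
    return elapsed / n * 1e6

if __name__ == "__main__":
    print(f"mean round-trip: {mean_rtt_us():.1f} us")
```

Scaling a halo-exchange or allreduce-heavy code across nodes multiplies this cost by the number of communication steps, which is why inconsistent placement shows up directly in wall-clock time.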
High Performance Storage. High-performance managed storage offerings, such as Cray ClusterStor in Azure and Azure NetApp Files, are readily available in many geographies. And software-defined storage solutions built on high-performance SSD-based VMs, together with storage acceleration and caching solutions like Microsoft Azure HPC Cache, can be deployed nearly instantly in front of large, cost-effective object storage in a model reminiscent of what LANL's Gary Grider first termed campaign storage.
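The caching pattern behind a fast tier fronting cheap object storage can be sketched in a few lines. Everything below is a hypothetical stand-in (a plain dict plays the backing object store, and the LRU policy is a simplification); it illustrates the read-through idea only, not the Azure HPC Cache API.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Toy read-through cache: serve hot objects from a small fast tier,
    falling back to (and populating from) a slow backing store."""

    def __init__(self, backing, capacity=3):
        self.backing = backing        # dict-like slow store (object-storage stand-in)
        self.capacity = capacity
        self.cache = OrderedDict()    # LRU order: least recently used first
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)      # mark as recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]            # slow fetch from backing store
        self.cache[key] = value
        if len(self.cache) > self.capacity:  # evict least recently used
            self.cache.popitem(last=False)
        return value
```

A compute job that re-reads its working set repeatedly pays the slow-tier cost once and then runs at fast-tier speed, which is the economic argument for caching in front of campaign-style storage.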
Software. HPC-ready software stacks in the Cloud include optimized VM images with the latest MPI and GPU libraries, NVIDIA Mellanox InfiniBand drivers, and high performance storage clients. And many of the traditional HPC commercial packages now support pay-for-use SaaS licensing models, including ANSYS Cloud on Azure Marketplace.
Job Scheduling and Dynamic Clusters. Both traditional HPC job schedulers (e.g., Slurm, LSF, PBS) and cloud-native schedulers like Azure Batch are readily available and can be used in conjunction with dynamically scaled clusters managed by Azure CycleCloud and Azure VM Scale Sets, providing access to thousands of cores nearly instantly, and only for the duration of the job. And loosely coupled application architectures can further leverage spot-instance pricing to stretch development dollars or trade cost for time-to-results.
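The spot-versus-on-demand trade-off is, at heart, expected-value arithmetic. The sketch below uses made-up prices and a deliberately pessimistic retry model (the whole job reruns from scratch after each eviction); real spot discounts and eviction behavior vary by region and SKU, and checkpointing changes the math substantially.

```python
def expected_spot_cost(on_demand_rate, discount, job_hours, eviction_prob):
    """Expected cost of a job on spot capacity, assuming the whole job is
    retried from scratch after each eviction (a pessimistic model).

    eviction_prob is the probability an attempt is evicted before finishing,
    so the expected number of attempts is 1 / (1 - eviction_prob).
    """
    spot_rate = on_demand_rate * (1 - discount)
    expected_attempts = 1 / (1 - eviction_prob)
    return spot_rate * job_hours * expected_attempts

# Hypothetical numbers: $3.60/hr on demand, an 80% spot discount,
# a 4-hour job, and a 20% chance of eviction per attempt.
spot_cost = expected_spot_cost(3.60, 0.80, 4, 0.20)  # 0.72 * 4 * 1.25 = 3.60
on_demand_cost = 3.60 * 4                            # 14.40
```

Even with the retry penalty, the spot run costs a quarter of the on-demand run here, which is why loosely coupled, restartable workloads are the natural fit for spot capacity.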
Dynamic Deployment and Orchestration (IaC). Both cloud-specific deployment tools (e.g., Azure Resource Manager) and open source, cross-cloud infrastructure-as-code approaches, including HashiCorp Terraform and Ansible, are increasingly being used to dynamically instantiate HPC infrastructure when needed, to scale it for production runs, and to relinquish it when no longer needed.
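The core IaC idea, a declarative spec that tooling renders into deployment parameters, can be illustrated without any cloud dependency. The cluster name, VM size, and field names below are all hypothetical; the `{"value": ...}` shape merely echoes the style of ARM template parameter files, and this is not a real ARM or Terraform workflow.

```python
import json

# A declarative cluster spec of the kind an IaC tool consumes.
# All names and sizes here are hypothetical, for illustration only.
cluster_spec = {
    "name": "cfd-run-42",
    "vm_size": "Standard_HB120rs_v3",
    "node_count": 16,
    "scheduler": "slurm",
}

def render_parameters(spec):
    """Render a spec into a parameter document of the kind fed to
    deployment tooling (ARM parameter files use a similar shape)."""
    return {key: {"value": value} for key, value in spec.items()}

params_json = json.dumps(render_parameters(cluster_spec), indent=2)
```

Because the spec is plain data, it can be version-controlled, code-reviewed, and re-rendered to scale the same cluster up for a production run and tear it down afterwards, which is the DevOps loop the article describes.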
Azure Global's Specialized Workloads Customer Enablement team develops and publishes reference architectures and associated deployment scripts as examples of HPC-on-demand infrastructure. These are available in GitHub repositories and can be used to seed customer POCs or serve as a foundation for HPC DevOps initiatives. The team works closely with customers and ISV partners to accelerate development projects, and with Azure Global Engineering to drive additional product needs.
The first of these is available at http://github.com/Azure/azurehpc. More application-specific scenarios are also in development, as is an extensible HPC on-demand platform.
Together, these developments are ushering in a new era of high performance computing with the speed and agility of the Cloud. A well-planned, cloud-native HPC strategy complements traditional on-premises HPC investments. It leverages a DevOps model to improve access to the right infrastructure at the right time in the development cycle, to opportunistically incorporate rapid technology advances, and to accelerate innovation and improve time-to-results.
Bruce Moxon is a Senior HPC Architect in the Specialized Workloads Customer Enablement group, where he works with customers in deploying dynamic HPC infrastructure for applications in Finance and Life Sciences.
More performance and choice with new Azure HBv3 virtual machines for HPC | Azure Blog and Updates | Microsoft Azure
Azure HPC Cache | Microsoft Azure
CycleCloud HPC Cluster & Workload Management | Microsoft Azure
Batch Compute job scheduling service | Microsoft Azure
Ansys Cloud (microsoft.com)
Read the original:
HPC DevOps: Powered by the Cloud - insideHPC