Did someone tell your data to shelter in place? That wouldn't make any sense, would it? Ironically, for vast troves of valuable enterprise data, that might as well be the case, because massive, compute-bound data silos are practically everywhere in the corporate world.
Hadoop played a role in creating this scenario, because many large organizations sought to leverage the twin promises of low-cost storage and massively parallel computing for analytics. But a funny thing happened to the yellow elephant: It was largely obviated by cheap cloud storage.
Seemingly overnight, the price of cloud storage dropped so precipitously that the cost-benefit analysis of using the Hadoop Distributed File System (HDFS) on-premises for new projects turned upside down. Even the term "Hadoop" disappeared from the names of major conferences.
That's not to say there isn't valuable data in all those HDFS repositories, however. Many important initiatives used this technology in hopes of generating useful insights. But with budgets moving away from Hadoop, another strategy is required to be successful.
What about computing? Suffice it to say, cloud providers now offer robust services in this territory. And relatively recent innovations such as separating computing from storage have also played a part in paving the way for cloud-based computing to take on all manner of workloads.
So the cloud now easily eclipses most on-premises environments in all the major categories: speed, cost, ease of use, maintenance, scalability. But there are barriers to entry, or at least pathways that should be navigated carefully while making the move. Mistakes can be very costly!
But how do you get the data there? Amazon actually offers a "snow truck" that will come to your data center, load it up one forklift at a time, and haul it, old-school, to its facility. That approach can certainly work for a quick-and-relatively-dirty solution, but it ignores the magic of cloud.
As the concept of "cloud-native" gets hashed out on whiteboards in boardrooms around the business world, the reality taking shape is that a whole new generation of solutions is being born. These systems are augmented with high-powered analytics and artificial intelligence.
This new class of application is almost exclusively built on a microservices architecture with Kubernetes as the foundation. There is tremendous value to this approach, because scalability is built into its DNA. Taking advantage of this new approach requires a commitment to change.
Simply shipping your data and applications in toto to a cloud provider absolutely does not solve this challenge. In fact, it will likely result in a significant rise in total cost of ownership (TCO), thus undermining a major driver for moving to the cloud.
Another strategy involves porting specific data sets into the cloud to deploy the power of all that computation. This often involves making copies of the data. While change-data-capture can be used to keep these disparate environments in sync, there are downsides to this approach.
In the first place, CDC solutions always need to be meticulously managed. Small amounts of data drift can quickly become larger problems. This is especially problematic when the derived analytics are used for mission-critical business decisions, or customer-experience initiatives.
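To make the drift problem concrete, here is a minimal sketch of a reconciliation check between a source table and its copy. It is not taken from any particular CDC product; the function names and the sample rows are hypothetical, and a real deployment would compare far larger data sets in batches.

```python
import hashlib


def row_fingerprint(row: dict) -> str:
    """Return a stable hash of one row, independent of key order."""
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(source: dict, replica: dict) -> dict:
    """Compare two tables keyed by primary key and report mismatches."""
    missing = sorted(k for k in source if k not in replica)
    extra = sorted(k for k in replica if k not in source)
    changed = sorted(
        k for k in source
        if k in replica and row_fingerprint(source[k]) != row_fingerprint(replica[k])
    )
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical sample data: an on-prem table and its cloud copy.
on_prem = {
    1: {"customer": "Acme", "balance": 100},
    2: {"customer": "Globex", "balance": 250},
    3: {"customer": "Initech", "balance": 75},
}
cloud_copy = {
    1: {"customer": "Acme", "balance": 100},
    2: {"customer": "Globex", "balance": 200},   # stale value
    4: {"customer": "Umbrella", "balance": 10},  # never deleted in the copy
}

drift = find_drift(on_prem, cloud_copy)
print(drift)
```

Even in this tiny example, three of four keys disagree between the two environments, which is the kind of discrepancy that has to be caught before the copy feeds a business decision.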
Secondly, by going down this road, organizations risk the proliferation of even more data silos--this time in the cloud. And while cloud storage is getting cheaper, the cost of egress can creep up and throw budgets sideways; this is not good in a post-COVID world.
Remember, the standard redundancy of Hadoop was to have three copies of every datum, which is good for disaster recovery but rather taxing overall, both in terms of throughput and complexity. While moving into the new world of cloud computing, we should avoid old errors.
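The cost of triple replication is easy to put into numbers. The short sketch below assumes a replication factor of three, as described above, and a hypothetical 500 TB data lake; the function name is illustrative only.

```python
def raw_capacity_tb(logical_tb: float, replication_factor: int = 3) -> float:
    """Raw disk needed to hold `logical_tb` of data at a given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return logical_tb * replication_factor


logical_tb = 500.0  # hypothetical data lake size
raw_tb = raw_capacity_tb(logical_tb)
overhead_tb = raw_tb - logical_tb
print(f"{logical_tb:.0f} TB logical -> {raw_tb:.0f} TB raw ({overhead_tb:.0f} TB of copies)")
```

Two-thirds of the raw capacity in this scenario holds copies rather than unique data, and every additional copy made in the cloud adds to that total.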
A different approach to bridging the worlds of on-prem data centers and the growing variety of cloud computing services is offered by a company called Alluxio. From its roots at UC Berkeley's AMPLab, the company has been focused on solving this problem.
Alluxio decided to bring the data to computing in a different way. Essentially, the technology provides an in-memory cache that nestles between cloud and on-prem environments. Think of it like a new spin on data virtualization, one that leverages an array of cloud-era advances.
According to Alex Ma, director of solutions engineering at Alluxio: "We provide three key innovations around data: locality, accessibility and elasticity. This combination allows you to run hybrid cloud solutions where your data still lives in your data lake."
The key, he said, is that "you can burst to the cloud for scalable analytics and machine-learning workloads where the applications have seamless access to the data and can use it as if it were local--all without having to manually orchestrate the movement or copying of that data."
In this sense, Alluxio's approach bridges the best of both worlds: You can preserve your investments in on-prem data lakes while opening a channel to high-powered analytics in the cloud, all without the encumbrance of moving massive amounts of data here or there.
"Data locality means bringing the data to compute, whether via Spark, Presto, or TensorFlow," Ma said. In this scenario, Alluxio is installed alongside the compute framework and deploys unused resources on those servers to provide caching tiers for the data.
There are various ways to get it done, depending upon the topology of the extant information architecture. In some environments, if Presto is using a lot of memory, Alluxio can allocate SSDs on the appropriate machines for optimized caching.
If you're tying into HDFS, Presto can make the request, and Alluxio's intelligent multi-tiering then uses whatever the most efficient approach might be--spanning memory, SSD or spinning disk. It all can be optimized as Alluxio monitors data access patterns over time.
Regardless of which tools an organization uses--TensorFlow, Presto, Spark, Hive--there will be different usage patterns across CPU, GPU, TPU and RAM. In the case of RAM and available disk types, Alluxio can work with whatever resources are available.
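The idea of a cache that spans memory, SSD and spinning disk can be illustrated with a toy model. The sketch below is not Alluxio's implementation or API: the `TieredCache` class, the tier names and the simple least-recently-used policy are all assumptions chosen to show how hot data can be promoted to the fastest tier while colder data is demoted.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier cache: the fastest tier comes first, the slowest last."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity_in_items), ordered fastest to slowest.
        self.capacities = dict(tiers)
        self.order = [name for name, _ in tiers]
        self.data = {name: OrderedDict() for name in self.order}

    def _insert(self, level, key, value):
        """Place an item in a tier, demoting the least recently used item on overflow."""
        name = self.order[level]
        tier = self.data[name]
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > self.capacities[name]:
            old_key, old_value = tier.popitem(last=False)
            if level + 1 < len(self.order):
                self._insert(level + 1, old_key, old_value)
            # Items pushed out of the last tier are simply dropped.

    def get(self, key, load_from_storage):
        """Return (value, where_found); a hit in any tier promotes the item to the top."""
        for name in self.order:
            if key in self.data[name]:
                value = self.data[name].pop(key)
                self._insert(0, key, value)
                return value, name
        value = load_from_storage(key)  # cache miss: go to the underlying store
        self._insert(0, key, value)
        return value, "storage"

    def location(self, key):
        """Return the name of the tier currently holding the key, or None."""
        for name in self.order:
            if key in self.data[name]:
                return name
        return None


cache = TieredCache([("MEM", 2), ("SSD", 2), ("HDD", 4)])


def fetch(key):
    """Stand-in for a read from HDFS or object storage."""
    return f"contents of {key}"


for block in ["a", "b", "c"]:
    cache.get(block, fetch)

print(cache.location("a"))    # "a" was demoted out of memory when "c" arrived
print(cache.get("a", fetch))  # found in a lower tier, then promoted back to memory
print(cache.location("b"))    # "b" was demoted to make room for "a"
```

A production system would size tiers in bytes, track access frequency over time and handle concurrency; the point here is only the promote-and-demote pattern across tiers.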
"Spark is less memory-intensive," Ma said, "so we can allocate some memory. So you have choice to figure out what you want to allocate and where. Alluxio allows you to seamlessly access the data in the storage area, wherever it may be."
There's also the concept of a Unified Name Space. "What it allows you to do is have a storage configuration that's centrally managed," Ma said. "You're not going into Spark and Presto to set it all up; you're able to configure Alluxio once, and then Spark or Presto communicate to Alluxio."
The general idea is to create a high-speed repository of data that allows analysts to get the speed and accuracy they demand without giving in to the temptation of data silos. Think of it as a very large stepping stone to the new normal of multi-cloud enterprise computing.
"With Alluxio, we sit in the middle and offer interfaces on both sides, so we can talk to a variety of different storage layers," Ma said. "We act as a bridging layer, so you can access any of these technologies." In short, you can have your data cake and eat it too.
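One way to picture a centrally managed namespace is as a mount table that maps a single logical path tree onto several storage systems. The following sketch is a simplified illustration under that assumption, not Alluxio's actual API; the `UnifiedNamespace` class and every host, bucket and path in it are hypothetical.

```python
class UnifiedNamespace:
    """Toy mount table that maps one logical path tree onto several storage systems."""

    def __init__(self):
        self.mounts = {}  # logical prefix -> backing storage URI

    def mount(self, logical_prefix: str, storage_uri: str) -> None:
        """Register a backing store under a logical path prefix."""
        self.mounts[logical_prefix.rstrip("/")] = storage_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        """Translate a logical path to its backing URI using the longest matching mount."""
        matches = [
            prefix for prefix in self.mounts
            if logical_path == prefix or logical_path.startswith(prefix + "/")
        ]
        if not matches:
            raise KeyError(f"no mount covers {logical_path}")
        prefix = max(matches, key=len)
        return self.mounts[prefix] + logical_path[len(prefix):]


# All hosts, buckets, and paths below are hypothetical.
ns = UnifiedNamespace()
ns.mount("/lake", "hdfs://onprem-namenode:8020/warehouse")
ns.mount("/lake/archive", "s3://example-archive-bucket/cold")

print(ns.resolve("/lake/sales/2020/part-0.parquet"))
print(ns.resolve("/lake/archive/2015/part-0.parquet"))
```

The compute engines only ever see paths under `/lake`; where each path physically lives is decided once, in the mount table, rather than in every Spark or Presto configuration.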
Like any quality abstraction layer, solving the data challenge in this manner enables companies to leverage their existing investments. Data centers will have a very long tail, and cloud services will continue to evolve and improve over time. Why not get the best of both worlds?
Eric Kavanagh is CEO of The Bloor Group, a new-media analyst firm focused on enterprise technology. A career journalist with more than two decades of experience in print, broadcast and Internet media, he also hosts DM Radio and the ongoing Webcast series for the Global Association of Risk Professionals (GARP).