Paid Feature
The cloud has a habit of transforming on-premises technologies that have existed for decades.
It absorbed business applications that used to run exclusively on local servers. It embraced the databases they relied on, presenting an alternative to costly proprietary implementations. And it has also driven new efficiencies into one of the most venerable on-premises data analytics technologies of all: the data warehouse.
Data warehousing is a huge market. Allied Market Research put it at $21.18bn in 2019, and estimates that it will more than double to $51.18bn in 2028. The projected 10.7 percent CAGR between 2020 and 2028 comes from a raw hunger for data-driven insights that we've never seen before.
It isn't as though data warehousing is a new concept. It has been around since the late eighties, when researchers began building systems that funneled operational data through to decision-making systems. They wanted that data to help strategists understand the subtle currents that made a business tick.
This product category initially targeted on-premises installations, with big iron servers capable of handling large computing workloads. Many of these systems were designed to scale up, adding more processors connected by proprietary backplanes. They were expensive to buy, complex to operate, and difficult to maintain. The upshot, AWS claims, was that companies found themselves spending a lot on these implementations and not getting enough value in return.
As companies produced more data, it became harder for these implementations to keep up. Data volumes exploded, driven not just by the increase in structured records but also by an expansion in data types. Unstructured data, ranging from social media posts to streaming IoT data, has sent storage and processing requirements soaring.
Cloud computing evolved around the same time, and AWS argues that it changed data warehousing for the better. Data warehousing has long been popular with customers in sectors like financial services and healthcare, which are heavy analytics users.
Manage data at any scale and velocity while remaining cost effective
But the cloud has opened up the concept to far more companies thanks to lower prices and better performance, according to AWS. Applications previously restricted to multinational banks and academic labs are now open to smaller businesses. For example, you're able to perform data analytics in the cloud with benefits like scale, elasticity, time to value, cost efficiency and readily available applications.
The numbers bear this out. According to Research and Markets, the global market for data warehouse as a service (DWaaS) products will enjoy a 21.7 percent CAGR between 2021 and 2026, growing from $1.7bn to $4.5bn.
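The growth rate quoted above follows from the standard compound annual growth rate formula. Here is a minimal sketch that uses only the DWaaS figures cited from Research and Markets. The function name `cagr` is our own, and because the published endpoints are rounded, the result lands close to the quoted 21.7 percent rather than exactly on it.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10 percent)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# DWaaS market estimate: $1.7bn in 2021 growing to $4.5bn in 2026 (five years).
dwaas_rate = cagr(1.7, 4.5, 2026 - 2021)
print(f"DWaaS CAGR 2021-2026: {dwaas_rate:.1%}")
```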
The largest cloud players have leaped on this trend, with Microsoft offering its Synapse service and Google running BigQuery. AWS announced Redshift as the first cloud data warehouse to address the market in 2012. The idea was pretty simple, AWS told us. The company wanted to give customers a scalable solution, where they could use the flexibility of the cloud to manage data at any scale and velocity while remaining cost effective.
Unlike online transaction processing (OLTP) databases like Amazon Aurora, Redshift targets online analytical processing (OLAP), offering support for fast queries thanks to scalable nodes with massively parallel processing (MPP) in a cluster. The cloud-based data warehouse follows the AWS managed database ethos. Rather than relying on a customer's administrators to take care of maintenance tasks, the company handles them behind the scenes in the cloud.
Aside from standing up hardware, this includes patching the software and handling backups and recovery. That means developers can focus on building applications, from modernizing existing data warehouse strategies through to accelerating analytics workloads. Redshift speeds up those workloads by using back-end parallel processing to spread queries over up to 128 nodes. Companies can use it for everything from analyzing global sales data to crunching through advertising impression metrics.
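The scatter-gather idea behind parallel query processing can be sketched in a few lines. This is a toy illustration, not Redshift's implementation: the "nodes" are in-memory lists, worker threads stand in for compute nodes, and the names `partition_rows`, `local_total_by_region` and `parallel_total_by_region` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (region, sale_amount)


def partition_rows(rows: List[Row], node_count: int) -> List[List[Row]]:
    """Distribute rows round-robin across node_count slices."""
    slices: List[List[Row]] = [[] for _ in range(node_count)]
    for index, row in enumerate(rows):
        slices[index % node_count].append(row)
    return slices


def local_total_by_region(rows: List[Row]) -> Counter:
    """Work one node does on its own slice: a partial SUM grouped by region."""
    totals: Counter = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def parallel_total_by_region(rows: List[Row], node_count: int = 4) -> Dict[str, int]:
    """Scatter the rows, aggregate each slice concurrently, then merge the partials."""
    slices = partition_rows(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_total_by_region, slices))
    merged: Counter = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values for matching keys
    return dict(merged)


# Made-up sample data for illustration only.
sales = [("EMEA", 120), ("APAC", 80), ("EMEA", 30), ("AMER", 200), ("APAC", 20)]
print(parallel_total_by_region(sales, node_count=3))
```

Each slice is aggregated without reference to the others, so adding nodes shortens the per-node work. Only the small partial results need to be combined at the end.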
AWS also highlights other applications that can draw on cloud-based data warehouse technology, including predictive analytics, which enables companies to mine historical data for insights that could help to chart future events. Redshift also helps customers with applications that are often time critical, AWS says. These include recommendation and personalization, and fraud detection.
Performance at the right price is key, asserts AWS, which reports that customers' latency requirements for processing and analyzing their data are shortening, with many wanting to make things almost real time.
AWS benchmarked Redshift against other market players and found price performance up to three times better than the alternatives. The system's ability to dynamically scale the number of nodes in a cluster helps here, as does its ability to access data in place from various sources across a data lake.
Traditionally, data sharing is a cumbersome process in which files are manually uploaded from one system and copied to another. This approach, AWS says, does not provide complete and up-to-date views of the data, because the manual steps introduce delays, human error and data inconsistencies, resulting in stale data and poor decisions.
In response to feedback from customers who wanted to share data at many levels to enable broad and deep insights but also minimize complexity and cost, AWS has introduced a capability that overcomes this issue.
Announced late last year, Amazon Redshift data sharing enables you to avoid making copies of data. The new capability enables customers to query live data at their convenience, and get up-to-date views across organizations, customers and partners as the data is updated. In addition, Redshift integrates with AWS Data Exchange, enabling customers to easily find and subscribe to third-party data in AWS Data Exchange without extracting, transforming and loading it.
Amazon Redshift data sharing is already proving a hit with AWS customers, who are finding new use cases such as data marketplaces and workload isolation.
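On the SQL side, sharing is set up with a handful of statements. The producer cluster creates a datashare, adds objects to it and grants usage to a consumer namespace. The consumer then creates a database from that share and queries it in place. Here is a sketch of the sequence, held as Python strings so it could be passed to any SQL client. The share, schema, table and database names and the namespace placeholders are hypothetical, and the exact options should be checked against the Redshift documentation.

```python
from typing import List


def producer_statements(consumer_namespace: str) -> List[str]:
    """SQL a producer cluster would run to expose one table (hypothetical names)."""
    return [
        "CREATE DATASHARE sales_share;",
        "ALTER DATASHARE sales_share ADD SCHEMA public;",
        "ALTER DATASHARE sales_share ADD TABLE public.sales;",
        f"GRANT USAGE ON DATASHARE sales_share TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(producer_namespace: str) -> List[str]:
    """SQL a consumer cluster would run to query the shared table without copying it."""
    return [
        "CREATE DATABASE sales_db FROM DATASHARE sales_share "
        f"OF NAMESPACE '{producer_namespace}';",
        "SELECT region, SUM(amount) FROM sales_db.public.sales GROUP BY region;",
    ]


# Placeholders only: real namespaces are cluster-specific identifiers.
for statement in producer_statements("<consumer-namespace-guid>"):
    print(statement)
for statement in consumer_statements("<producer-namespace-guid>"):
    print(statement)
```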
Data lakes have evolved as companies draw in data of different types from multiple sources. When unstructured data such as machine logs, sensor data, or website clickstream data comes in, you don't know its quality or what insights you're going to find in it.
AWS told us many customers have asked for data stores where they can break free of data silos and land all of this data quickly, process it, and move it to more SLA-intensive systems for query and reporting like data warehouses and databases.
The cloud is the perfect place to put this data thanks to commodity storage. Storing data in the cloud is cheap thanks to a mixture of economies of scale on the cloud service provider side, and tiered storage that lets you put data in lower-cost tiers such as S3.
Data gravity is the other driver. A lot of data today begins in the cloud, whether it comes from social media, machine logs, or cloud-based business software. It makes little sense to move that data from the cloud to on-premises applications for processing. Instead, why not just shorten the time it takes to get insights from it, AWS asks.
The company designed the data warehouse to share information in the cloud, folding in API support for direct access. Redshift can pull in data from S3's cheap storage layer if necessary for fast, repeated processing, or it can access it in place. It also features different types of nodes optimized for storage or compute. It can interact with data in Amazon's Aurora cloud-native relational database, and other relational databases via Amazon Relational Database Service (RDS).
It also includes support for other interface types. Developers can import and export data from other data warehousing systems using open data formats like Parquet and Optimized Row Columnar (ORC). Client applications can also access the system via standard SQL, ODBC, or JDBC interfaces, making it easy to connect with business intelligence and analytics tools.
The ability to scale the storage layer separately from the compute nodes makes the system more flexible and eliminates network bottlenecks, the cloud service provider says.
Cloud databases also provide application developers with other services that they can use to enhance those insights. One of the most notable for AWS is its machine learning capability. ML algorithms are good at spotting probabilistic patterns in data, making them useful for analytics applications, but inference - the application of statistical models when processing new data - takes a lot of computing power. Scalable cloud computing power makes that easier, AWS says.
Cloud-based machine learning services are also easy for companies to consume because they are pluggable with data warehouses via application programming interfaces (APIs). AWS makes these available to anyone who knows SQL. Customers can use SQL statements to create and use machine learning models from data warehouse data using Redshift ML, a capability of Redshift that provides integration with Amazon SageMaker, a fully managed machine learning service.
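To give a sense of what creating and using a model from SQL looks like, here is a sketch of the two statements involved: a `CREATE MODEL` that trains on a query result, and an ordinary `SELECT` that calls the generated prediction function. The statements are held as Python strings for convenience. The table, column, model and function names, the IAM role ARN and the bucket name are all hypothetical placeholders, and the available settings should be confirmed in the Redshift ML documentation.

```python
def create_model_sql(iam_role_arn: str, s3_bucket: str) -> str:
    """Build a CREATE MODEL statement for a hypothetical churn table."""
    return (
        "CREATE MODEL customer_churn_model\n"
        "FROM (SELECT age, plan, monthly_minutes, churned FROM customer_activity)\n"
        "TARGET churned\n"
        "FUNCTION predict_customer_churn\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


# Once training finishes, the model is invoked like any other SQL function.
PREDICT_SQL = (
    "SELECT customer_id,\n"
    "       predict_customer_churn(age, plan, monthly_minutes) AS predicted_churn\n"
    "FROM customer_activity;"
)

# Placeholder role and bucket, for illustration only.
print(create_model_sql("arn:aws:iam::<account-id>:role/<role-name>", "<bucket-name>"))
print(PREDICT_SQL)
```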
In 2019, Amazon Redshift also introduced support for geospatial data by adding a new data type to Redshift: geometry. That supports coordinate data in table columns, making it possible to handle geospatial polygons for mapping purposes. This makes it possible to combine location information with other data types when making conventional data warehousing queries and building machine learning models for Redshift.
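To make the idea concrete, here is a plain-Python sketch of the kind of test a geospatial query performs: checking whether a coordinate falls inside a polygon. In Redshift itself this would be expressed with spatial SQL functions such as `ST_Contains`. The code below uses the textbook ray-casting method purely for illustration and says nothing about how Redshift evaluates spatial functions internally. The function name and sample coordinates are made up.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), for example (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray going right from the point crosses.

    An odd number of crossings means the point is inside. Points lying exactly
    on an edge or vertex are not treated specially in this sketch.
    """
    x, y = point
    inside = False
    vertex_count = len(polygon)
    for i in range(vertex_count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % vertex_count]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A made-up square delivery zone and two made-up locations.
delivery_zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon((2.0, 2.0), delivery_zone))
print(point_in_polygon((5.0, 2.0), delivery_zone))
```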
As data warehousing continues its move to the cloud, it shows no sign of slowing down. Customers can choose offerings from the largest cloud service providers and from third-party software vendors alike. Evaluation criteria will depend on each customer's individual strategy, but the need to scale compute and storage capabilities is sure to factor highly in any decision. One thing's for sure: the cloud will help customers as their big data gets bigger still.
This article is sponsored by AWS.
Continue reading here:
The rise of the cloud data warehouse - The Register