Ceph, the open source integrated file, block and object storage software, can support one billion objects. But can it scale to 10 billion objects and deliver good and predictable performance?
Yes, according to Russ Fellows and Mohammad Rabin of the Evaluator Group, who set up a Ceph cluster lab and, by using a huge metadata cache, scaled from zero to 10 billion 64KB objects.
In their soon-to-be published white paper commissioned by Red Hat, Massively Scalable Cloud Storage for Cloud Native Applications, they report that setting up Ceph was complex, without actually using that word: "We found that, because of the many Ceph configuration and deployment options, it is important to consult with an experienced Ceph architect prior to deployment."
The authors suggest smaller organisations with smaller needs can use Ceph reference architectures. Larger organisations with larger needs would do better to work with Red Hat or other companies with extensive experience in architecting and administering Ceph.
Analysis of unstructured data, files and objects is required to discern patterns and gain actionable insights into a business's operations and sales.
These patterns can be discovered through analytics and by developing and applying machine learning models. Very simply, the more data points in an analysis run, the better the resulting analysis or machine learning model.
It is a truism that object storage scales more easily than file storage because it has a single flat address space, whereas files exist in a file-folder structure. As the number of files and folders grows, the file access metadata also grows in size and complexity, and more so than object access metadata.
File storage is generally used for applications that need faster data access than object storage. Red Hat wants to demonstrate both the scalability of object storage in Ceph and its speed. The company has shown Ceph can scale to a billion objects and perform well at that level via metadata caching on NVMe SSDs.
However, Red Hat wants to go further and has commissioned the Evaluator Group to scale Ceph tenfold, to 10 billion objects, and see how it performed.
The Evaluator test set-up had six workload-generating clients driving six object servers. Each pair of object servers accessed, in a split/shared-nothing configuration, a Seagate JBOD containing 106 x 16TB Exos nearline disk drives, giving about 5PB of raw capacity in total across the three storage JBODs.
Each object server had dual Xeon 18-core CL-6154 processors, 384GB of DRAM, six Intel DC P4610 NVMe 7.6TB write-optimised NAND SSDs for metadata caching, and Intel memory DIMMs.
Ceph best practice recommends not exceeding 80 per cent capacity and so the system was sized to provide 4.5PB of usable Ceph capacity. Each 64KB object required about 10KB of metadata, meaning around 95TB of metadata for the total of 10 billion objects.
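The raw capacity and metadata figures quoted above can be roughly reproduced with simple arithmetic. The sketch below is only a back-of-the-envelope check, not the Evaluator Group's sizing method; it assumes decimal units (1TB = 10^12 bytes) and treats "about 10KB" as exactly 10KB, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# Assumption: decimal units (1 KB = 10**3 bytes, 1 TB = 10**12 bytes).
KB = 10**3
TB = 10**12
PB = 10**15


def raw_capacity_bytes(jbods: int, drives_per_jbod: int, drive_tb: int) -> int:
    """Total raw disk capacity across all JBODs, in bytes."""
    return jbods * drives_per_jbod * drive_tb * TB


def metadata_bytes(object_count: int, metadata_kb_per_object: int) -> int:
    """Total metadata footprint for a given object count, in bytes."""
    return object_count * metadata_kb_per_object * KB


raw = raw_capacity_bytes(jbods=3, drives_per_jbod=106, drive_tb=16)
meta = metadata_bytes(object_count=10 * 10**9, metadata_kb_per_object=10)

print(f"raw capacity: {raw / PB:.3f} PB")   # 3 x 106 x 16TB drives
print(f"metadata:     {meta / TB:.1f} TB")  # 10 billion objects x ~10KB each
```

With these round inputs the raw capacity comes to about 5.1PB and the metadata to about 100TB, in the same range as the roughly 5PB and 95TB quoted; the gap on the metadata figure is consistent with the per-object size being only approximately 10KB.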
The Evaluator Group testers ran multiple test cycles, each performing PUTs to add to the object count, then GETs and, thirdly, a mixed workload test. The performance of each successive workload was measured to show the trends as object counts and capacity both increased.
The measurements of GET (read) and PUT (write) performance showed a fairly linear pattern as the object count increased. PUT operations showed linear performance up to 8.8 billion objects, which is 80 per cent of the system's usable Ceph capacity, and then dropped off slightly. GET operations showed a dip to a lower level around 5 million objects and a more pronounced decline after the 8.8 billion objects level.
GET performance declined once the metadata cache capacity was exceeded (yellow line on chart) and the cluster's usable capacity surpassed 80 per cent of actual capacity. Once the cache's capacity was surpassed, the excess metadata had to be stored on disk drives, and accesses were consequently much slower.
Performance linearity at this level would require a larger metadata cache.
The deep scrubbing dip on the chart occurred because a Ceph parameter set for deep scrubbing, to help with data consistency, came into operation at 500 million objects. Ceph was reconfigured to stop this.
The system exhibited nearly 2GB/sec of sustained read throughput and more than 1GB/sec of sustained write throughput.
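To put those throughput numbers in terms of objects, a rough conversion helps. The sketch below assumes the figures refer to the 64KB object test and uses decimal units; both are assumptions, since the article does not state them, and `objects_per_second` is a hypothetical helper rather than anything from the white paper.

```python
def objects_per_second(throughput_gb_per_sec: float, object_size_kb: float) -> float:
    """Convert sustained throughput into an approximate object rate.

    Assumes decimal units: 1 GB = 10**9 bytes, 1 KB = 10**3 bytes.
    """
    return throughput_gb_per_sec * 10**9 / (object_size_kb * 10**3)


# Round figures standing in for "nearly 2GB/sec" reads and "more than 1GB/sec" writes.
read_rate = objects_per_second(2, 64)
write_rate = objects_per_second(1, 64)

print(f"reads:  ~{read_rate:,.0f} objects/sec")
print(f"writes: ~{write_rate:,.0f} objects/sec")
```

Under those assumptions the cluster would be serving on the order of 31,000 GETs and 15,000 PUTs of 64KB objects per second.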
The Evaluator Group also tested how Ceph performed with up to 20 million 128MB objects. In this test the metadata cache capacity was not exceeded and performance was linear for reads and near-linear for writes as the object count increased.
There is less metadata with the smaller number of objects, meaning no spillover of metadata to disk. The GET and PUT performance lines are both linearish ("deterministic" is the Evaluator Group's term), with performance of 10GB/sec for both operation types.
Suppliers like Iguazio talk about operating at the trillion-plus file level. That's extreme, but today's extremity is tomorrow's normality in this time of massive data growth. That suggests Red Hat will have to keep going further to establish and then re-establish Ceph's scalability credentials.
Next year we might see a 100 billion object test and, who knows, a trillion object test could follow some day.