5 Ways to Improve Data Management in the Cloud – ITPro Today

Managing data can be challenging in any environment. But data management in the cloud is especially difficult, given the unique security, cost and performance issues at play. With that reality in mind, here are some tips to help IT teams optimize cloud data management and strike the right balance among the various competing priorities that shape data in public, private or hybrid cloud environments.

Before delving into best practices for cloud data management, let's briefly discuss why managing data in the cloud can be particularly challenging. The main reasons come down to the security, cost and performance issues noted above.

Those are the problems. Now, let's look at five ways to tackle them.

A basic best practice for striking the right balance between cloud storage costs and performance is to use data storage tiers. Most public cloud providers offer different storage tiers (or classes, as they are called on some clouds) for at least their object storage services.

The higher-cost tiers offer instant access to data. With lower-cost tiers, you may have to wait some amount of time (anywhere from minutes to hours) to access your data. Data that doesn't require frequent or quick access, then, can be stored much more cheaply using lower-cost tiers.
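As a rough illustration of that trade-off, the Python sketch below picks the cheapest tier whose retrieval delay is still acceptable for a given data set. The tier names, prices and delays are placeholder assumptions, not any provider's actual figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    price_per_gb_month: float       # hypothetical storage price in USD
    retrieval_delay_minutes: float  # hypothetical wait before data is readable


# Placeholder tiers; real names, prices and delays vary by provider.
TIERS = [
    Tier("hot", 0.023, 0),
    Tier("cool", 0.010, 5),
    Tier("archive", 0.002, 720),
]


def cheapest_tier(max_wait_minutes: float) -> Tier:
    """Return the lowest-priced tier whose retrieval delay is acceptable."""
    eligible = [t for t in TIERS if t.retrieval_delay_minutes <= max_wait_minutes]
    if not eligible:
        raise ValueError("no tier meets the access-time requirement")
    return min(eligible, key=lambda t: t.price_per_gb_month)


if __name__ == "__main__":
    # Acceptable wait times: none, one hour, one day.
    for wait in (0, 60, 24 * 60):
        print(f"max wait {wait} min -> {cheapest_tier(wait).name}")
```

A real decision would also weigh retrieval fees and minimum storage durations, which this sketch leaves out.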

For many teams, object storage services like Amazon S3 or Azure Blob Storage are the default solution for storing data in the cloud. These services let you upload data in any form and retrieve it quickly. You don't have to worry about structuring the data in a particular way or configuring a database.

The downside of cloud object storage is that you usually have to pay fees to interact with the data. For instance, if you want to list the contents of your storage bucket or copy a file, you'll pay a fee for each request. The request fees are very small (fractions of a penny), but they can add up if you are constantly accessing or modifying object storage data.

You don't typically have to pay special request fees to perform data operations on other types of cloud storage services, like block storage or cloud databases. Thus, from a cost optimization perspective, it may be worth forgoing the convenience of object storage in order to save money.
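To see how tiny per-request fees can accumulate, here is a minimal back-of-the-envelope calculation in Python. The request rate and the price per 1,000 requests are illustrative assumptions; check your provider's current price list before relying on any figure.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_request_cost(requests_per_second: float,
                         price_per_1000_requests: float) -> float:
    """Estimate the monthly request fees for a steady request rate."""
    total_requests = requests_per_second * SECONDS_PER_MONTH
    return total_requests / 1000 * price_per_1000_requests


if __name__ == "__main__":
    # Hypothetical workload: 50 requests per second at $0.005 per 1,000 requests.
    cost = monthly_request_cost(requests_per_second=50,
                                price_per_1000_requests=0.005)
    print(f"Estimated request fees: ${cost:.2f} per month")
```

Under those assumed numbers, the workload generates roughly 130 million requests a month, which is why a workload that constantly touches its data may be cheaper on a storage type without per-request billing.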

One of the key security challenges that teams face when managing cloud data is the risk that they don't actually know where all of their sensitive data is within cloud environments. It can be easy to upload files containing personally identifiable information or other types of private data into the cloud and lose track of them (especially if your cloud environment is shared by a number of users within your organization, each doing their own thing with few governance policies to manage operations).

Cloud data loss prevention (DLP) tools address this problem by automatically scanning cloud storage for sensitive data. Public cloud vendors offer such tools, including Google Cloud DLP and Amazon Macie. There are also third-party DLP tools, like Open Raven, that can work within public cloud environments.

Cloud DLP won't guarantee that your cloud data is stored securely, since DLP tools can overlook sensitive information, but it goes a long way toward helping you find data that is stored in an insecure way before the bad guys discover it.
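The basic idea behind this kind of scanning can be sketched in a few lines of Python. This is a toy illustration only and not how Google Cloud DLP, Amazon Macie or Open Raven actually work. The two patterns and the in-memory `objects` dictionary (standing in for files in a bucket) are assumptions made for the example.

```python
import re
from typing import Dict

# Deliberately simple patterns; real DLP tools use far richer detectors.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text: str) -> Dict[str, int]:
    """Count matches per pattern, returning only patterns with hits."""
    counts = {}
    for label, pattern in PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[label] = hits
    return counts


def scan_objects(objects: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Map each object name to its findings, skipping clean objects."""
    report = {}
    for name, body in objects.items():
        findings = scan_text(body)
        if findings:
            report[name] = findings
    return report


if __name__ == "__main__":
    # Hypothetical bucket contents.
    objects = {
        "exports/customers.csv": "jane@example.com,123-45-6789",
        "notes/readme.txt": "nothing sensitive here",
    }
    print(scan_objects(objects))
```

Simple pattern matching like this produces both false positives and misses, which is one reason even dedicated DLP tools can overlook sensitive information.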

Data egress, which means the movement of data out of a public cloud environment, is the bane of cloud data cost and performance optimization. The more egress you have, the more you'll pay, because cloud providers bill for every gigabyte of data that moves out of their clouds. Egress also leads to poorer performance due to the time it takes to move data out of the cloud via the Internet.

To address these issues, make minimizing data egress a key priority when designing your cloud architecture. Don't treat egress costs and performance degradations as inevitable; instead, figure out how to store data as close as possible to the applications that process it or the users who consume it.
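One way to reason about this at design time is to list the data flows in a proposed architecture and total only those that leave the cloud. The Python sketch below compares two hypothetical designs; the flow volumes and the per-gigabyte egress price are assumptions for illustration.

```python
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    description: str
    gb_per_month: float
    leaves_cloud: bool  # True if the data exits the provider's network


def monthly_egress_cost(flows: Iterable[Flow], price_per_gb: float) -> float:
    """Sum the cost of the flows that are billed as egress."""
    billable_gb = sum(f.gb_per_month for f in flows if f.leaves_cloud)
    return billable_gb * price_per_gb


if __name__ == "__main__":
    price = 0.09  # hypothetical USD per GB of egress

    # Design A: raw data is pulled to an on-premises analytics system.
    design_a = [Flow("raw data to on-prem analytics", 2000, True)]

    # Design B: analytics runs next to the data; only reports leave the cloud.
    design_b = [
        Flow("raw data to in-cloud analytics", 2000, False),
        Flow("summary reports to users", 20, True),
    ]

    for name, flows in (("A", design_a), ("B", design_b)):
        print(f"Design {name}: ${monthly_egress_cost(flows, price):.2f} per month")
```

The sketch treats all in-cloud traffic as free, which is a simplification; transfers between regions or zones inside the same cloud may also carry charges.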

In addition to allowing you to store data, all of the major clouds now also let you process it using a variety of managed data analytics services, such as Amazon OpenSearch Service and Azure Data Lake Analytics.

If you want to analyze your data without having to move it out of the cloud (and pay those nasty egress fees), these services may come in handy. However, you'll typically have to pay for the services themselves, which can cost a lot depending on how much data you process. There may also be data privacy issues to consider when analyzing sensitive cloud data using a third-party service.

As an alternative, you can consider installing your own self-managed data analytics platform in a public cloud, using open source tools like the ELK Stack. That way, you can avoid egress by keeping data in the cloud, without having to pay for a third-party managed service. (You'll pay for the cloud infrastructure that hosts the platform, but that is likely to cost much less than a managed data analytics service.)
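A quick way to sanity-check that choice is to estimate the data volume at which a self-managed platform becomes cheaper. The Python sketch below assumes, purely for illustration, that the managed service is billed per gigabyte processed and that the self-managed platform has a flat monthly infrastructure cost; both prices are hypothetical, and staff time for running the platform is ignored.

```python
def break_even_gb(self_managed_monthly_cost: float,
                  managed_price_per_gb: float) -> float:
    """Monthly volume at which both options cost the same."""
    if managed_price_per_gb <= 0:
        raise ValueError("managed_price_per_gb must be positive")
    return self_managed_monthly_cost / managed_price_per_gb


def cheaper_option(gb_per_month: float,
                   self_managed_monthly_cost: float,
                   managed_price_per_gb: float) -> str:
    """Return which option is cheaper at the given monthly volume."""
    managed_cost = gb_per_month * managed_price_per_gb
    if managed_cost <= self_managed_monthly_cost:
        return "managed"
    return "self-managed"


if __name__ == "__main__":
    # Hypothetical prices: $400 per month of infrastructure vs. $0.50 per GB.
    print("Break-even volume (GB):", break_even_gb(400, 0.5))
    for volume in (100, 2000):
        print(volume, "GB per month ->", cheaper_option(volume, 400, 0.5))
```

Real pricing models are rarely this simple, so treat the result as a starting point for a closer comparison.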

The bottom line here: Managed cloud data analytics services can be useful tools, but deploy them wisely.

Like many other things, data management is just harder when you have to do it in the cloud. The good news is that, by being strategic about which cloud storage services you use, how you manage data in the cloud and how you factor data management into your cloud architecture, you can avoid the cost, performance and security pitfalls of cloud data.
