DLP in the Cloud

Posted under: Research and Analysis

It’s been quite a while since we’ve updated our data loss prevention (DLP) research. It’s not that DLP doesn’t continue to be an area of focus (it does). Rather, there have been a lot of other shiny things that have kept our attention of late. Yeah, like cloud. Well, it turns out a lot of organizations are using this cloud thing now, and that means they inevitably have questions about if and how their existing controls (like DLP) map into this new world.

So as we update our Understanding and Selecting DLP paper, we’d be remiss if we didn’t include a discussion about how potential leakage in cloud-based environments should be handled. Yet, let’s not get the cart ahead of the horse. First we need to define what we mean by cloud and applicable use cases for DLP.

Now we could bust out the Cloud Security Alliance guidance and hit you over the head with a bunch of cloud definitions. But suffice it to say, from a data access standpoint you are most likely dealing with:

  • SaaS: Software as a Service (SaaS) is the new back office. That means you have critical data in a SaaS environment, whether you know about it or not, and it should be protected.
  • Cloud File Storage: These offerings allow you to extend a device’s file system to the cloud, replicating and syncing between devices and facilitating the sharing of data. Yes, these services are a form of SaaS (and Platform as a Service), but given the amount of critical data in these networks and the fact they work differently than your typical SaaS application, we’ll treat these differently.
  • IaaS: Infrastructure as a Service (IaaS) is the new data center. That means many of your critical applications (and data) will be moving to a cloud service provider, most likely Amazon Web Services, Microsoft Azure, or Google Cloud Platform. And inspection of data traversing a cloud-based application is well… different, and that means protecting said data is also… different.

DLP is predicated upon scanning data at rest and inspecting and enforcing policies on data in motion, so there isn’t a lot of applicability to IaaS. Why? Because there really aren’t endpoints per se in an IaaS environment. Data will reside in either structured (like a database) or unstructured (a filesystem) datastores. Data protection for structured datastores defaults to application-centric methods, and unstructured cloud file systems are really just cloud file storage (covered later). So trying to insert DLP agents within an application stack isn’t really the most efficient or effective means of protecting that application.

Traditional network DLP approaches don’t work very well in IaaS either. You have limited visibility into the cloud network, and to inspect traffic you need to route it through an inspection point, which can negatively impact the architecture of the cloud, specifically elasticity and anywhere access. Moreover, a greater percentage of cloud network traffic is encrypted, so even with access to the network traffic, inspecting it at scale presents many implementation challenges.

Thus we’ll scope this Cloud DLP discussion around SaaS and cloud file storage.

Cloud versus Traditional Data Protection

Clearly cloud is different, but what exactly does that mean? If we boil it down to its fundamental core, you still have to perform the same functions whether the data resides in a 20-year-old mainframe or within the ether of multi-cloud and SaaS environments. In order to protect the data, you have to know where it is (discover), understand how it’s being used (monitor), and enforce policies that govern what is allowed and by whom, along with any additional security controls (protect).

When looking at Cloud DLP, a lot of users equate protection with encryption, but that’s a massive topic with a lot of complexities in SaaS. The best idea is to check out our recent research on Multi-Cloud Key Management. There will be a lot of detail in the paper, but suffice it to say that managing keys across various cloud and on-prem environments presents a significantly more complicated key management environment, and you’ll need to rely more on your provider and architect data protection and encryption directly into the cloud technology stack.

Now thinking about discovery, do you remember in the olden days – like 7 years ago – when your critical data was either in your data centers or on devices that you controlled? To be clear, it wasn’t easy to find all of your critical data, but at least you knew where to look. You could search all of your file servers and databases for critical data, profile/fingerprint that data and then look for it on your devices and egress points on the network.
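To make the profile/fingerprint idea a bit more concrete, here is a minimal sketch of one way it could work: hash every run of a few consecutive words in a protected document, then measure how many of those hashes show up in some other piece of text. The function names (`shingles`, `overlap`) and the 5-word window are our assumptions for illustration, not how any particular DLP product does it.

```python
import hashlib


def shingles(text, size=5):
    """Return SHA-256 digests for every run of `size` consecutive words."""
    words = text.lower().split()
    return {
        hashlib.sha256(" ".join(words[i:i + size]).encode("utf-8")).hexdigest()
        for i in range(max(len(words) - size + 1, 0))
    }


def overlap(fingerprint, candidate_text, size=5):
    """Fraction of the protected document's shingles found in the candidate."""
    if not fingerprint:
        return 0.0
    return len(fingerprint & shingles(candidate_text, size)) / len(fingerprint)


# Fingerprint the protected document once, then compare anything you find
# on a file server, device, or egress point against that fingerprint.
protected = "the quarterly revenue forecast for the new product line is confidential"
fingerprint = shingles(protected)
score = overlap(fingerprint, "fyi: " + protected + " so keep it quiet")
```

A score near 1.0 suggests the candidate contains most of the protected text, and a score near 0.0 suggests it doesn’t. Where you set the alerting threshold is a policy decision.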

But as critical data started moving to SaaS applications and cloud file storage (sometimes embedded within SaaS apps), controlling data loss became more challenging because the data didn’t always traverse an egress point. Ergo the emergence of Cloud Access Security Brokers (CASB) to figure out which cloud services were in use, so you’d understand (kind of) where your critical data might be. At least you had a place to look, right?

Enforcement of the data usage policies is also a bit different in that you don’t control the SaaS apps, nor do you have an inspection/enforcement point on the network where you can look for sensitive data and block it from leaving your network. You’ve consistently heard about the lack of visibility in the cloud and this is another example where the cloud really messes with how you used to do security.

So what’s the answer? It’s found in 3 letters that you should be pretty familiar with. A. P. I.

APIs Are Your Friend

That’s right, many SaaS apps and cloud file storage services provide APIs which allow you to interact with their environments and provide visibility and some measure of enforcement for your data protection policies. Basically, many DLP offerings have integrated with the leading SaaS and cloud file storage offerings and provide you with the ability to:

  1. Know when files are uploaded to the cloud and analyze them.
  2. Know who is doing what with the files.
  3. Encrypt or otherwise protect the files.

As you can see, you don’t need to see the data pass by, as long as the API reliably tells you new data has been moved into their environment. So the key in using DLP solutions in the cloud is to ensure integration with APIs available for those services.
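Here’s a rough sketch of what that API-driven flow could look like: the service tells you a file was uploaded, you pull the content, analyze it, and decide what to do. The event fields (`file_id`, `user`), the `fetch_content` callable, and the "quarantine"/"allow" actions are all hypothetical stand-ins, since every SaaS and cloud file storage API looks different. The content check (card-like numbers validated with the Luhn checksum) is just one simple example of a DLP rule.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(number):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def handle_upload_event(event, fetch_content):
    """Analyze a newly uploaded file and pick an action.

    `event` and `fetch_content` are hypothetical: a real integration would
    use whatever notification and download calls the provider exposes.
    """
    text = fetch_content(event["file_id"])
    hits = find_card_numbers(text)
    return {
        "file_id": event["file_id"],
        "user": event["user"],
        "action": "quarantine" if hits else "allow",
        "match_count": len(hits),
    }
```

Note that nothing here ever sees the data in flight. The whole thing hinges on the upload notification showing up reliably.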

So what happens when you don’t (or can’t) have APIs to provide integration with the cloud environments? You need to see the data somehow, and that’s where a Cloud Access Security Broker (CASB) comes into play.

Co-existence with CASB

CASBs have lots of functions, including providing visibility of cloud service usage within your environment. The CASB can also inspect traffic directed to these cloud services by running the traffic through a proxy. Of course, this can add some inefficiencies by routing traffic unnaturally through the proxy, but the impact will depend on the latency and response time requirements of the application. Many CASB tools can also connect to cloud providers directly over API to evaluate activity without requiring a proxy. Whether that works depends more on the cloud provider having an API with the needed capabilities than on limitations of the CASB products, which is why proxy mode is often needed.
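The API-versus-proxy choice boils down to a capability check, which might be sketched like this. The capability names in `REQUIRED_CAPABILITIES` are invented for illustration. In practice you’d build the list from what your policies need and what each provider documents.

```python
# Hypothetical capabilities a provider API would need to expose for
# API-only monitoring to be enough.
REQUIRED_CAPABILITIES = {"upload_events", "activity_log", "file_content_access"}


def choose_inspection_mode(provider_capabilities):
    """Return ("api", set()) if the provider API covers everything needed,
    otherwise ("proxy", <missing capabilities>)."""
    missing = REQUIRED_CAPABILITIES - set(provider_capabilities)
    if not missing:
        return "api", set()
    return "proxy", missing


# A provider that only exposes an activity log pushes you toward proxy mode.
mode, missing = choose_inspection_mode(["activity_log"])
```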

Given the CASB vendors are inspecting traffic, they can claim to provide DLP-like functions for traffic they see heading for cloud environments. Obviously DLP on your CASB is not going to provide any visibility or enforcement for on-prem data. Thus your decision point involves determining whether you want a consistent policy across both on-prem and cloud environments, or whether separate solutions to monitor the content are sufficient.

There isn’t a right or wrong answer for this decision. It really depends on whether the policies you implement on your internal networks map to the data moving to SaaS and cloud file storage.

Consistency in Workflows

Once an alert triggers, where the data resides doesn’t impact the processes your internal folks use to verify the potential leak and then to assess the damage. Thus any workflow you have in place to handle data leakage should be extensible to wherever the data resides. Of course, the tools to perform these processes will be different, and your access to the potentially compromised systems is radically different, meaning you have no access to the SaaS systems involved. Either way, once you have a verified leak it’s time for your incident response process to kick in.
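One way to keep the workflow consistent is to normalize alerts from every source into a single record before anyone triages them. Here’s a minimal sketch. The raw field names (`path`, `username`, `rule`, `service`, `file_id`, `actor`, `policy_name`) and the wording of the steps are assumptions for illustration, since each tool emits its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakAlert:
    source: str    # "on_prem" or "cloud"
    location: str  # where the content was found
    user: str
    policy: str


def from_on_prem(raw):
    """Map a (hypothetical) on-prem DLP alert into the common record."""
    return LeakAlert("on_prem", raw["path"], raw["username"], raw["rule"])


def from_cloud(raw):
    """Map a (hypothetical) cloud DLP or CASB alert into the common record."""
    location = f'{raw["service"]}:{raw["file_id"]}'
    return LeakAlert("cloud", location, raw["actor"], raw["policy_name"])


def verification_steps(alert):
    """Same process everywhere; only the evidence-gathering step differs."""
    steps = ["verify the potential leak", "assess the damage"]
    if alert.source == "cloud":
        # Illustrative: you can't touch the SaaS systems themselves.
        steps.append("gather evidence through the provider's API or logs")
    else:
        steps.append("gather evidence from the affected system")
    return steps
```

The first two steps are identical regardless of where the data lives. Only the last one changes, and that’s where your tooling and access differ.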

So yes, trying to prevent data leaks in SaaS and cloud file storage can be very challenging. That being said, as with most things cloud, the first place to start is to revisit your processes and the technologies in place to see whether your existing environment can handle the cloud.

Yet the one thing we know is that there will be more cloud in use tomorrow than there is today, so the sooner you get your arms around protecting your content – regardless of where it resides – the better it is for your organization.

– Mike Rothman