The need for data governance policies and procedures that help protect sensitive data has been magnified recently by two events: the European Union's General Data Protection Regulation (GDPR) taking effect in May 2018, and the disclosure that the data of 87 million Facebook users was improperly shared with analytics firm Cambridge Analytica during the 2016 U.S. presidential campaign.
Protecting sensitive data is becoming increasingly important as more organizations combine customer data integration processes with advanced analytics applications. Even when individual source data sets don't contain protected personal data, data fusion may link records from different data sets and expose information that should be protected.
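To make the fusion risk concrete, here is a minimal sketch. The two data sets, their field names and the `fuse` helper are all hypothetical. Neither data set on its own pairs a name with a diagnosis, but joining them on shared attributes (ZIP code and birth date) produces a record that does.

```python
# Hypothetical data sets: neither one pairs a name with a diagnosis.
loyalty_members = [
    {"name": "Ann Lee", "zip": "02139", "birth_date": "1980-04-12"},
    {"name": "Raj Patel", "zip": "60614", "birth_date": "1975-09-30"},
]
deidentified_claims = [
    {"zip": "02139", "birth_date": "1980-04-12", "diagnosis": "asthma"},
    {"zip": "94110", "birth_date": "1990-01-05", "diagnosis": "diabetes"},
]


def fuse(left, right, keys):
    """Join two lists of records on the shared key fields."""
    # Index the right-hand records by their key values for fast lookup.
    index = {}
    for record in right:
        index.setdefault(tuple(record[k] for k in keys), []).append(record)

    # Merge every left-hand record with each right-hand record that matches.
    linked = []
    for record in left:
        for match in index.get(tuple(record[k] for k in keys), []):
            linked.append({**record, **match})
    return linked


linked = fuse(loyalty_members, deidentified_claims, ["zip", "birth_date"])
```

In this made-up example, `linked` holds a single record that ties a named person to a diagnosis, even though no source data set contained that pairing.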
The heightened awareness of the need to protect sensitive data has drawn particular attention from data practitioners. Many have not been properly educated on the ins and outs of data protection, so it's not surprising that there is some confusion about the terminology for the policies, processes and technologies used to guard against unauthorized exposure of an individual's personal data. The terms used most frequently -- data security, data access control and data protection -- are often presumed to have the same meaning. However, there are some governance and operational differences worth noting.
Data security refers to protecting data assets. This involves ensuring that the individuals or applications attempting to access data are who or what they say they are.
Authentication methods such as a username/password combination or biometric data -- such as a fingerprint or iris scan -- are used to verify a user's identity. Policies can be defined to include limitations on access and use when protecting sensitive data. Finally, backing up data in encrypted formats and subsequently erasing the raw, readable data can help secure it.
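As an illustration of the username/password case, the sketch below shows one common way to check a password without storing it in readable form: keep a salted hash and compare against it. It uses only Python's standard library. The function names and the iteration count are assumptions made for this example, not a recommendation from any specific standard.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a salted hash so the raw password never needs to be stored."""
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# At enrollment, only the salt and the derived hash are kept.
salt, stored = hash_password("correct horse battery staple")
```

At login, `verify_password` is called with the submitted password plus the stored salt and hash; it returns `True` only when the recomputed hash matches.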
Access control is about authorization -- defining and managing specific roles with different levels of access. Access control is used to manage which individual or application roles have access to the content of a data asset, and to specify the circumstances under which that content may be viewed and the time frame for which it's available.
In addition, access control methods can be used to log who attempts to gain access to a data set, whether access was granted and what was touched.
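The role, time frame and logging ideas above can be sketched in a few lines. Everything here is hypothetical: the `Grant` and `AccessController` classes, the role name and the column names are invented for illustration and do not correspond to any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Set


@dataclass
class Grant:
    """What a role may see, and during which time frame."""
    columns: Set[str]
    not_before: datetime
    not_after: datetime


@dataclass
class AccessController:
    grants: Dict[str, Grant]  # role name -> grant
    audit_log: List[dict] = field(default_factory=list)

    def request(self, user: str, role: str, columns: Set[str], when: datetime) -> bool:
        """Decide whether the request is allowed, and log it either way."""
        grant = self.grants.get(role)
        allowed = (
            grant is not None
            and columns <= grant.columns  # only columns the role may view
            and grant.not_before <= when <= grant.not_after  # inside the time frame
        )
        # Record who asked, what was touched and whether access was granted.
        self.audit_log.append({
            "user": user,
            "role": role,
            "columns": sorted(columns),
            "when": when.isoformat(),
            "granted": allowed,
        })
        return allowed


controller = AccessController(grants={
    "analyst": Grant({"region", "order_total"}, datetime(2024, 1, 1), datetime(2024, 12, 31)),
})
in_scope = controller.request("dana", "analyst", {"region"}, datetime(2024, 6, 1))
out_of_scope = controller.request("dana", "analyst", {"email"}, datetime(2024, 6, 1))
expired = controller.request("dana", "analyst", {"region"}, datetime(2025, 1, 1))
```

Only the first request is granted. The second asks for a column outside the role's grant and the third falls outside the time frame, but all three attempts end up in `audit_log`.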
Data protection is about preventing the exposure of sensitive content should security barriers be breached or bypassed. Examples of data protection techniques include encryption at rest, where the entire asset is transformed into a non-readable format, and data masking, where selected attribute values are transformed into a form that resembles the original but contains fake data. Encryption must be reversible -- it must be possible to transform the encrypted asset back into its original form. Masking, however, does not need to be reversible.
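The difference in reversibility can be shown with a small sketch. Note the assumptions: the XOR routine below is a toy stand-in for encryption, used only so the example runs with the standard library, and a real deployment would rely on a vetted cryptographic library instead. The `mask_email` helper and the sample values are likewise hypothetical.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte.

    Toy cipher for illustration only: applying it twice with the same
    key restores the input, which is the reversibility property.
    """
    return bytes(d ^ k for d, k in zip(data, key))


def mask_email(email: str) -> str:
    """Replace the local part with filler of the same length.

    The original characters are discarded, so this cannot be undone.
    """
    local, _, domain = email.partition("@")
    return "x" * len(local) + "@" + domain


record = b"ssn=123-45-6789"
key = secrets.token_bytes(len(record))  # random key as long as the data

ciphertext = xor_bytes(record, key)    # non-readable form
restored = xor_bytes(ciphertext, key)  # the key brings the original back

masked = mask_email("jane.doe@example.com")  # same shape, fake content
```

With the key, `restored` equals the original `record`. No key or other input can recover `jane.doe` from `masked`, which still looks like an email address and can be used where only the format matters.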
All three of these concepts are necessary for protecting sensitive data. Security defines the boundaries and creates the barriers around data assets that contain sensitive information. Access control ensures that only authorized entities are able to see protected content. Methods like masking and encryption prevent inadvertent exposure should the security barriers be breached and unauthorized users gain access to the asset.
Finally, it is important to understand the criteria for assessing data sensitivity. When focusing on personal data, we can use the guidance specified in the General Data Protection Regulation: Personal data is any information relating to an individual who can be identified by reference to a data attribute's value, such as a name, identification number, email address or residential address. Any data asset containing a data element whose value is determined to be personal data would be classified as sensitive and would require the aforementioned protections.
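The classification rule just described (an asset is sensitive if any of its data elements is personal data) can be sketched as a simple check. The attribute names in `PERSONAL_ATTRIBUTES` are hypothetical labels derived from the examples in the text. A real assessment would also need to inspect values, not just column names.

```python
# Hypothetical attribute names, based on the examples given in the text.
PERSONAL_ATTRIBUTES = {
    "name",
    "identification_number",
    "email_address",
    "residential_address",
}


def classify_asset(columns):
    """Classify a data asset by its column names.

    Returns a (label, hits) pair, where hits lists the columns that
    were recognized as personal data.
    """
    hits = sorted(PERSONAL_ATTRIBUTES.intersection(c.lower() for c in columns))
    label = "sensitive" if hits else "not sensitive"
    return label, hits


orders = classify_asset(["order_id", "Email_Address", "order_total"])
catalog = classify_asset(["sku", "unit_price"])
```

Here `orders` is labeled sensitive because of a single personal data column, while `catalog` is not. One matching element is enough to classify the whole asset.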
There are other concerns when protecting sensitive data, though. Does the asset contain attorney-client information, classified information, corporate intellectual capital or even information about the methods employed for data protection? Developing policies to assess and classify a data asset's level of sensitivity is a good data management practice to accompany the methods of protecting against unauthorized data exposure.