The need for information management and data classification to comply with GDPR

With the new General Data Protection Regulation (GDPR) coming into effect in May 2018, companies based in Europe, or holding personal data on people residing in Europe, are struggling to locate the most valuable assets in their organization: their sensitive data.

The new regulation requires organizations to prevent breaches of personally identifiable information (PII) and to erase all of an individual’s data on request. After removing the PII, companies must be able to prove to that person and to the authorities that it has been completely removed.

Most companies today understand their obligation to demonstrate accountability and compliance, and have therefore begun preparing for the new regulation.

There is so much information available about ways to protect sensitive data that it is easy to get overwhelmed and start aiming in different directions in the hope of hitting the target. If you plan your data management ahead, you can still meet the deadline and avoid penalties.

Some organizations, mostly banks, insurance companies and manufacturers, hold huge amounts of data: they produce it at an accelerating pace by changing, saving and sharing files, creating terabytes and even petabytes of data. The problem for these companies is finding their sensitive data among millions of files, in both structured and unstructured data, which in most cases is unfortunately an impossible task.

The following personal identification data is classified as PII under the definition used by the National Institute of Standards and Technology (NIST):

o Full name

o Home address

o Email address

o National identification number

o Passport number

o IP address (when linked, but not PII by itself in the US)

o Vehicle registration plate number

o Driver’s license number

o Face, fingerprint or handwriting

o Credit card numbers

o Digital identity

o Date of birth

o Birthplace

o Genetic information

o Phone number

o Login name, screen name, nickname or handle
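A few of the PII types listed above have a recognizable shape, so a first-pass detector can be sketched with regular expressions. The sketch below is only an illustration under simple assumptions: the patterns, the function names `find_pii` and `luhn_valid`, and the choice of three types are all hypothetical, and real detection needs locale-aware rules and much stricter validation.

```python
import re

# Illustrative patterns only; they are deliberately loose and not production-grade.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Double every second digit, counting from the right.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict[str, list[str]]:
    """Return candidate PII values found in `text`, grouped by type."""
    cards = [m.group().strip() for m in CARD_RE.finditer(text)]
    return {
        "email": EMAIL_RE.findall(text),
        "ip_address": IPV4_RE.findall(text),
        # Keep only digit runs that pass the checksum, to cut false positives.
        "credit_card": [c for c in cards if luhn_valid(c)],
    }
```

The Luhn check is there because a bare digit pattern would also match order numbers and similar identifiers; the checksum filters most of those out.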

Most organizations that hold the PII of European citizens are required to detect and protect against any breach of that PII and to delete it from company data on request (often called the right to be forgotten). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016, published in the Official Journal of the European Union, states:

“Regulators should monitor the application of the provisions of this Regulation and contribute to its consistent application throughout the Union, in order to protect natural persons in relation to the processing of their personal data and to facilitate the free flow of personal data within the internal market.”

For companies holding the PII of European citizens to move that PII freely within the European market, they must be able to identify their data and categorize it by sensitivity level according to their organizational policy.

The regulation describes the flow of data and the market challenges as follows:

“Rapid technological development and globalization have brought new challenges to the protection of personal data. The scope of collecting and sharing personal data has increased significantly. Technology allows both private companies and public authorities to use personal data on an unprecedented scale in order to execute their activities. Individuals are increasingly making personal information available publicly and globally. Technology has transformed both the economy and social life and should further facilitate the free flow of personal data in the Union and the transfer to third countries and international organizations while ensuring a high level of protection of personal data.”

Phase 1 – Registration of data

The first step is to create a data inventory that shows where PII is scattered across the organization and helps decision makers discover specific data types. The EU recommends using automated technology that can handle large amounts of data by scanning it automatically. No matter how large your team is, this is not a project that can be managed by hand when faced with millions of files of different types hidden in different places: in the cloud, in storage and on desktops.
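To make the idea of an automated scan concrete, here is a minimal sketch that walks a directory tree and records which files contain email-like strings. It assumes small, readable text files; the function name `build_inventory`, the record fields and the single email pattern are hypothetical, and cloud storage, databases and image or binary formats would each need their own connectors and extractors.

```python
import re
from pathlib import Path

# Deliberately simple pattern, used here only to have something to detect.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_inventory(root: str) -> list[dict]:
    """Walk `root` and list every file that contains an email-like string."""
    inventory = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Assumption: content is text; undecodable bytes are dropped.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        hits = EMAIL_RE.findall(text)
        if hits:
            inventory.append(
                {"path": str(path), "pii_type": "email", "count": len(hits)}
            )
    return inventory
```

Each record says where a PII type was found and how often, which is the minimum a decision maker needs to see where sensitive data accumulates.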

The main concern for these organizations is that if they are unable to prevent data breaches, they will not comply with the EU GDPR and may face heavy penalties.

They need to appoint specific staff to be responsible for the entire process, such as a Data Protection Officer (DPO), who mainly handles the technology solutions; a Chief Information Governance Officer (CIGO), who is usually a compliance officer; and/or a Compliance Risk Officer (CRO). This person must be able to control the entire process from end to end and to give management and the authorities full transparency.

“The controller shall pay particular attention to the nature of the personal data, the purpose and duration of the proposed processing operation or operations, as well as the situation in the country of origin, the third country and the country of final destination, and should provide appropriate safeguards to protect the fundamental rights and freedoms of natural persons regarding the processing of their personal data.”

PII can be found in all types of files: not only in PDFs and text documents, but also in image documents (for example, a scanned check), in a CAD/CAM file that may contain a product’s intellectual property, in a confidential sketch, in code or binary files, and so on. Common technologies today can extract data from text files, which makes data hidden in text easy to find, but in some organizations, such as manufacturers, most of the sensitive data sits in image files. Without technology capable of detecting PII in file formats other than text, these file types cannot be accurately scanned, and this important information can easily be missed and cause significant damage to the organization.

Phase 2 – Data categorization

This phase consists of data mining performed behind the scenes by an automated system. The DPO, controller or information security decision maker must decide whether to track certain data, block it or send alerts about a data breach. To take these actions, they must be able to view the data in separate categories.

Categorizing structured and unstructured data requires complete identification of the data while maintaining scalability – efficient scanning of all databases without “boiling the sea”.

The DPO also needs to maintain visibility of data across multiple sources and to quickly retrieve all files related to a particular person using specific identifiers such as name, date of birth, credit card number, social security number, telephone number, email address, mailing address, etc.
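One way to picture this kind of per-person lookup is an inverted index that maps each identifier to the files in which it was found. The sketch below is a simplified illustration: the class name `SubjectIndex`, its methods and the normalization rule are assumptions made for the example, not a description of any particular product.

```python
from collections import defaultdict


class SubjectIndex:
    """Inverted index from a normalized identifier to the files that contain it."""

    def __init__(self) -> None:
        self._files = defaultdict(set)

    @staticmethod
    def _normalize(value: str) -> str:
        # Case-fold and drop whitespace so "Jane@Example.com" matches "jane@example.com".
        return "".join(value.split()).casefold()

    def add(self, identifier: str, path: str) -> None:
        """Record that `identifier` was found in the file at `path`."""
        self._files[self._normalize(identifier)].add(path)

    def files_for(self, *identifiers: str) -> list[str]:
        """Return every file linked to any of the given identifiers."""
        found = set()
        for identifier in identifiers:
            found |= self._files.get(self._normalize(identifier), set())
        return sorted(found)
```

Querying with several identifiers at once (email plus phone number, for example) returns the union of matching files, which is what an erasure or access request for one person requires.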

In the case of a data breach, the DPO reports directly to the highest management level of the controller or processor, or to the Information Security Officer, who is responsible for reporting the breach to the relevant authorities.

Article 33 of the EU GDPR requires the breach to be reported to the supervisory authority without undue delay and, where feasible, within 72 hours of the controller becoming aware of it.
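The 72-hour window is simple date arithmetic, but it is worth automating so that nobody has to count hours during an incident. This is a minimal sketch; the function names are hypothetical, and it assumes the clock starts at the moment the organization became aware of the breach.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest notification time: 72 hours after the breach became known."""
    if aware_at.tzinfo is None:
        # Naive timestamps are ambiguous across offices in different time zones.
        raise ValueError("use a timezone-aware timestamp")
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once it has passed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600
```

For example, a breach discovered at 09:00 UTC on 25 May 2018 would have to be reported by 09:00 UTC on 28 May 2018.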

Once the DPO has identified the data, the next step should be to label / tag the files according to the sensitivity level defined by the organization.

As part of regulatory compliance, the organization’s files must be accurately labeled so that they can be tracked on premises and even when shared outside the organization.
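Labeling can be thought of as mapping the PII types detected in a file to the strictest level that any of them demands. The levels and the policy table below are invented for illustration; every organization defines its own.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical policy: the label each detected PII type demands.
POLICY = {
    "email": Sensitivity.INTERNAL,
    "phone": Sensitivity.INTERNAL,
    "date_of_birth": Sensitivity.CONFIDENTIAL,
    "passport_number": Sensitivity.RESTRICTED,
    "credit_card": Sensitivity.RESTRICTED,
}


def label_for(pii_types: set[str]) -> Sensitivity:
    """Return the highest sensitivity level required by any detected type."""
    # Types missing from the policy default to CONFIDENTIAL so they are not under-protected.
    levels = [POLICY.get(t, Sensitivity.CONFIDENTIAL) for t in pii_types]
    return max(levels, default=Sensitivity.PUBLIC)
```

Taking the maximum means a file with one credit card number and a hundred email addresses is still treated as restricted, which is the cautious choice.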

Phase 3 – Knowledge

Once the data is tagged, personal information can be mapped across networks and systems, both structured and unstructured, and easily traced. This enables organizations to protect their sensitive data, lets end users use and share files safely, and thus improves data loss prevention.

Another aspect to consider is protecting sensitive information from insider threats: employees who try to steal sensitive data such as credit card numbers or contact lists, or who manipulate data for their own benefit. Such actions are difficult to detect in time without automated tracking.

These time-consuming tasks apply to most organizations and push them to look for effective ways to gain insight into their business data, so that they can base their decisions on it.

The ability to analyze inherent data patterns gives the organization a clearer view of its business data and helps it pinpoint specific threats.

Integrating an encryption technology allows the controller to track and monitor data efficiently. By also implementing an internal physical segregation system, the controller can geo-fence data using personal data definitions and cross-geography / cross-domain rules, with violation reports generated whenever a sharing rule is broken. With this combination of technologies, the controller enables employees to send messages safely across the organization, between the right departments and outside the organization, without being over-blocked.
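A geo-fencing rule of this kind can be reduced to a lookup: given a file’s label and the destination region, is the transfer allowed? The sketch below assumes a made-up rule table and made-up region names; the `Transfer` type and `check_transfer` function are hypothetical and exist only to show the shape of such a check.

```python
from dataclasses import dataclass

# Hypothetical rule set: the destination regions each label may reach.
ALLOWED_REGIONS = {
    "public": {"EU", "US", "APAC"},
    "internal": {"EU", "US"},
    "restricted": {"EU"},
}


@dataclass(frozen=True)
class Transfer:
    file_label: str
    source_region: str  # kept for the violation report, not used in the decision
    destination_region: str


def check_transfer(t: Transfer) -> tuple[bool, str]:
    """Return (allowed, reason); a denied transfer would feed a violation report."""
    allowed = ALLOWED_REGIONS.get(t.file_label)
    if allowed is None:
        # Unlabeled or unknown data is blocked rather than waved through.
        return False, f"unknown label {t.file_label!r}: blocked by default"
    if t.destination_region in allowed:
        return True, "ok"
    return False, f"{t.file_label} data may not leave for {t.destination_region}"
```

Because only restricted data is fenced tightly, everyday sharing of public and internal files goes through, which is how this approach avoids over-blocking.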

Phase 4 – Artificial Intelligence (AI)

After the data has been scanned, labeled and tracked, the organization gains even more value from the ability to automatically detect suspicious behavior involving sensitive data and trigger safeguards that prevent such events from developing into a data breach. This advanced technology is known as “Artificial Intelligence” (AI). Here, the AI feature usually consists of a strong pattern recognition component and a learning mechanism that enable the machine to make these decisions, or at least to recommend a preferred course of action to the Data Protection Officer. Its intelligence is measured by its ability to become wiser with each scan, user input or change in the data map. Eventually, the AI feature builds the organization’s digital footprint, which becomes the essential layer between the raw data and the business flows around data protection, compliance and data management.
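As a much simpler stand-in for the pattern recognition described above, the sketch below flags a user whose access to sensitive files on a given day is far above their own historical baseline. It is a plain statistical check, not a learning system; the function name, the threshold of three standard deviations and the minimum of five days of history are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag `today` if it is more than `threshold` standard deviations above the baseline."""
    if len(history) < 5:
        return False  # too little history to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: treat any increase as worth a look.
        return today > baseline
    return (today - baseline) / spread > threshold
```

A real system would look at many more signals (time of day, file labels, destination), but the principle is the same: learn what is normal for each user and raise a flag when behavior departs from it.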