In this third post in our three-part series, I’ll focus on two bedrock components of the European Union’s General Data Protection Regulation (GDPR): data mapping and data inventories. Previously, we covered the core pillars of Obsidian’s data guardianship and why we position ourselves as data guardians, not privacy watchdogs.
In Articles 30 and 35, the GDPR requires that companies keep records of their processing activities and, where processing is likely to pose a high risk to individuals, prepare a data protection impact assessment. Many privacy experts, including ours, have adopted the best practice of conducting data inventories and diagramming data flows throughout an organization’s data architecture. This work is neither as novel nor as headline-grabbing as GDPR elements like the right to be forgotten, but data mapping and ongoing inventory updating are the cornerstones of any robust practice of data guardianship. It would be impossible to fulfill a request to delete all of the data held about a particular subject without knowing 1) what data are held and 2) where the data are held. Knowing which data are entering, exiting, being shared, analyzed, deleted, stored indefinitely, or stored temporarily, for which purposes, and under which security protections gives organizations the footing to establish a clear 10,000-foot view over their entire data ecosystems.
In security, we are already well aware of how difficult it is for organizations to establish and maintain an omniscient view of their IT landscape as they combat the messy, entropic application sprawl that accelerates as organizations move into the cloud. Obligatory plug for our product: Obsidian is designed to get account and privilege entropy under control.
In other words, creating a detailed data flow map and taking a data inventory are key practices of data guardianship. The two, the data map and the data inventory, work together as living documents that require careful tending to keep pace with infrastructure changes. The best versions of these documents are built by cross-functional teams of senior engineers, privacy officers, and security officers. From a security standpoint, the data map can depict adherence to the security best practices outlined in Article 32: where data are encrypted in transit and at rest, where error logging tools help ensure availability and timely resolution of any access gaps, and where any automatic security processes are running.
For GDPR-native companies, those born in 2017 or later, when it was already clear that the GDPR would require attention to data protection, the work required to comply is more straightforward than it is for older companies. It’s easier to draw the architectural plan for an unbuilt building than to reverse engineer a plan from an existing one. In new companies, teams are smaller, infrastructures are simpler, and good data guardianship habits fight fewer old-habit headwinds.
At Obsidian, a data inventory and data map update is triggered whenever:
1. Something changes in:
a. the Obsidian product infrastructure
b. this website or elsewhere in our marketing operations
c. our HR or operations functions (Obsidian offers similarly robust data guardianship to our employees, too.)
2. Something changes outside Obsidian:
a. We are alert to legal changes in the territories where our current customers operate.
b. We monitor changes in the privacy policies and terms of service of the third parties integrated in our data ecosystem.
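One way to notice that an internal change calls for an update is to compare the components drawn on the data map with the components actually deployed. The sketch below is only an illustration of that idea, not a description of Obsidian's tooling; the function names and the component names (web_app, events_db, and so on) are hypothetical.

```python
from typing import Dict, Iterable, List


def map_drift(mapped: Iterable[str], deployed: Iterable[str]) -> Dict[str, List[str]]:
    """Compare component names on the data map with those actually deployed."""
    mapped_set, deployed_set = set(mapped), set(deployed)
    return {
        # Running in production but not yet drawn on the map.
        "missing_from_map": sorted(deployed_set - mapped_set),
        # Still on the map but no longer deployed.
        "stale_in_map": sorted(mapped_set - deployed_set),
    }


def update_needed(drift: Dict[str, List[str]]) -> bool:
    """An update is due if either list of differences is non-empty."""
    return any(drift.values())


# Hypothetical component names, for illustration only.
mapped_components = ["web_app", "events_db", "crm"]
deployed_components = ["web_app", "events_db", "billing_service"]

drift = map_drift(mapped_components, deployed_components)
print(drift)
print("update needed:", update_needed(drift))
```

A check like this only catches structural drift. Legal changes and third-party policy changes still need a person watching for them.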
The GDPR creates an imperative for young companies with big ambitions like Obsidian to adopt a mature privacy posture. Being born at roughly the same time as this powerful piece of legislation allowed us to adopt the everyday practices that enable a culture of data guardianship across our organization from our earliest days.
Fun with References and Vocabulary:
French sociologist on documentation
Most of the data inventory tools out there pitch the inventory process as a spreadsheet game. We have emphasized data mapping in this post. But why? What we described above goes beyond what the GDPR requires!
The logical flow diagram of data architecture is required in a Data Protection Impact Assessment, though with a lot less detail than we use at Obsidian. We’ve discovered that the data map, not the data inventory, plays a key sociological role. The data map is what French science and technology scholar Bruno Latour calls an “inscription device”, a device used to record elements of processes into documents. The work of inscribing makes it easier for the knowledge that has been baked into a process to be captured and accessed by other groups. Data guardianship relies on inscription (and its reverse: transcription) to allow information to move effectively between engineers, data protection officers, legal counsel, privacy officers, customers, and data subjects. In our experience, a single-page data map performs this task better than the sprawling spreadsheet containing our data inventory does.
Data inventory: a systematic list of all types of data controlled and/or processed by a company or sub-group of a company, typically including the type of entity to which the data pertain (individuals, companies, bots, countries, etc.), the legal purpose for having and processing the data, the data retention period, where the data are held and analyzed, and the category of data. When taking an inventory of data held by others, such as employee data, it is also good practice to list the contact information for the data protection officer.
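To make the definition concrete, here is a minimal sketch of one inventory row as a Python record, with one field per attribute named above. The field names, the sample rows, and the systems_holding helper are all assumptions made for illustration; a real inventory would follow whatever schema your privacy team agrees on.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class InventoryEntry:
    data_type: str                     # e.g. "email address"
    subject_entity: str                # individual, company, bot, country, ...
    legal_purpose: str                 # legal purpose for holding and processing
    retention_days: Optional[int]      # None means no retention limit is set
    held_in: Tuple[str, ...]           # systems where the data are stored
    analyzed_in: Tuple[str, ...]       # systems where the data are analyzed
    category: str                      # category of data
    dpo_contact: Optional[str] = None  # data protection officer, if held by others


def systems_holding(inventory: Iterable[InventoryEntry], subject_entity: str) -> List[str]:
    """List every system that stores or analyzes data about one kind of entity."""
    systems = set()
    for entry in inventory:
        if entry.subject_entity == subject_entity:
            systems.update(entry.held_in)
            systems.update(entry.analyzed_in)
    return sorted(systems)


# Hypothetical rows, for illustration only.
inventory = [
    InventoryEntry("email address", "individual", "contract", 365,
                   ("crm_db",), ("analytics_warehouse",), "contact details"),
    InventoryEntry("payroll record", "individual", "legal obligation", None,
                   ("hr_system",), (), "employment", dpo_contact="dpo@example.com"),
    InventoryEntry("company name", "company", "legitimate interest", 730,
                   ("crm_db",), (), "account details"),
]

print(systems_holding(inventory, "individual"))
```

A lookup like systems_holding answers the "where are the data held" half of a deletion request. It narrows the search to systems; it does not locate one particular person's records within them.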
Data map: a diagram that depicts all the components of data architecture for ingesting, collecting, storing, analyzing, and interacting with data. The data map also often contains information about when data are encrypted in transit and at rest, the type of authentication required to access each infrastructure component, where logging tools are deployed, and where users are able to input free-form data (if anywhere).
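The same information can be kept in a machine-readable form alongside the diagram, which makes simple checks possible. The sketch below models a data map as components plus the flows between them and reports gaps such as unencrypted storage or transit. The class names, the find_gaps helper, and the sample components are hypothetical, and the checks shown are examples rather than a complete security review.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Component:
    name: str
    encrypted_at_rest: bool
    authentication: str                    # e.g. "SSO", "password"
    has_logging: bool
    accepts_free_form_input: bool = False


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    encrypted_in_transit: bool


def find_gaps(components: Iterable[Component], flows: Iterable[Flow]) -> List[str]:
    """Report map entries that fall short of the protections the map tracks."""
    components = list(components)
    known = {c.name for c in components}
    gaps = []
    for c in components:
        if not c.encrypted_at_rest:
            gaps.append(f"{c.name}: not encrypted at rest")
        if not c.has_logging:
            gaps.append(f"{c.name}: no logging")
    for f in flows:
        label = f"{f.source} -> {f.destination}"
        # A flow that touches an unmapped component means the map is incomplete.
        for endpoint in (f.source, f.destination):
            if endpoint not in known:
                gaps.append(f"{label}: unknown component {endpoint}")
        if not f.encrypted_in_transit:
            gaps.append(f"{label}: not encrypted in transit")
    return gaps


# Hypothetical architecture, for illustration only.
components = [
    Component("web_app", True, "SSO", True, accepts_free_form_input=True),
    Component("events_db", False, "password", True),
]
flows = [
    Flow("web_app", "events_db", True),
    Flow("events_db", "report_tool", False),
]

for gap in find_gaps(components, flows):
    print(gap)
```

Keeping the structured version and the single-page diagram in step is the hard part. The diagram remains the document that people outside engineering will actually read.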