Methods for De-Identification
De-identification reduces the risk of linking information in the data set to specific individuals by removing personal identifiers from data.
Pseudonymization
Replaces personal identifiers with pseudonyms (artificial identifiers) so that records cannot be attributed to a specific individual without additional information.
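Limitations
– Individuals can be re-identified by anyone with access to the additional information (such as a lookup key) that is kept separately
– Indirect identifiers, and combinations of fields that together single out a person, must also be removed or masked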
Examples
– Patient 123
– UUIDs
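As a rough illustration of pseudonymization, the sketch below swaps a name field for a random UUID and returns the lookup table separately. The function name `pseudonymize`, the field names, and the sample records are all hypothetical; this is a minimal sketch, not a complete solution, and it leaves indirect identifiers untouched.

```python
import uuid

def pseudonymize(records, id_field):
    """Replace id_field in each record with a random UUID.

    Returns (pseudonymized_records, lookup). The lookup maps each pseudonym
    back to the original identifier and should be stored separately,
    because anyone holding it can re-identify the records.
    """
    forward = {}  # original identifier -> pseudonym
    lookup = {}   # pseudonym -> original identifier (the re-identification key)
    output = []
    for record in records:
        original = record[id_field]
        if original not in forward:
            pseudonym = str(uuid.uuid4())
            forward[original] = pseudonym
            lookup[pseudonym] = original
        cleaned = dict(record)  # copy so the input records are not modified
        cleaned[id_field] = forward[original]
        output.append(cleaned)
    return output, lookup

# Hypothetical sample data for illustration only.
patients = [
    {"name": "Alice Smith", "diagnosis": "asthma"},
    {"name": "Bob Jones", "diagnosis": "diabetes"},
    {"name": "Alice Smith", "diagnosis": "flu"},
]
pseudonymized, lookup = pseudonymize(patients, "name")
```

The same person always receives the same pseudonym, so records can still be linked to each other within the data set, while the real name lives only in `lookup`.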
Anonymization
Permanently and completely removes personal identifiers so that the data subject can no longer be identified directly or indirectly.
Limitations
– Data can be de-anonymized by cross-referencing with other data sources
Examples
– Segmented data used to create psychological profiles for marketing purposes
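The cross-referencing limitation noted above can be illustrated with a small linkage sketch. It assumes a data set with names removed but ZIP code, birth year, and sex retained, plus a second, public list containing those same fields alongside names. All records, field names, and the `link_records` helper are hypothetical.

```python
def link_records(anonymized, public, keys):
    """Return (name, row) pairs where exactly one public record matches on keys."""
    index = {}
    for person in public:
        combo = tuple(person[k] for k in keys)
        index.setdefault(combo, []).append(person["name"])

    matches = []
    for row in anonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((candidates[0], row))
    return matches

# Hypothetical "anonymized" data: names removed, indirect identifiers kept.
anonymized = [
    {"zip": "97201", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"zip": "97214", "birth_year": 1970, "sex": "M", "diagnosis": "diabetes"},
]
# Hypothetical public list containing the same indirect identifiers.
public = [
    {"name": "Alice Smith", "zip": "97201", "birth_year": 1985, "sex": "F"},
    {"name": "Bob Jones", "zip": "97214", "birth_year": 1970, "sex": "M"},
    {"name": "Carl Jones", "zip": "97214", "birth_year": 1970, "sex": "M"},
]
matches = link_records(anonymized, public, ["zip", "birth_year", "sex"])
```

In this made-up example the first row is re-identified because only one person shares its combination of values, while the second row stays ambiguous because two people share it.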
K-Anonymization
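Generalizes or suppresses identifying attributes until every record is indistinguishable from at least k-1 other records on those attributes. This gives a formal, measurable limit on re-identification risk while the data remain practically useful.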
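A data set is commonly called k-anonymous when every combination of indirect identifiers (often called quasi-identifiers) is shared by at least k records. The sketch below measures k for a small made-up table before and after generalizing ages into decades and truncating ZIP codes. The helper names `k_of` and `generalize`, the choice of quasi-identifiers, and the sample rows are assumptions for illustration.

```python
from collections import Counter

def k_of(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0

def generalize(row):
    """Coarsen age to a ten-year band and keep only the first three ZIP digits."""
    decade = row["age"] // 10 * 10
    return {
        "age": f"{decade}-{decade + 9}",
        "zip": row["zip"][:3] + "**",
        "condition": row["condition"],
    }

# Hypothetical sample data for illustration only.
rows = [
    {"age": 34, "zip": "97201", "condition": "asthma"},
    {"age": 37, "zip": "97205", "condition": "flu"},
    {"age": 52, "zip": "97210", "condition": "diabetes"},
    {"age": 58, "zip": "97214", "condition": "asthma"},
]
quasi = ["age", "zip"]

raw_k = k_of(rows, quasi)                                   # every row is unique: k = 1
generalized_k = k_of([generalize(r) for r in rows], quasi)  # two rows per group: k = 2
```

The generalized table reaches k = 2 only by giving up precision in age and location, which is the trade-off between privacy and usefulness that the limitations of this method describe.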
Limitations
– Works poorly on high-dimensional data sets, where so much generalization is needed that little useful detail remains
– Suppressing or generalizing identifiable records can skew results
Examples
– Used by password managers and Have I Been Pwned? to check passwords against breach data without sending the full password or its full hash
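The password-checking example can be sketched as follows. In the range-query approach used by Have I Been Pwned?, the client hashes the password with SHA-1, sends only the first five hex characters, and compares the returned suffixes locally. The code below makes no network request: the response is simulated, its counts are invented, and the `SUFFIX:COUNT` line format is assumed from the service's published range API. The function names are hypothetical.

```python
import hashlib

def split_hash(password, prefix_len=5):
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest of password."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]

def count_in_range(suffix, range_response):
    """Look up suffix in a response made of 'SUFFIX:COUNT' lines; 0 if absent."""
    for line in range_response.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0

prefix, suffix = split_hash("password")

# Simulated server response for this prefix (assumed format, invented counts).
# Only `prefix` would ever be sent; the server returns every suffix it holds
# for that prefix, so it cannot tell which one the client cares about.
simulated_response = "\n".join([
    "0" * 35 + ":3",
    suffix + ":42",
    "F" * 35 + ":1",
])
hits = count_in_range(suffix, simulated_response)
```

Because many different hashes share any given five-character prefix, the query hides the password among all of them, which is the same "hide in a group" idea behind k-anonymity.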
Non-Anonymizable Data
Certain data cannot be anonymized due to the highly unique patterns of individual biology or activity. These data are an exception to de-identification.
Limitations
– Removing names and other identifiers does not help, because the pattern in the data is itself enough to identify the individual
Examples
– Biometric data
– Location history
– Contents of chat logs
– VR scans
Do you have a question about how to de-identify data to reduce the risk of linking specific individuals to data? Do you need guidance or training on your general data security practices? Contact us for a free consultation to see if Gazelle Consulting’s customized compliance services are right for you.