
A friend of mine working in the payment industry recently asked me a seemingly simple question: “Are de-identification, pseudonymisation, and tokenisation the same?”
It’s a fair question. Even experienced privacy professionals often struggle to distinguish between these terms. Yet these techniques form the foundation of compliance under India’s Digital Personal Data Protection Act, 2023 (DPDPA), the EU’s GDPR, and PCI DSS: three very different frameworks with overlapping language but distinct intent.
Having implemented privacy frameworks across large fintech operations, I have seen firsthand how these conceptual confusions create real-world compliance challenges for legal, audit, and technical teams.
The Core Difference: Concept vs. Technique
Before diving into definitions, it’s important to separate concepts from techniques.
- De-identification and pseudonymisation are privacy concepts: they describe a desired state of reduced identifiability.
- Tokenisation, on the other hand, is a technical method: a tool that can help achieve that state in certain contexts.
In simpler terms:
De-identification and pseudonymisation = what you want to achieve
Tokenisation, masking, or encryption = how you achieve it
GDPR: Pseudonymisation as a Privacy Safeguard
The General Data Protection Regulation (GDPR) does not make pseudonymisation mandatory. However, it strongly encourages it as a technical and organisational safeguard to reduce risks during personal data processing.
What GDPR Says
- Article 4(5) defines pseudonymisation as the processing of personal data in such a manner that the data can no longer be attributed to a specific data subject without additional information kept separately and securely.
- Recital 28 promotes pseudonymisation as a useful security measure.
- Recital 26 clarifies that pseudonymised data remains personal data if re-identification is possible.
What It Means
Pseudonymisation replaces identifiers with reversible codes or pseudonyms. The link to the real identity is stored separately, allowing re-identification under controlled conditions.
This means pseudonymisation:
- Reduces risk, but does not eliminate it.
- Supports compliance, but does not remove obligations under GDPR.
- Is a privacy-by-design safeguard, not a data anonymisation tool.
In short, pseudonymisation is about enabling controlled processing without losing data utility, a delicate balance between privacy and functionality.
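The reversible replacement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Pseudonymiser` class, the `P-` prefix, and the sample record are hypothetical, and the in-memory dictionaries stand in for the separately and securely kept "additional information" that Article 4(5) refers to.

```python
import secrets


class Pseudonymiser:
    """Hypothetical helper that swaps identifiers for random pseudonyms."""

    def __init__(self):
        # The lookup tables are the "additional information". In a real
        # deployment they would sit in a separate, access-controlled store,
        # not alongside the pseudonymised data.
        self._pseudonym_by_id = {}
        self._id_by_pseudonym = {}

    def pseudonymise(self, identifier: str) -> str:
        # Reuse the same pseudonym for the same identifier so that records
        # can still be linked and analysed (data utility is preserved).
        if identifier not in self._pseudonym_by_id:
            pseudonym = "P-" + secrets.token_hex(8)
            self._pseudonym_by_id[identifier] = pseudonym
            self._id_by_pseudonym[pseudonym] = identifier
        return self._pseudonym_by_id[identifier]

    def reidentify(self, pseudonym: str) -> str:
        # Re-identification is only possible with access to the lookup table.
        return self._id_by_pseudonym[pseudonym]


p = Pseudonymiser()
record = {"email": "asha@example.com", "purchase_total": 1250}

# The working copy keeps its analytical value but no longer carries the email.
working_copy = {**record, "email": p.pseudonymise(record["email"])}
```

Because `reidentify` can restore the original value, the working copy in this sketch would still count as personal data under Recital 26: the risk is reduced, not removed.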
DPDPA: De-identification as a Compliance Safeguard
India’s Digital Personal Data Protection Act, 2023 takes a broader and more practical approach. It replaces the European term “pseudonymisation” with “de-identification.”
What DPDPA Says
Section 2(h) defines de-identification as:
“The process by which a data fiduciary or data processor may remove or mask identifiers from personal data, or replace them with such other fictitious name or code that is unique to an individual but does not, on its own, directly identify the data principal.”
What It Means
DPDPA’s de-identification concept is more flexible and inclusive: it covers several techniques, including masking, tokenisation, suppression, generalisation, and data perturbation.
However, DPDPA does not explicitly say that de-identified data ceases to be personal data. Unless data is irreversibly anonymised, it remains subject to the Act.
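To make the techniques listed above more concrete, here is a minimal Python sketch of masking, suppression, generalisation, and perturbation applied to one record. The function names, field names, and parameters (last four digits visible, ten-year age bands, a spread of 50) are illustrative assumptions, not anything prescribed by the Act.

```python
import random


def mask(value: str, visible: int = 4) -> str:
    """Masking: hide everything except the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def suppress(record: dict, fields: set) -> dict:
    """Suppression: return a copy of the record without the given fields."""
    return {k: v for k, v in record.items() if k not in fields}


def generalise_age(age: int, width: int = 10) -> str:
    """Generalisation: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def perturb(amount: float, spread: float, rng: random.Random) -> float:
    """Perturbation: add bounded random noise to a numeric value."""
    return round(amount + rng.uniform(-spread, spread), 2)


record = {"name": "Asha Rao", "phone": "9876543210", "age": 34, "spend": 1250.0}
rng = random.Random(0)  # seeded only so the sketch is repeatable

deidentified = suppress(record, {"name"})
deidentified["phone"] = mask(deidentified["phone"])
deidentified["age"] = generalise_age(deidentified["age"])
deidentified["spend"] = perturb(deidentified["spend"], 50.0, rng)
```

None of these steps is irreversible anonymisation on its own. A masked phone number combined with an age band may still single out a person in a small dataset, which is why such output would remain subject to the Act.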
Why It Matters
DPDPA places the burden on data fiduciaries to use reasonable safeguards, and de-identification is one of them. Unlike GDPR, which treats pseudonymisation as a “best practice,” India’s law integrates de-identification into the compliance framework itself, bridging technical control with legal obligation.
PCI DSS: Tokenisation as a Security Control
The Payment Card Industry Data Security Standard (PCI DSS) approaches the issue from a purely security perspective, not a privacy one.
What PCI DSS Says
Tokenisation is defined as:
“A process by which the primary account number (PAN) is replaced with a surrogate value (token).”
What It Means
Tokenisation removes sensitive payment data from operational systems and replaces it with random tokens that have no intrinsic meaning or exploitable value. The mapping between tokens and original data is securely maintained in a token vault.
If the tokens cannot be reversed without the vault, they are generally not treated as cardholder data, which can take systems that store only tokens out of PCI DSS compliance scope, provided those systems have no access to the vault.
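The vault model described above can be sketched as follows. This is a simplified illustration, not a PCI DSS compliant implementation: the `TokenVault` class and the `tok_` prefix are hypothetical, the PAN is a widely used test number, and a real vault would be an isolated, encrypted, access-controlled service rather than an in-memory dictionary.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory token vault, for illustration only."""

    def __init__(self):
        self._pan_by_token = {}
        self._token_by_pan = {}

    def tokenise(self, pan: str) -> str:
        # Return the existing token if this PAN has been seen before.
        if pan in self._token_by_pan:
            return self._token_by_pan[pan]
        # The token is random, so it has no mathematical relationship to
        # the PAN and cannot be reversed without the vault's mapping.
        token = "tok_" + secrets.token_urlsafe(16)
        while token in self._pan_by_token:  # guard against a rare collision
            token = "tok_" + secrets.token_urlsafe(16)
        self._pan_by_token[token] = pan
        self._token_by_pan[pan] = token
        return token

    def detokenise(self, token: str) -> str:
        # Only callers with vault access can recover the original PAN.
        return self._pan_by_token[token]


vault = TokenVault()
pan = "4111111111111111"  # well-known test card number, not a real account
token = vault.tokenise(pan)

# Downstream systems store and pass around only the token.
order = {"order_id": 1001, "card": token}
```

The design choice that matters is the random token: unlike encryption, there is no key that turns the token back into the PAN, so an attacker who steals `order` learns nothing without also compromising the vault.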
Beyond Payment Data
While tokenisation began as a payment security technique, it is now used across sectors to protect any sensitive value, from customer identifiers to digital assets.
When applied to personal data, tokenisation can serve as a de-identification technique under DPDPA, though its core purpose remains security, not privacy.
Conclusion
While the DPDPA mandates de-identification as a legal safeguard, it provides limited clarity on its technical execution. The GDPR, on the other hand, defines pseudonymisation clearly but still classifies such data as personal. PCI DSS recognises tokenisation as a way to secure payment data but does not address privacy at all.
This regulatory divergence creates operational confusion. Privacy teams are often unsure whether their tokenisation solutions meet de-identification standards, or whether reversible pseudonymisation aligns with India’s expectations. The lack of unified guidance makes compliance a matter of interpretation rather than certainty.
Despite their shared objective of protecting data, the DPDPA, GDPR, and PCI DSS adopt distinct approaches: one mandates de-identification, another classifies pseudonymised data as personal, and the third relies on tokenisation to protect sensitive financial data. This divergence leaves organisations uncertain about compliance expectations and technical adequacy. Unless India issues detailed guidance clarifying these overlaps, privacy teams will continue to face operational ambiguity even while striving for full compliance.
