Data Anonymization Risks

This cluster discusses the difficulty, and often the impossibility, of truly anonymizing data. Commenters argue that anonymized datasets can be deanonymized by correlating them with other data sources, and cite examples such as the AOL search data leak.
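The correlation argument can be illustrated with a minimal sketch of a linkage attack. This is only a toy under stated assumptions, not a method described by any commenter: the records, the names, the choice of quasi-identifiers (`zip`, `birth_year`, `sex`), and the helpers `quasi_key` and `link` are all hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: every name and value here is invented.
# The "anonymized" release has no names but keeps quasi-identifiers.
anonymized_records = [
    {"zip": "02138", "birth_year": 1945, "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "diabetes"},
]

# A second, public dataset that carries names alongside the same fields.
public_records = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1945, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1980, "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth_year": 1990, "sex": "F"},
]

# Assumed quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def quasi_key(record):
    """Project a record onto its quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymized, public):
    """Map each name to an anonymized record when the match is unambiguous.

    A match counts only if exactly one anonymized record and exactly one
    public record share the same quasi-identifier combination.
    """
    by_key = defaultdict(list)
    for rec in anonymized:
        by_key[quasi_key(rec)].append(rec)
    public_counts = Counter(quasi_key(person) for person in public)

    matches = {}
    for person in public:
        k = quasi_key(person)
        candidates = by_key.get(k, [])
        if len(candidates) == 1 and public_counts[k] == 1:
            matches[person["name"]] = candidates[0]
    return matches


reidentified = link(anonymized_records, public_records)
for name, rec in reidentified.items():
    print(name, "->", rec["diagnosis"])  # prints: Alice Example -> hypertension
```

In this toy, one record is re-identified because its quasi-identifier combination is unique in both datasets. The two records sharing a combination stay ambiguous, which is the intuition behind grouping-based defenses; adding more correlated datasets tends to shrink such groups.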

📉 Falling 0.4x Security
Comments: 3,269
Years Active: 20
Top Authors: 5
Topic ID: #1883

Activity Over Time

2007: 3
2008: 11
2009: 23
2010: 48
2011: 56
2012: 64
2013: 106
2014: 132
2015: 101
2016: 157
2017: 248
2018: 336
2019: 382
2020: 344
2021: 268
2022: 254
2023: 312
2024: 201
2025: 216
2026: 7

Keywords

US DEA arstechnica.com OK WSOP ZIP ECG readthedocs.io cipheredtrust.com wikipedia.org anonymized data identify prod dataset fingerprints information location dump anonymous

Sample Comments

pmoriarty Mar 14, 2023

Data that has ostensibly been "anonymized" can often be deanonymized.

daenney Jan 8, 2017

How so privacy? This data is anonymised, they could do the same.

yellowapple May 31, 2020

Should be possible to anonymize the data, no?

mmanfrin May 9, 2017

You can't really anonymize data: https://en.wikipedia.org/wiki/AOL_search_data_leak

jolfdb May 23, 2019

It's impossible to anonymize; that's the point.

noir_lord Jul 12, 2018

As long as you anonymise in a way that you can't de-anonymise it should be OK.

kmlx May 23, 2019

as long as that information is anonymised (and impossible to de-anonymise), i don't see the problem...

TeMPOraL May 25, 2020

Related: there's no such thing as "anonymized data", there's only "anonymized until correlated with enough other datasets".

conradev Apr 13, 2016

Do you know to what degree the data is anonymized?

Aerroon Nov 14, 2025

There's no such thing. Anonymized data can still be used to identify someone as we've seen on numerous occasions.