
As someone who deals with HIPAA every day, it really bothers me that Change Healthcare is there even if the data is "anonymized".


I want to know how the hell they get it to begin with. CMS releases procedure and prescription data, but it is always totals by provider, and it is released 3 years in arrears. And that is only Medicare data. How are they getting private-pay information?


I'm not sure why it would bother you. I can generate random data in the same format and you can't figure out which is real or whose data it is. Thinking about data as if it has intrinsic value by being recorded is foolish, as many find out during acquisitions.


With another dataset that is non-anonymous, you can cross-reference the anonymous data against it to increase your confidence about who each record belongs to. This happened when Netflix released anonymized user ratings. Researchers were able to deanonymize some users by using IMDb ratings.

https://www.wired.com/2007/12/why-anonymous-data-sometimes-i...
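
To make the cross-referencing idea concrete, here is a minimal sketch of a linkage attack on toy data. Everything in it is made up for illustration: the function name `link_records`, the `min_overlap` threshold, and the sample ratings are assumptions, and this is not the method the researchers actually used, just the simplest overlap-counting version of the idea.

```python
def link_records(anonymous, public, min_overlap=2):
    """Guess an identity for each anonymous ID by counting shared (item, rating) pairs.

    A match is reported only when one public profile has at least
    `min_overlap` pairs in common and strictly beats every other profile.
    """
    matches = {}
    for anon_id, anon_ratings in anonymous.items():
        best_name, best_score, runner_up = None, 0, 0
        for name, pub_ratings in public.items():
            score = len(anon_ratings & pub_ratings)  # size of the set intersection
            if score > best_score:
                best_name, runner_up, best_score = name, best_score, score
            elif score > runner_up:
                runner_up = score
        if best_score >= min_overlap and best_score > runner_up:
            matches[anon_id] = best_name
    return matches


# Hypothetical "anonymized" release: opaque IDs, real ratings.
anonymous = {
    "user_1": {("Movie A", 5), ("Movie B", 2), ("Movie C", 4)},
    "user_2": {("Movie A", 3), ("Movie D", 1)},
}

# Hypothetical public profiles with names attached.
public = {
    "alice": {("Movie A", 5), ("Movie C", 4), ("Movie E", 3)},
    "bob": {("Movie B", 2), ("Movie D", 5)},
}

print(link_records(anonymous, public))
```

In this toy case only `user_1` gets linked; `user_2` shares nothing with either public profile, so no guess is made. The point is that the "anonymous" rows are only anonymous until a second dataset with overlapping attributes shows up.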


Similar thing happened a while back when AOL released 'anonymized' search data - https://en.wikipedia.org/wiki/AOL_search_data_leak

I dug through it just out of morbid curiosity and unfortunately stumbled upon a few searches indicating that a friend privately suffered a miscarriage. This was only possible because I recognized her as the only person that I knew at the intersection of a few other search terms that were correlated to the same 'anonymized' id.

GP is assuming a naive starting point when that is rarely the case.


You're making up a mythical insecure mapping from worthless information to useful info. This is not a compelling reason to treat all anonymized data as useful or vulnerable.



