Interesting article by Bruce Schneier in Wired: parts of the Netflix Prize dataset have been de-anonymized. Other public datasets have been de-anonymized before, but most of the time the cause was a sloppy anonymization process. This time it’s different. The problem is inherent in the data, so even sophisticated randomization would not have made a real difference: records in the Netflix dataset can be linked directly to content that users posted on publicly available websites. This reveals the fundamental issue: the data users leave behind, willingly or unwillingly, can be reassembled whether they like it or not.
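
To make the linking idea concrete, here is a minimal sketch of how such a match could work in principle. It is not the method from the article. The records, the names `anonymized` and `public_reviews`, the tolerances, and the scoring rule are all invented for illustration. The point is only that a handful of approximately matching ratings and dates can single out one "anonymous" record.

```python
from datetime import date

# Toy data, invented for illustration only (not real Netflix or website data).
# Each record maps a movie title to (rating, date of rating).
anonymized = {
    "user_1041": {
        "Movie A": (5, date(2005, 3, 1)),
        "Movie B": (2, date(2005, 3, 9)),
        "Movie C": (4, date(2005, 4, 2)),
    },
    "user_2290": {
        "Movie A": (3, date(2005, 1, 15)),
        "Movie D": (5, date(2005, 2, 20)),
    },
}

# Hypothetical public reviews posted under a real name or handle.
public_reviews = {
    "alice": {
        "Movie A": (5, date(2005, 3, 2)),
        "Movie C": (4, date(2005, 4, 4)),
    },
}


def similarity(anon_record, public_record, rating_tol=1, day_tol=14):
    """Count movies rated in both records with nearly the same rating and date."""
    score = 0
    for title, (rating, day) in public_record.items():
        if title in anon_record:
            anon_rating, anon_day = anon_record[title]
            close_rating = abs(anon_rating - rating) <= rating_tol
            close_date = abs((anon_day - day).days) <= day_tol
            if close_rating and close_date:
                score += 1
    return score


def link(public_record, anonymized_records, min_gap=1):
    """Return the anonymized ID that clearly matches best, or None if ambiguous."""
    ranked = sorted(
        ((similarity(rec, public_record), uid) for uid, rec in anonymized_records.items()),
        reverse=True,
    )
    best_score, best_id = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0
    # Only claim a match when the best candidate stands out from the rest.
    if best_score > 0 and best_score - runner_up >= min_gap:
        return best_id
    return None


print(link(public_reviews["alice"], anonymized))
```

The tolerances matter here: because the match is approximate, adding a little noise to ratings or dates in the released data would not stop this kind of linkage, which is the reason better randomization alone does not fix the problem.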