This problem reminds me of how an attacker might de-anonymize data (or at least recover the original data) from summary statistics, as described in this YouTube video, which was made in partnership with the US Census. In a given data set, there might be several different sets of categorical variables that each uniquely identify the records. Can we recover a single, most plausible set of variables that uniquely identifies the data without having some other meaning?
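To make the "several identifying sets" idea concrete, here is a rough sketch of how one might enumerate the smallest column subsets that distinguish every row. The function name `minimal_unique_column_sets`, the column names, and the toy rows are all made up for illustration, and the brute-force search is only practical for a handful of columns.

```python
from itertools import combinations

def minimal_unique_column_sets(rows, columns):
    """Return every smallest-size column subset whose values differ for all rows.

    Hypothetical helper for illustration; it tries subsets in order of size
    and stops at the first size where at least one subset is unique.
    """
    for size in range(1, len(columns) + 1):
        found = []
        for subset in combinations(columns, size):
            # Project each row onto the candidate subset of columns.
            projected = [tuple(row[c] for c in subset) for row in rows]
            # The subset identifies the rows if no projection repeats.
            if len(set(projected)) == len(rows):
                found.append(subset)
        if found:
            return found
    return []

# Toy data (invented): four records with three categorical variables.
rows = [
    {"sex": "F", "age_band": "30-39", "block": "A"},
    {"sex": "M", "age_band": "30-39", "block": "A"},
    {"sex": "F", "age_band": "40-49", "block": "B"},
    {"sex": "M", "age_band": "40-49", "block": "A"},
]
columns = ["sex", "age_band", "block"]
result = minimal_unique_column_sets(rows, columns)
print(result)
```

On this toy data no single column separates the four records, and only the pair `("sex", "age_band")` does. With real data several subsets of the same size can tie, which is exactly where the question of picking the "most plausible" one comes in.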