Recent advances in telecommunications and database systems have allowed the scientific community to efficiently mine vast amounts of information worldwide and to extract new knowledge by discovering hidden patterns and correlations. Nevertheless, this shared information can also be used to invade the privacy of individuals through the use of fusion and mining techniques. Simply removing direct identifiers such as name, SSN, or phone number is no longer sufficient to protect against these practices. In numerous cases, other fields, such as gender, date of birth, and/or zip code, can be used to re-identify individuals and to expose their sensitive details, e.g., their medical conditions, financial statuses and transactions, or even their private connections. The scope of this work is to provide an in-depth overview of the current state of the art in Privacy-Preserving Data Publishing (PPDP) for relational data. To counter information leakage, a number of data anonymisation methods have been proposed during the past few years, including $k$-anonymity, $\ell$-diversity, and $t$-closeness, to name a few. In this study, we analyse these methods, providing concrete examples not only to explain how each of them works but also to help the reader understand the usage scenarios in which each of them can be applied. Furthermore, we detail several attacks along with their possible countermeasures, and we discuss open questions and future research directions.
Journal: IEEE Access.
Date of Publication: 11 March 2020.