Personalized Anonymization for Set-Valued Data by Partial Suppression
Takuma Nakagawa(a),(b),(*), Hiromi Arai(c),(d), Hiroshi Nakagawa(c)
Transactions on Data Privacy 11:3 (2018) 219 - 237
(a) The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.
(b) NS Solutions Corporation, 27-1, Shinkawa 2-chome, Chuo-ku, Tokyo, 104-0033, Japan.
(c) RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan.
(d) JST PRESTO, Gobancho, Chiyoda-ku, Tokyo, 102-0076, Japan.
e-mail: takuma.nakagawa0725@gmail.com
Set-valued data consists of records that are sets of items, such as the goods purchased by each individual. Methods for publishing and widely utilizing set-valued data while protecting personal information have been studied extensively in the field of privacy-preserving data publishing. Basic models such as k-anonymity and k^m-anonymity cannot cope with attribute inference by an adversary who has background knowledge of the records. The ρ-uncertainty model, on the other hand, prevents attribute inference with a confidence above a given level in set-valued data. However, even this model requires that the items to be protected be designated in common for all individuals. In this research, we propose a new model that provides more suitable privacy protection for each individual by protecting the different items designated for each record, and we build a heuristic algorithm that achieves this guarantee using partial suppression. In addition, because the computational complexity of the algorithm grows combinatorially with the data size, we introduce a probabilistic relaxation of the privacy guarantee. Finally, we present experimental results that evaluate the performance of the algorithms on real-world datasets.
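To make the kind of guarantee discussed above concrete, the following is a minimal sketch of a confidence check with per-record sensitive items. It is not the algorithm proposed in the paper. It assumes that an inference is a rule q → s whose confidence is sup(q ∪ {s}) / sup(q), that the adversary's knowledge q is drawn only from the non-sensitive items of the target record, and that |q| is bounded by a hypothetical parameter `max_len`. The function names `support`, `confidence`, and `violations`, and the toy data, are illustrative assumptions.

```python
from itertools import combinations


def support(records, itemset):
    """Number of records that contain every item in `itemset`."""
    return sum(1 for r in records if itemset <= r)


def confidence(records, antecedent, item):
    """Confidence of the rule antecedent -> item: sup(antecedent + item) / sup(antecedent)."""
    denom = support(records, antecedent)
    if denom == 0:
        return 0.0
    return support(records, antecedent | {item}) / denom


def violations(records, sensitive, rho, max_len=2):
    """List (record index, antecedent, sensitive item, confidence) for rules above rho.

    Assumption: sensitive[i] holds the items that record i wants protected,
    and antecedents are non-empty subsets (size <= max_len) of the record's
    remaining items.
    """
    found = []
    for i, (rec, sens) in enumerate(zip(records, sensitive)):
        public = sorted(rec - sens)
        for s in sorted(rec & sens):
            for n in range(1, max_len + 1):
                for q in combinations(public, n):
                    c = confidence(records, frozenset(q), s)
                    if c > rho:
                        found.append((i, frozenset(q), s, c))
    return found


# Toy data (hypothetical): only records 0 and 3 designate a sensitive item.
records = [
    {"bread", "milk", "medicine"},
    {"bread", "milk"},
    {"bread", "medicine"},
    {"milk", "beer"},
]
sensitive = [{"medicine"}, set(), set(), {"beer"}]

before = violations(records, sensitive, rho=0.6)

# Partial suppression: remove one item occurrence from one record only.
suppressed = [set(r) for r in records]
suppressed[0].discard("bread")
after = violations(suppressed, sensitive, rho=0.6)
```

On this toy data, `before` contains a single violation, the rule {bread} → medicine for record 0 with confidence 2/3, and `after` is empty once the single occurrence of bread in record 0 is suppressed. Record 2 also contains medicine but does not designate it as sensitive, so it is never checked, which is the sense in which the protection is per record.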