C-safety: a framework for the anonymization of semantic trajectories
Anna Monreale(a),(*), Roberto Trasarti(b), Dino Pedreschi(a), Chiara Renso(b), Vania Bogorny(c)
Transactions on Data Privacy 4:2 (2011) 73 - 101
(a) KDD-Lab, University of Pisa, Italy.
(b) KDD-Lab, ISTI CNR, Pisa, Italy.
(c) UFSC, Florianopolis, SC, Brazil.
e-mail: annam@di.unipi.it; roberto.trasarti@isti.cnr.it; pedre@di.unipi.it; chiara.renso@isti.cnr.it; vania@inf.ufsc.br
The increasing abundance of data about the trajectories of personal movement is opening new opportunities for analyzing and mining human mobility. However, it also creates new risks, since it opens new ways of intruding into personal privacy. Representing personal movements as sequences of places visited by a person, that is, as semantic trajectories, poses serious privacy threats. In this paper we propose a privacy model that defines the attack model of semantic trajectory linking and a privacy notion, called c-safety, which relies on generalizing visited places according to a taxonomy. c-safety provides an upper bound on the probability of inferring that a given person, observed in a sequence of non-sensitive places, has also visited any sensitive location. Consistently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on two real-life GPS trajectory datasets showing that our algorithm preserves interesting quality/utility measures of the original trajectories when sequential pattern mining is applied to the semantic trajectories. We also show empirically that the probability that the attacker's inference succeeds is much lower than the established theoretical upper bound.
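To make the bound described in the abstract concrete, the following minimal Python sketch estimates, on a toy dataset, the largest empirical probability that a person observed in a sequence of non-sensitive places has also visited a sensitive one, and compares it with a threshold c. It is an illustrative reading of the abstract only, not the paper's formal definition or anonymization algorithm. The function names (`is_subsequence`, `max_inference_probability`, `is_c_safe`), the `max_len` cap on the attacker's knowledge, and the place names are all assumptions introduced for this example.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (not necessarily contiguously)."""
    it = iter(sequence)
    return all(place in it for place in pattern)


def max_inference_probability(trajectories, sensitive, max_len=2):
    """Largest empirical P(visited a sensitive place | observed non-sensitive subsequence).

    Assumption: the attacker knows at most `max_len` non-sensitive places,
    in visiting order, for the target person.
    """
    # Enumerate every non-sensitive subsequence the attacker might have observed.
    candidates = set()
    for traj in trajectories:
        public = [p for p in traj if p not in sensitive]
        for k in range(1, max_len + 1):
            candidates.update(combinations(public, k))

    worst = 0.0
    for observed in candidates:
        # Trajectories compatible with what the attacker observed.
        matching = [t for t in trajectories if is_subsequence(observed, t)]
        # Among those, the ones that also contain a sensitive place.
        exposed = sum(1 for t in matching if any(p in sensitive for p in t))
        worst = max(worst, exposed / len(matching))
    return worst


def is_c_safe(trajectories, sensitive, c, max_len=2):
    """Check the toy dataset against the threshold c (hypothetical helper)."""
    return max_inference_probability(trajectories, sensitive, max_len) <= c


# Toy data: place names and the choice of sensitive places are made up.
trajectories = [
    ["Station", "Clinic", "Cafe"],
    ["Station", "Cafe"],
    ["Station", "Museum", "Cafe"],
    ["Museum", "Clinic"],
]
sensitive = {"Clinic"}

worst = max_inference_probability(trajectories, sensitive)
print(worst, is_c_safe(trajectories, sensitive, c=0.5))
```

In this toy dataset the riskiest observation is a visit to "Museum": two trajectories contain it and one of them also visits "Clinic", so the worst-case estimate is 0.5. A dataset failing the check for a chosen c would need further transformation, for instance by generalizing places along a taxonomy as the abstract describes.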