Transactions on
Data Privacy
Foundations and Technologies


Volume 16 Issue 3

Optimizing Privacy and Data Utility: Metrics and Strategies

Clémence Mauger(a), Gaël Le Mahec(a),(*), Gilles Dequen(a)

Transactions on Data Privacy 16:3 (2023) 153 - 189

(a) Université de Picardie Jules Verne - MIS Laboratory, 33 rue Saint-Leu, Amiens, 80000, France.

e-mail:clemence.mauger @u-picardie.fr; gael.le.mahec @u-picardie.fr; gilles.dequen @u-picardie.fr


k-anonymity is a privacy-preserving data publishing (PPDP) anonymization model that prevents identity disclosure by making each record of a table indistinguishable from at least k − 1 others. To obtain a k-anonymous version of a table, a common technique is to generalize the values of the quasi-identifier attributes until records are grouped into equivalence classes of size at least k. The choice of records to group influences the amount of generalization required and therefore the quality of the anonymized data (the more a value is generalized, the more precision it loses). The different k-anonymous versions of a table are therefore not equally useful in terms of data utility. To assess the quality of a k-anonymized table, information loss metrics are often used. They can also be used within the k-anonymization process itself to choose the groupings of records that result in the least data alteration. In this article, we propose a unified model of such metrics, facilitating their implementation and use. We then analyze the behavior of seven metrics when they are used in the k-anonymization process to guide the merging of equivalence classes. Our analyses compare these seven metrics on two public tables for 14 values of k. We then turn to the limits of k-anonymity. In a k-anonymous table, the distribution of sensitive values within equivalence classes can lead to the disclosure of sensitive information about an individual. The l-diversity and t-closeness anonymization models impose constraints that control the distribution of sensitive values and therefore limit attribute disclosure. We continue our study of k-anonymization by proposing strategies aimed at optimizing the data alteration, the l-diversity and the t-closeness of the k-anonymous tables produced. Using two information loss metrics, we evaluate the seven optimization strategies on the two public tables, first on real distributions of sensitive values and then on 21 simulated distributions of sensitive values. With this large study, we aim to understand how to choose a metric and an optimization strategy in order to produce k-anonymous databases with strong guarantees on data privacy while preserving as much data utility as possible.
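
The notions of equivalence class, k-anonymity and attribute disclosure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy table, the attribute names (`age`, `zip`, `disease`), the hypothetical helper `generalize_age`, and the use of distinct l-diversity (each class must contain at least l different sensitive values), which is only the simplest variant of l-diversity.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        classes[key].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class holds at least l distinct sensitive values."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in classes.values()
    )


def generalize_age(record, width=10):
    """Hypothetical generalization: replace an exact age by a range."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Toy table (invented data): age and zip are quasi-identifiers,
# disease is the sensitive attribute.
QI = ["age", "zip"]
raw = [
    {"age": 23, "zip": "80000", "disease": "flu"},
    {"age": 27, "zip": "80000", "disease": "asthma"},
    {"age": 31, "zip": "80000", "disease": "flu"},
    {"age": 38, "zip": "80000", "disease": "flu"},
]
generalized = [generalize_age(record) for record in raw]

raw_is_2_anonymous = is_k_anonymous(raw, QI, 2)
generalized_is_2_anonymous = is_k_anonymous(generalized, QI, 2)
generalized_is_2_diverse = is_distinct_l_diverse(generalized, QI, "disease", 2)
```

In this toy table, generalizing ages to ten-year ranges yields two equivalence classes of two records each, so the generalized table is 2-anonymous while the raw table is not. The class of ages 30-39 contains only one sensitive value, so the table is not distinct 2-diverse: this is the kind of attribute disclosure that l-diversity and t-closeness are meant to limit.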

* Corresponding author.


ISSN: 1888-5063; ISSN (Digital): 2013-1631; D.L.:B-11873-2008; Web Site: http://www.tdp.cat/
Contact: Transactions on Data Privacy; Vicenç Torra; U. of Skövde; PO Box 408; 54128 Skövde; (Sweden); e-mail:tdp@tdp.cat
Note: TDP's web site does not use cookies. TDP does not keep information on either IP addresses or browsers. For the privacy policy, access here.


Vicenç Torra, Last modified: 10:38 July 04 2023.