
Transactions on
Data Privacy
Foundations and Technologies

http://www.tdp.cat




Volume 1 Issue 1


Generating Sufficiency-based Non-Synthetic Perturbed Data

Krishnamurty Muralidhar(a),(*), Rathindra Sarathy(b)

Transactions on Data Privacy 1:1 (2008) 17 - 33


(a) School of Management; Gatton College of Business and Economics; University of Kentucky Lexington KY 40506 USA; e-mail: krishm@uky.edu

(b) Department of Management Science & Information Systems; Spears School of Business; Oklahoma State University, Stillwater OK 74073 USA; e-mail: Sarathy@okstate.edu


Abstract

The mean vector and covariance matrix are sufficient statistics when the underlying distribution is multivariate normal. Many types of statistical analyses used in practice rely on the assumption of multivariate normality (Gaussian model). For these analyses, keeping the mean vector and covariance matrix of the masked data the same as those of the original data implies that analyzing the masked data with these techniques yields the same results as analyzing the original data. For numerical confidential data, a recently proposed perturbation method makes it possible to maintain the mean vector and covariance matrix of the masked data to be exactly the same as those of the original data. However, as it is currently proposed, the perturbed values from this method are considered synthetic because they are generated without considering the values of the confidential variables (and are based only on the non-confidential variables). Some researchers argue that synthetic data results in information loss. In this study, we provide a new methodology for generating non-synthetic perturbed data that maintains the mean vector and covariance matrix of the masked data to be exactly the same as those of the original data while offering a selectable degree of similarity between original and perturbed data.
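The idea of exact moment preservation with a tunable similarity can be illustrated with a minimal NumPy sketch. This is not the authors' published procedure: it ignores the non-confidential variables, and the function name `perturb_preserving_moments` and the parameter `alpha` are hypothetical names introduced here. It assumes more records than variables and a positive definite sample covariance. The perturbed data are a weighted mix of the centered original values and noise that has been forced to be exactly orthogonal to the original data and to share its sample covariance.

```python
import numpy as np


def perturb_preserving_moments(X, alpha, rng):
    """Return Y with the same sample mean and covariance as X.

    alpha in [0, 1] is a hypothetical similarity knob: 1 returns X itself,
    0 returns values that are uncorrelated with X in the sample.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean                      # centered confidential data
    cov = Xc.T @ Xc / (n - 1)          # sample covariance (assumed positive definite)

    # Raw noise, centered and made exactly orthogonal to the columns of Xc.
    Z = rng.standard_normal((n, p))
    Z -= Z.mean(axis=0)
    Z -= Xc @ np.linalg.lstsq(Xc, Z, rcond=None)[0]

    # Rescale the noise so its sample covariance equals cov exactly.
    L_z = np.linalg.cholesky(Z.T @ Z / (n - 1))
    L_x = np.linalg.cholesky(cov)
    E = np.linalg.solve(L_z, Z.T).T @ L_x.T

    # Cross terms vanish because Xc.T @ E == 0, so
    # cov(Y) = alpha^2 * cov + (1 - alpha^2) * cov = cov.
    return mean + alpha * Xc + np.sqrt(1.0 - alpha**2) * E


# Illustrative data only; the means and covariances below are made up.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([10.0, 50.0], [[4.0, 3.0], [3.0, 9.0]], size=500)
Y = perturb_preserving_moments(X, alpha=0.8, rng=rng)
print(np.allclose(X.mean(axis=0), Y.mean(axis=0)))
print(np.allclose(np.cov(X, rowvar=False), np.cov(Y, rowvar=False)))
```

In this sketch, `alpha` equals the sample correlation between each original variable and its perturbed counterpart, so larger values keep the masked records closer to the originals at the cost of weaker protection.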

* Corresponding author.





ISSN: 1888-5063; ISSN (Digital): 2013-1631; D.L.:B-11873-2008; Web Site: http://www.tdp.cat/
Contact: Transactions on Data Privacy; IIIA-CSIC; Campus UAB s/n; 08193-Bellaterra; (Catalonia, Spain); e-mail:tdp@iiia.csic.es

 

IIIA - Institut d'Investigació en Intel·ligència Artificial


Vicenç Torra. Last modified: 16:59, June 22, 2010.