Transactions on
Data Privacy
Foundations and Technologies

http://www.tdp.cat


Volume 5 Issue 2


A Knowledge Model Sharing Based Approach to Privacy-Preserving Data Mining

Hongwei Tian(a), Weining Zhang(a),(*), Shouhuai Xu(a), Patrick Sharkey(a)

Transactions on Data Privacy 5:2 (2012) 433 - 467

(a) Department of Computer Science, University of Texas at San Antonio.

e-mail: htian@cs.utsa.edu; wzhang@cs.utsa.edu; shxu@cs.utsa.edu; psharkey@cs.utsa.edu


Abstract

Privacy-preserving data mining (PPDM) is an important problem that is currently studied through three approaches: the cryptographic approach, data publishing, and model publishing. However, each of these approaches has limitations. The cryptographic approach does not protect the privacy of learned knowledge models and may have performance and scalability issues. Data publishing, although popular, may suffer from too much utility loss for certain types of data mining applications. Model publishing lacks efficient algorithms for practical use in a multiple data source environment.

In this paper, we present a knowledge model sharing based approach that learns a global knowledge model from pseudo-data generated according to anonymized knowledge models published by local data sources. Specifically, for the anonymization of knowledge models, we present two privacy measures for decision trees and an algorithm that obtains an anonymized decision tree by tree pruning. For the pseudo-data generation, we present an algorithm that generates useful pseudo-data from decision trees. We empirically study our method by comparing it with several PPDM methods that utilize existing techniques, including three methods that publish anonymized data, one method that learns anonymized decision trees directly from the original data, and one method that uses ensemble classification. Our results show that in both single data source and multiple data source environments, and for several different datasets, predictive models, and utility measures, our method can obtain significantly better predictive models (especially decision trees) than the other methods.
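
To make the pseudo-data idea concrete, the following is a minimal sketch, not the algorithm from the paper, whose details the abstract does not give. It assumes a published decision tree over categorical attributes whose leaves carry class counts, emits one record per counted example that satisfies the leaf's path, and fills attributes not tested on that path uniformly at random. The names DOMAINS, TREE, leaves, and generate_pseudo_data, the nested-dict tree format, and the uniform fill are all hypothetical choices made for illustration.

```python
import random

# Hypothetical categorical attribute domains (illustrative, not from the paper).
DOMAINS = {"age": ["young", "old"], "income": ["low", "high"]}

# Hypothetical published tree: internal nodes split on one attribute,
# leaves carry per-class example counts.
TREE = {
    "attr": "age",
    "children": {
        "young": {"counts": {"no": 3, "yes": 1}},
        "old": {
            "attr": "income",
            "children": {
                "low": {"counts": {"no": 2}},
                "high": {"counts": {"yes": 4}},
            },
        },
    },
}


def leaves(node, path=None):
    """Yield (path_constraints, class_counts) for every leaf of the tree."""
    path = path or {}
    if "counts" in node:
        yield path, node["counts"]
        return
    for value, child in node["children"].items():
        yield from leaves(child, {**path, node["attr"]: value})


def generate_pseudo_data(tree, domains, rng):
    """Emit one record per counted example, consistent with its leaf's path."""
    records = []
    for path, counts in leaves(tree):
        for label, n in counts.items():
            for _ in range(n):
                row = {}
                for attr, values in domains.items():
                    # Attributes tested on the path keep that value; the
                    # rest are drawn uniformly (an assumption of this sketch).
                    row[attr] = path[attr] if attr in path else rng.choice(values)
                row["class"] = label
                records.append(row)
    return records


pseudo = generate_pseudo_data(TREE, DOMAINS, random.Random(0))
```

Under these assumptions, pseudo-data generated from the trees of several local sources could be concatenated and passed to any standard learner to build a global model, which is the overall flow the abstract describes.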

* Corresponding author.

ISSN: 1888-5063; ISSN (Digital): 2013-1631; D.L.:B-11873-2008; Web Site: http://www.tdp.cat/
Contact: Transactions on Data Privacy; U. of Skövde; PO Box 408; 54128 Skövde; (Sweden); e-mail:tdp@tdp.cat

 


Vicenç Torra, Last modified: 10:39 June 27 2015.