20 20

Transactions on
Data Privacy
Foundations and Technologies


Articles in Press

Accepted articles here

Latest Issues

Year 2021

Volume 14 Issue 3
Volume 14 Issue 2
Volume 14 Issue 1

Year 2020

Volume 13 Issue 3
Volume 13 Issue 2
Volume 13 Issue 1

Year 2019

Volume 12 Issue 3
Volume 12 Issue 2
Volume 12 Issue 1

Year 2018

Volume 11 Issue 3
Volume 11 Issue 2
Volume 11 Issue 1

Year 2017

Volume 10 Issue 3
Volume 10 Issue 2
Volume 10 Issue 1

Year 2016

Volume 9 Issue 3
Volume 9 Issue 2
Volume 9 Issue 1

Year 2015

Volume 8 Issue 3
Volume 8 Issue 2
Volume 8 Issue 1

Year 2014

Volume 7 Issue 3
Volume 7 Issue 2
Volume 7 Issue 1

Year 2013

Volume 6 Issue 3
Volume 6 Issue 2
Volume 6 Issue 1

Year 2012

Volume 5 Issue 3
Volume 5 Issue 2
Volume 5 Issue 1

Year 2011

Volume 4 Issue 3
Volume 4 Issue 2
Volume 4 Issue 1

Year 2010

Volume 3 Issue 3
Volume 3 Issue 2
Volume 3 Issue 1

Year 2009

Volume 2 Issue 3
Volume 2 Issue 2
Volume 2 Issue 1

Year 2008

Volume 1 Issue 3
Volume 1 Issue 2
Volume 1 Issue 1

Volume 3 Issue 1

Communication-Efficient Privacy-Preserving Clustering

Geetha Jagannathan(a),(*), Krishnan Pillaipakkamnatt(b), Rebecca N. Wright(a), Daryl Umano(b)

Transactions on Data Privacy 3:1 (2010) 1 - 25

Abstract, PDF

(a) Department of Computer Science, Rutgers University, New Brunswick, NJ, USA.

(b) Department of Computer Science, Hofstra University, Hempstead, NY,USA.

e-mail:geetha @cs.rutgers.edu; csckzp @hofstra.edu; Rebecca.Wright @rutgers.edu; dumano33 @hotmail.com


The ability to store vast quantities of data and the emergence of high speed networking have led to intense interest in distributed data mining. However, privacy concerns, as well as regulations, often prevent the sharing of data between multiple parties. Privacy-preserving distributed data mining allows the cooperative computation of data mining algorithms without requiring the participating organizations to reveal their individual data items to each other.

This paper makes several contributions. First, we present a simple, deterministic, I/O-efficient kclustering algorithm that was designed with the goal of enabling an efficient privacy-preserving version of the algorithm. Our algorithm examines each item in the database only once and uses only sequential access to the data. Our experiments show that this algorithm produces cluster centers that are, on average, more accurate than the ones produced by the well known iterative k-means algorithm, and compares well against BIRCH.

Second, we present a distributed privacy-preserving protocol for k-clustering based on our new clustering algorithm. The protocol applies to databases that are horizontally partitioned between two parties. The participants of the protocol learn only the final cluster centers on completion of the protocol. Unlike most of the earlier results in privacy-preserving clustering, our protocol does not reveal intermediate candidate cluster centers. The protocol is also efficient in terms of communication and does not depend on the size of the database. Although there have been other clustering algorithms that improve on the k-means algorithm, ours is the first for which a communication efficient cryptographic privacy-preserving protocol has been demonstrated.

* Corresponding author.

Follow us


ISSN: 1888-5063; ISSN (Digital): 2013-1631; D.L.:B-11873-2008; Web Site: http://www.tdp.cat/
Contact: Transactions on Data Privacy; Vicenç Torra; U. of Skövde; PO Box 408; 54128 Skövde; (Sweden); e-mail:tdp@tdp.cat
Note: TDP's web site does not use cookies. TDP does not keep information neither on IP addresses nor browsers. For the privacy policy access here.


Vicenç Torra, Last modified: 00 : 25 December 12 2014.