Equi-width data swapping for private data publication

Yidong Li, Hong Shen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

Data swapping is a popular value-invariant data perturbation technique. The quality of a data swapping method is measured by how well it preserves data privacy and data utility. Because swapping data globally is computationally impractical, appropriate localization schemes are often applied in advance to guarantee performance on these metrics. Equi-depth partitioning is preferred by most existing data perturbation techniques because it provides uniform privacy protection for each data tuple. However, this method performs poorly for two types of applications: one is to maintain statistics based on equi-width partitioning, such as the multivariate histogram with equal bin width, and the other is to preserve parametric statistics, such as covariance, in the context of sparse data with non-uniform distribution. As a natural solution for these applications, this paper explores the possibility of using data swapping with equi-width partitioning for private data publication, which has been little used in data perturbation due to the difficulty of preserving data privacy. With extensive theoretical analysis and experimental results, we show that Equi-Width Swapping (EWS) can achieve privacy preservation similar to that of Equi-Depth Swapping (EDS) if the number of partitions is sufficiently large (e.g. ≥ √N, where N is the size of the dataset). Our experimental results on both synthetic and real-world data validate our theoretical analysis.
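
To make the idea of swapping within equi-width partitions concrete, the following is a minimal sketch for a single numeric attribute. It is not the authors' EWS algorithm, which this record does not specify: the function names `bin_index` and `equi_width_swap`, the `seed` parameter, and the choice of a uniform random permutation inside each bin are all assumptions made for illustration. The default number of bins uses the abstract's example threshold of √N, rounded up.

```python
import math
import random


def bin_index(value, lo, width, num_bins):
    """Map a value to its equal-width bin; the maximum falls in the last bin."""
    if width == 0:
        return 0
    return min(int((value - lo) / width), num_bins - 1)


def equi_width_swap(values, num_bins=None, seed=0):
    """Return a copy of `values` with entries permuted inside each equal-width bin."""
    n = len(values)
    if n == 0:
        return []
    if num_bins is None:
        # Assumption: use the abstract's example threshold, ceil(sqrt(N)) partitions.
        num_bins = max(1, math.ceil(math.sqrt(n)))
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins

    # Group record positions by the bin their value falls into.
    groups = {}
    for pos, v in enumerate(values):
        groups.setdefault(bin_index(v, lo, width, num_bins), []).append(pos)

    # Hypothetical swapping rule: a uniform random permutation within each bin.
    rng = random.Random(seed)
    swapped = list(values)
    for positions in groups.values():
        shuffled = [values[p] for p in positions]
        rng.shuffle(shuffled)
        for p, v in zip(positions, shuffled):
            swapped[p] = v
    return swapped
```

Because values only move between records in the same bin, the multiset of values is unchanged (the swap is value-invariant) and any histogram built on the same equal-width bins is preserved exactly. Sparse bins hold few records, which is why privacy is harder to guarantee here than with equi-depth partitioning.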

Original language: English
Title of host publication: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2009
Pages: 231-238
Number of pages: 8
Publication status: Published - 2009
Externally published: Yes
Event: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2009 - Higashi, Hiroshima, Japan
Duration: 8 Dec 2009 to 11 Dec 2009

Publication series

Name: Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings

Conference

Conference: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2009
Country/Territory: Japan
City: Higashi, Hiroshima
Period: 8/12/09 to 11/12/09
