TY - JOUR

T1 - Practical anonymity models on protecting private weighted graphs

AU - Li, Yidong

AU - Shen, Hong

AU - Lang, Congyan

AU - Dong, Hairong

N1 - Publisher Copyright:
© 2016 Elsevier B.V.

PY - 2016/12/19

Y1 - 2016/12/19

N2 - Identity disclosure control (IDC) on graph data has attracted increasing interest in security and database communities. Most existing work focuses on preventing identity disclosure derivable from certain structural information in unweighted graphs. In weighted graphs, when the weight of an edge implying relevance/association between its adjacency vertices is taken into account, this problem becomes more complex due to the diversity of weight-related information which may expose to many types of background knowledge attacks and hence significantly increases the time complexity for preventing privacy breaches. This paper systematically studies IDC in weighted graphs, which has no known solution to our knowledge, by employing elementary weight invariants as background knowledge. We propose a general anonymity model against weight-related attacks, and introduce a new utility metric based on spectral graph theory. Then we distinguish two types of practical breaches, namely volume and histogram attack, which the adversary has the knowledge of the sum and the set of adjacent weights for each vertex respectively. We propose an efficient method for volume anonymization, and a heuristic scheme for histogram anonymization which we show to be NP-hard. We show how to construct the graph under these anonymized properties to protect a graph from both attacks. Our approaches are effective in terms of efficiency and data utility preservation: run in near-quadratic time on graph size, and preserve a similar utility as the original graph. The performances of the algorithms have been validated by extensive experiments on both synthetic and real-world datasets.

AB - Identity disclosure control (IDC) on graph data has attracted increasing interest in security and database communities. Most existing work focuses on preventing identity disclosure derivable from certain structural information in unweighted graphs. In weighted graphs, when the weight of an edge implying relevance/association between its adjacency vertices is taken into account, this problem becomes more complex due to the diversity of weight-related information which may expose to many types of background knowledge attacks and hence significantly increases the time complexity for preventing privacy breaches. This paper systematically studies IDC in weighted graphs, which has no known solution to our knowledge, by employing elementary weight invariants as background knowledge. We propose a general anonymity model against weight-related attacks, and introduce a new utility metric based on spectral graph theory. Then we distinguish two types of practical breaches, namely volume and histogram attack, which the adversary has the knowledge of the sum and the set of adjacent weights for each vertex respectively. We propose an efficient method for volume anonymization, and a heuristic scheme for histogram anonymization which we show to be NP-hard. We show how to construct the graph under these anonymized properties to protect a graph from both attacks. Our approaches are effective in terms of efficiency and data utility preservation: run in near-quadratic time on graph size, and preserve a similar utility as the original graph. The performances of the algorithms have been validated by extensive experiments on both synthetic and real-world datasets.

KW - Anonymity

KW - Privacy preserving graph mining

KW - Weight anonymization

KW - Weighted graph

UR - http://www.scopus.com/inward/record.url?scp=84994111931&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2016.08.084

DO - 10.1016/j.neucom.2016.08.084

M3 - Article

AN - SCOPUS:84994111931

SN - 0925-2312

VL - 218

SP - 359

EP - 370

JO - Neurocomputing

JF - Neurocomputing

ER -