TY - GEN
T1 - Preventing identity disclosure in hypergraphs
AU - Li, Yidong
AU - Shen, Hong
PY - 2011
Y1 - 2011
N2 - Data publishing based on hypergraphs is becoming increasingly popular due to its power in representing multi-relations among objects. However, security issues have been little studied on this subject, while most recent work only focuses on the protection of relational data or graphs. As a major privacy breach, identity disclosure reveals the identification of entities with certain background knowledge known by an adversary. In this paper, we first introduce a novel background knowledge attack model based on the property of hyperedge ranks, and formalize the rank-based hypergraph anonymization problem. We then propose a complete solution in a two-step framework: rank anonymization and hypergraph construction. We also take hypergraph clustering (known as community detection) as data utility into consideration, and discuss two metrics to quantify information loss incurred in the perturbation. Our approaches are effective in terms of efficacy, privacy and utility. The algorithms run in near-quadratic time on hypergraph size, and protect data from rank attacks with almost same utility preserved. The performances of the methods have been validated by extensive experiments on real-world datasets as well. Our rank-based attack model and algorithms for rank anonymization and hypergraph construction are, to our best knowledge, the first systematic study to privacy preserving for hypergraph-based data publishing.
AB - Data publishing based on hypergraphs is becoming increasingly popular due to its power in representing multi-relations among objects. However, security issues have been little studied on this subject, while most recent work only focuses on the protection of relational data or graphs. As a major privacy breach, identity disclosure reveals the identification of entities with certain background knowledge known by an adversary. In this paper, we first introduce a novel background knowledge attack model based on the property of hyperedge ranks, and formalize the rank-based hypergraph anonymization problem. We then propose a complete solution in a two-step framework: rank anonymization and hypergraph construction. We also take hypergraph clustering (known as community detection) as data utility into consideration, and discuss two metrics to quantify information loss incurred in the perturbation. Our approaches are effective in terms of efficacy, privacy and utility. The algorithms run in near-quadratic time on hypergraph size, and protect data from rank attacks with almost same utility preserved. The performances of the methods have been validated by extensive experiments on real-world datasets as well. Our rank-based attack model and algorithms for rank anonymization and hypergraph construction are, to our best knowledge, the first systematic study to privacy preserving for hypergraph-based data publishing.
KW - Anonymization
KW - Community detection
KW - Hypergraph clustering
KW - Identity disclosure
KW - Private data publishing
UR - http://www.scopus.com/inward/record.url?scp=84863170233&partnerID=8YFLogxK
U2 - 10.1109/ICDMW.2011.139
DO - 10.1109/ICDMW.2011.139
M3 - Conference contribution
AN - SCOPUS:84863170233
SN - 9780769544090
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 659
EP - 665
BT - Proceedings - 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
T2 - 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
Y2 - 11 December 2011 through 11 December 2011
ER -