TY - GEN
T1 - Achieving probabilistic anonymity against one-to-multiple linkage attacks
AU - Sang, Yingpeng
AU - Shen, Hong
AU - Tian, Hui
AU - Zhang, Zonghua
PY - 2013
Y1 - 2013
AB - Randomization methods widely applied for privacy-preserving data mining are generally subject to reconstruction attacks, linkage attacks, and semantic-related attacks. A probabilistic anonymity definition has been proposed in [1] to defend against the linkage attack in which the attacker links the same randomized record to all of the original records. In this paper we name this type of attack the Multiple (original records) to One (randomized record) attack, and focus on another attack that has not been studied before, i.e., the One (original record) to Multiple (randomized records) attack. The latter differs from the former in that it does not require the attacker to know the distribution and all values of quasi-identifiers in the original records, and is thus easier for the attacker to launch. To defend against this attack we propose a novel probabilistic anonymity concept different from that of [1]. We achieve this anonymity goal using a hybrid model combining random projection and random noise addition. We also analyze the security properties of this model against the other common types of attacks. Compared with existing work in randomization, k-anonymity, and differential privacy, our work achieves the holistic aim of higher security, higher efficiency, and higher data utility, and demonstrates very promising applications in large-scale and high-dimensional data mining in clouds.
KW - Data mining
KW - Differential privacy
KW - K-anonymity
KW - Randomization
UR - http://www.scopus.com/inward/record.url?scp=84893216357&partnerID=8YFLogxK
U2 - 10.1109/ICEBE.2013.27
DO - 10.1109/ICEBE.2013.27
M3 - Conference contribution
AN - SCOPUS:84893216357
SN - 9780769551111
T3 - Proceedings - 2013 IEEE 10th International Conference on e-Business Engineering, ICEBE 2013
SP - 176
EP - 183
BT - Proceedings - 2013 IEEE 10th International Conference on e-Business Engineering, ICEBE 2013
PB - IEEE Computer Society
T2 - 2013 IEEE 10th International Conference on e-Business Engineering, ICEBE 2013
Y2 - 11 September 2013 through 13 September 2013
ER -