Privacy-preserving tuple matching in distributed databases

Yingpeng Sang, Hong Shen, Hui Tian

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)


We address the problems of Privacy-Preserving Duplicate Tuple Matching (PPDTM) and Privacy-Preserving Threshold Attributes Matching (PPTAM) in a database horizontally partitioned among N parties, where each party holds a private share of the database's tuples and all tuples have the same set of attributes. In PPDTM, each party determines whether any of its tuples has a duplicate in the other parties' private databases. In PPTAM, each party determines whether all attribute values of each tuple appear at least a threshold number of times in the attribute unions. We propose protocols for the two problems using additively homomorphic cryptosystems based on the subgroup membership assumption, e.g., Paillier's and ElGamal's schemes. An analysis of the total numbers of modular exponentiations, modular multiplications, and communication bits shows that our PPDTM protocol for the semihonest model has a lower total cost than the solution derivable from existing techniques: it accepts a higher communication cost in exchange for a lower computation cost, which dominates the total. Our PPTAM protocol is superior in both computation and communication costs. The efficiency improvements come mainly from perturbing with random numbers instead of the random polynomials used by existing techniques, without enabling successful attacks by polynomial interpolation. We also give detailed constructions of the required zero-knowledge proofs and extend our two protocols to the malicious model, both of which were previously unknown.
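The two ingredients named in the abstract, an additively homomorphic cryptosystem and perturbation by a random number, can be illustrated with a minimal sketch. It is not the paper's PPDTM or PPTAM protocol: it is a textbook two-party equality test over a toy Paillier instance, and every name in it (`keygen`, `encrypt`, `decrypt`, `blinded_difference`), along with the small Mersenne-prime parameters, is an assumption made for illustration only. Tuples are assumed to be already encoded as integers.

```python
import math
import secrets

# Toy parameters (assumption): two Mersenne primes, far too small for real security.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (n, (lam, mu)) for Paillier with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of L(g^lam mod n^2), which equals lam for g = n + 1
    return n, (lam, mu)


def random_unit(n):
    """Sample a random element of [1, n-1] that is coprime to n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r


def encrypt(n, m):
    """Enc(m) = (1 + n)^m * r^n mod n^2, using (1 + n)^m = 1 + m*n mod n^2."""
    n2 = n * n
    return (1 + (m % n) * n) * pow(random_unit(n), n, n2) % n2


def decrypt(n, priv, c):
    """Recover m from c using L(x) = (x - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def blinded_difference(n, c_a, b):
    """Given Enc(a) and a plaintext b, return Enc(r * (a - b)) for a random r.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to the
    power r multiplies its plaintext by r. The random factor r is the
    perturbation: the decryptor sees 0 when a == b and a masked value otherwise.
    """
    n2 = n * n
    c_diff = c_a * encrypt(n, -b) % n2        # Enc(a - b)
    return pow(c_diff, random_unit(n), n2)    # Enc(r * (a - b))
```

In use, the key holder sends `encrypt(n, a)` to the other party, who returns `blinded_difference(n, c_a, b)`; the key holder decrypts the result and learns only whether it is zero, that is, whether the two values match. A single random multiplier per comparison is the kind of random-number perturbation the abstract contrasts with random polynomials, though the actual protocols involve N parties and further steps not shown here.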

Original language: English
Article number: 4770102
Pages (from-to): 1767-1782
Number of pages: 16
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 12
Publication status: Published - Dec 2009
Externally published: Yes


  • Distributed database
  • Privacy preservation
  • Secure computation
  • Zero-knowledge proof

