TY - GEN
T1 - Hypergraph partition with harmonic average top-N and PCA for Topic Detection
AU - Liu, Xinyue
AU - Ma, Fenglong
AU - Lin, Hongfei
AU - Shen, Hong
PY - 2010
Y1 - 2010
N2 - An algorithm named SMHP is proposed to improve the efficiency of Topic Detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and increasing the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining H-TOPN and PCA. Topics are then grouped using SMHP. Experimental results show that the proposed methods are more suitable for clustering topics. With these novel approaches, SMHP can effectively address the problem of relationships among multiple stories and improve the accuracy of the clustering results.
AB - An algorithm named SMHP is proposed to improve the efficiency of Topic Detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and increasing the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining H-TOPN and PCA. Topics are then grouped using SMHP. Experimental results show that the proposed methods are more suitable for clustering topics. With these novel approaches, SMHP can effectively address the problem of relationships among multiple stories and improve the accuracy of the clustering results.
UR - http://www.scopus.com/inward/record.url?scp=79952565096&partnerID=8YFLogxK
U2 - 10.1109/PAAP.2010.38
DO - 10.1109/PAAP.2010.38
M3 - Conference contribution
AN - SCOPUS:79952565096
SN - 9780769543123
T3 - Proceedings - 3rd International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2010
SP - 269
EP - 276
BT - Proceedings - 3rd International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2010
T2 - 3rd International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2010
Y2 - 18 December 2010 through 20 December 2010
ER -