Semantic text alignment based on topic modeling

2016

Abstract

The growth of the Internet has made plagiarism an increasingly serious problem. Plagiarism takes different forms, ranging from copying text to adopting ideas, without giving credit to the original author. Most research on plagiarism detection concentrates on string matching, which cannot handle intelligent plagiarism, where the same content is expressed in different ways. To address this problem, this paper proposes an approach to semantic text alignment based on sentence-level topic modeling. In experiments on the PAN corpora, the approach achieved much higher recall and comparable plagdet scores relative to the winning system of PAN 2014. These results suggest that topic modeling is a promising solution for detecting intelligent plagiarism.
