Computer Science and Information Technology Vol. 3(3), pp. 76 - 80
DOI: 10.13189/csit.2015.030304
An Improvement of Plagiarized Area Detection System Using Jaccard Correlation Coefficient Distance Algorithm
Kwangho Song 1, Jihong Min 1, Gayoung Lee 1, Sang Chul Shin 2, Yoo-Sung Kim 1,*
1 Department of Information and Communication Engineering, Inha University, South Korea
2 Daolsoft Co., Ltd. 2F Won Seong Building, South Korea
ABSTRACT
This paper proposes a plagiarized area detection system that uses the Jaccard correlation coefficient as a filter to reduce processing time against a huge volume of documents. The system filters the original documents with two algorithms, the Jaccard coefficient distance algorithm and the Cosine distance algorithm, so that plagiarized areas can be detected efficiently. The Jaccard coefficient distance algorithm computes the distance between two documents based only on the existence of words, whereas the Cosine distance algorithm also uses word frequencies, so the Jaccard algorithm is faster than the Cosine one. For efficiency, we therefore use the Jaccard coefficient distance algorithm as the first filter. In an experimental performance comparison with our previous system, the newly proposed system reduces processing time by about 30%.
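The two-stage filtering idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the whitespace tokenizer, the function names, and the thresholds `jaccard_max` and `cosine_max` are assumptions chosen only for illustration. The sketch shows a presence-only Jaccard distance applied first, with the frequency-based Cosine distance computed only for documents that pass it.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def jaccard_distance(tokens_a, tokens_b):
    # Uses only the presence of words (sets), not their counts.
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def cosine_distance(tokens_a, tokens_b):
    # Uses word frequencies (term-count vectors).
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    shared = freq_a.keys() & freq_b.keys()
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def filter_candidates(suspect, originals, jaccard_max=0.8, cosine_max=0.5):
    # Hypothetical thresholds; the paper's actual values are not given here.
    suspect_tokens = tokenize(suspect)
    survivors = []
    for doc_id, text in originals.items():
        tokens = tokenize(text)
        # First filter: the cheaper Jaccard distance discards distant documents.
        if jaccard_distance(suspect_tokens, tokens) > jaccard_max:
            continue
        # Second filter: Cosine distance runs only on the remaining documents.
        if cosine_distance(suspect_tokens, tokens) <= cosine_max:
            survivors.append(doc_id)
    return survivors
```

Under these assumptions, documents returned by `filter_candidates` would be the ones passed on to the more expensive plagiarized area detection step; documents rejected by the Jaccard filter never incur the cost of building frequency vectors.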
KEYWORDS
Plagiarism, Jaccard Correlation Coefficient Distance, Filtering
Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Kwangho Song, Jihong Min, Gayoung Lee, Sang Chul Shin, Yoo-Sung Kim, "An Improvement of Plagiarized Area Detection System Using Jaccard Correlation Coefficient Distance Algorithm," Computer Science and Information Technology, Vol. 3, No. 3, pp. 76 - 80, 2015. DOI: 10.13189/csit.2015.030304.
(b). APA Format:
Kwangho Song, Jihong Min, Gayoung Lee, Sang Chul Shin, Yoo-Sung Kim (2015). An Improvement of Plagiarized Area Detection System Using Jaccard Correlation Coefficient Distance Algorithm. Computer Science and Information Technology, 3(3), 76 - 80. DOI: 10.13189/csit.2015.030304.