How to develop a Plagiarism detector?

2 Answers

The language is nearly irrelevant. Another question exists that discusses this in more detail. Basically, the method suggested there is to use Google: extract passages from the target text and search for them.
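As a rough illustration of the "extract parts and search" idea, here is a minimal sketch that cuts a text into fixed-size word windows and wraps each one in quotes so a search engine treats it as an exact phrase. The function names `phrase_queries` and `search_url`, the window size of 8 words, and the use of a plain search URL instead of an API are all assumptions made for this example, not something the linked question prescribes.

```python
import re
from urllib.parse import quote_plus


def phrase_queries(text, words_per_phrase=8, step=8):
    """Split text into word windows and quote each one for exact-phrase search.

    Hypothetical helper: window size and step are arbitrary defaults.
    Texts shorter than words_per_phrase produce no queries.
    """
    words = re.findall(r"\w+(?:'\w+)?", text)
    queries = []
    for start in range(0, len(words) - words_per_phrase + 1, step):
        phrase = " ".join(words[start:start + words_per_phrase])
        queries.append(f'"{phrase}"')
    return queries


def search_url(query):
    # Assumption: a plain search URL is good enough for a manual check;
    # an automated tool would more likely go through a search API.
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    sample = "It was the best of times, it was the worst of times, it was the age of wisdom"
    for q in phrase_queries(sample, words_per_phrase=6, step=6):
        print(q, "->", search_url(q))
```

A smaller `step` than `words_per_phrase` gives overlapping windows, which catches copied passages that do not line up with the window boundaries, at the cost of more searches.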

115

answered Sep 28 '22 01:09

Sampson

I am building a plagiarism checker in Python as a hobby project. These are the steps I follow:

Tokenize the document.
Remove all stop words using the NLTK library.
Use the GenSim library to find the most relevant words, line by line. This can be done by building an LDA or LSA model of the document.
Use the Google Search API to search for those words.
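To make the shape of these steps concrete, here is a minimal standard-library sketch. It is not the NLTK/GenSim implementation: it swaps in a tiny hard-coded stop-word set for NLTK's stop-word list and a plain word-frequency count for the LDA or LSA model, and it stops at building the query strings rather than calling the Google Search API. All names (`STOP_WORDS`, `tokenize`, `keywords_per_line`, `build_queries`) are made up for this example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word set; NLTK's list is much larger.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "with",
}


def tokenize(line):
    """Step 1: lowercase the line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", line.lower())


def keywords_per_line(document, top_n=3):
    """Steps 2 and 3: drop stop words, then keep the top_n most frequent
    tokens of each non-empty line (a stand-in for LDA/LSA relevance)."""
    results = []
    for line in document.splitlines():
        tokens = [t for t in tokenize(line) if t not in STOP_WORDS]
        if not tokens:
            continue
        counts = Counter(tokens)
        results.append([word for word, _ in counts.most_common(top_n)])
    return results


def build_queries(keyword_lists):
    """Step 4 (preparation only): one search query string per line."""
    return [" ".join(words) for words in keyword_lists]


if __name__ == "__main__":
    doc = "The cat sat on the cat mat\n\nDogs chase the cat"
    print(build_queries(keywords_per_line(doc, top_n=2)))
```

Frequency counting favours repeated words rather than topical ones, so it is only a placeholder; replacing `keywords_per_line` with a topic-model step is where GenSim would come in.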

Note: you could instead pass the whole document to the Google API in a single search. That works when you are dealing with a small amount of data. However, when building a plagiarism checker for whole sites and web-scraped data, you will need to apply the NLTK processing first.

The Google Search API then returns the top articles that contain the same words the LDA or LSA model from the GenSim library produced.

Hope it helped.

answered Sep 28 '22 00:09

Sumukh Bhandarkar

Tags:

projects

deovrat singh
