
Plagiarism scanner integrated into an editorial approval workflow

By: Thiago Campos Viana | March 6, 2017 | Business solutions, Web solutions, plagiarism scan, workflow, and plagscan

In our case study "FindaTopDoc Prescribes eZ Publish for Healthy Content Management", we briefly covered our integration of PlagScan into the editorial approval workflow. When writing about medical topics, content (especially definitions of medical terms) can end up duplicating text on other sites, even when nothing was purposely copied. For SEO reasons, it is therefore important to ensure that all content on the FindaTopDoc site is as unique as possible. Here we'll take a closer look at how the plagiarism scanner integration works.

When dealing with a large group of writers, it can be time-consuming to check the submitted articles for plagiarized text. Checking each article manually against everything published on the web is practically impossible. This is where a service like PlagScan comes in handy.

 

PlagScan Service


PlagScan is a web service that verifies the authenticity of documents. It compares submitted content against billions of documents and quickly returns a score ranging from 0% (no plagiarism detected) to 100% (the text is a full copy and paste).

For FindaTopDoc, we configured a rule that an article can be published only if its duplicate-content score does not exceed 20%.

PlagScan Report

PlagScan full report

Using the PlagScan API, Mugo integrated PlagScan seamlessly into the eZ Publish back-end. When a writer creates an article in the back-end, they can run the plagiarism scanner themselves up to three times before sending the article to an editor. The system sends the article text to PlagScan, and the writer can then check the score and the plagiarism report. Based on the results, the writer can decide to make more edits to the text or submit the article to an editor.
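
To make the three-scan limit concrete, here is a minimal sketch in Python. It is not the actual eZ Publish integration: `ArticleDraft`, `run_writer_scan`, and `scan_fn` are hypothetical names, and `scan_fn` stands in for whatever call sends the text to PlagScan and returns a score from 0 to 100.

```python
from dataclasses import dataclass, field

# From the article: a writer may run the scanner up to three times.
MAX_WRITER_SCANS = 3


@dataclass
class ArticleDraft:
    """Hypothetical stand-in for an article being edited in the back-end."""
    text: str
    scans_used: int = 0
    scores: list = field(default_factory=list)


def run_writer_scan(draft, scan_fn):
    """Run one writer-initiated scan, enforcing the three-scan limit.

    scan_fn is a placeholder for the real PlagScan request; it takes the
    article text and returns a score between 0 and 100.
    """
    if draft.scans_used >= MAX_WRITER_SCANS:
        raise RuntimeError("Scan limit reached; submit the article to an editor.")
    score = scan_fn(draft.text)
    draft.scans_used += 1
    draft.scores.append(score)  # keep a history so the writer can compare runs
    return score
```

Counting a scan only after `scan_fn` returns means a failed request does not use up one of the writer's three attempts. That is a design choice of this sketch, not something the original integration is known to do.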

PlagScan run and score

Content creators can run PlagScan and check the score

PlagScan attribute

Content creators can also check PlagScan score and report while editing

The editor can then do a final review before setting the status to "Ready for Scan".

PlagScan Workflow View

The pending view allows moderators to claim documents, review them, and then set the state to "Ready for Scan". Moderators can also check the plagiarism score and the reports from this view.

When the document is set to "Ready for Scan", the system performs a final scan before publishing it. It submits the article to PlagScan one more time, parses the response, and checks the final score. If the score is equal to or below 20%, the article is published. Otherwise, the article is set to "Failed", meaning the writer and/or editor will need to update the article appropriately and re-submit it through the workflow.
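
The final-scan decision described above can be sketched as a small state transition. This is an illustrative Python sketch under stated assumptions, not the production workflow code: the `State` enum, `final_scan`, and `scan_fn` are hypothetical names, and "Published" is an assumed label for the published state (the article names only "Ready for Scan" and "Failed").

```python
from enum import Enum

# From the article: publish only when the score is at or below 20%.
MAX_SCORE_PERCENT = 20.0


class State(Enum):
    READY_FOR_SCAN = "Ready for Scan"
    PUBLISHED = "Published"  # assumed label, not named in the article
    FAILED = "Failed"


def final_scan(state, text, scan_fn):
    """Return the next workflow state for an article.

    scan_fn is a placeholder for the real PlagScan request; it takes the
    article text and returns a score between 0 and 100.
    """
    if state is not State.READY_FOR_SCAN:
        return state  # only articles marked "Ready for Scan" are scanned
    score = scan_fn(text)
    if score <= MAX_SCORE_PERCENT:
        return State.PUBLISHED
    return State.FAILED
```

An article that lands in `State.FAILED` would be edited and moved back to "Ready for Scan" by the writer or editor, at which point the same function applies again.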

The final result is a powerful, streamlined editorial workflow -- with special SEO considerations -- built on top of eZ Publish.