Research group

Machine Learning Methods in Software Engineering


Timofey BryksinActive

In this project, we develop and build AntiCopyPaster — an IntelliJ plugin that tracks the pasting of code inside the IDE and suggests appropriate Extract Method refactorings to combat the propagation of duplicates. To implement the plugin, we gathered a dataset of code fragments that should and should not be extracted, compiled a list of metrics of code that can influence the decision, and trained several popular classifying machine learning models, of which Random Forest showed the best results. When a developer pastes a code fragment, the plugin searches for duplicates in the currently opened file. If there are any, it waits for a short period of time to allow the developer to edit the code. If the code duplicates are still present after a delay, AntiCopyPaster calculates the metrics for the fragment and inferences the decision: if the fragment should be extracted, the plugin suggests to refactor it.

The plugin is available on GitHub.



AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE

October 2022

Eman Abdullah AlOmar, Anton Ivanov, Zarina Kurbatova, Yaroslav Golubev, Mohamed Wiem Mkaouer, Ali Ouni, Timofey Bryksin, Le Nguyen, Amit Kini, and Aditya Thakur

Read more