Research group

Machine Learning Methods in Software Engineering

Code Clone Detection

Timofey Bryksin

Active

The project aims to improve lexical (token-based) methods of code clone detection. The proposed approach can be applied to any token-based tool: it consists of running the search several times with different parameters and merging the results. The necessary parameters are estimated, and the method is evaluated on two token-based clone detection tools: SourcererCC and CloneWorks.
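The idea of running the search under several configurations and merging the results can be illustrated with a minimal sketch. This is not the code or API of SourcererCC or CloneWorks: the whitespace tokenizer, the bag-of-tokens overlap measure, and the names find_clones, merged_search, min_tokens, and threshold are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def tokenize(code):
    """Very rough lexer: split on whitespace (an assumption made for brevity)."""
    return code.split()


def overlap_similarity(a, b):
    """Bag-of-tokens overlap: shared tokens divided by the size of the larger bag."""
    shared = sum((Counter(a) & Counter(b)).values())
    longest = max(len(a), len(b))
    return shared / longest if longest else 0.0


def find_clones(fragments, min_tokens, threshold):
    """Return the set of fragment-id pairs reported as clones for one configuration."""
    tokens = {fid: tokenize(src) for fid, src in fragments.items()}
    # Only fragments with at least min_tokens tokens take part in this run.
    eligible = sorted(fid for fid, toks in tokens.items() if len(toks) >= min_tokens)
    return {
        (x, y)
        for x, y in combinations(eligible, 2)
        if overlap_similarity(tokens[x], tokens[y]) >= threshold
    }


def merged_search(fragments, configs):
    """Run the search once per (min_tokens, threshold) pair and union the results."""
    merged = set()
    for min_tokens, threshold in configs:
        merged |= find_clones(fragments, min_tokens, threshold)
    return merged


# Hypothetical toy input: two long near-identical fragments, two short identical ones.
fragments = {
    "a": "int x = a + b ;",
    "b": "int y = a + b ;",
    "c": "return x ;",
    "d": "return x ;",
}
# Lenient threshold for long fragments, strict threshold for short ones.
configs = [(6, 0.7), (3, 0.9)]
clones = merged_search(fragments, configs)
```

In this toy input, neither configuration alone reports both pairs: the lenient run skips the short fragments, and the strict run rejects the long pair that differs in one token. Their union contains both, which is the motivation for merging.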

Modified version of SourcererCC on GitHub.

The developed approach is also used in a comprehensive study of plagiarism in Java code on GitHub.



Multi-Objective Optimization for Token-Based Clone Detection

February 2020

Yaroslav Golubev, Viktor Poletansky, Nikita Povarov, Timofey Bryksin
