Code Clone Detection
This project aims to improve lexical (token-based) methods of clone detection in code. The proposed approach can be applied to any token-based tool: it consists of running the search several times with different parameters and merging the results. The necessary parameters are estimated, and the method is evaluated on two token-based clone detection tools, SourcererCC and CloneWorks.
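The idea of running the search with several parameter sets and merging the results can be illustrated with a minimal Python sketch. It does not use SourcererCC or CloneWorks: `detect_clones` is a hypothetical toy detector based on bag-of-tokens overlap, and the parameter names (`min_tokens`, `threshold`), their values, and the sample fragments are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations


def overlap_similarity(a, b):
    """Share of tokens the two bags have in common, relative to the larger bag."""
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / max(len(a), len(b))


def detect_clones(fragments, min_tokens, threshold):
    """Toy stand-in for ONE run of a token-based detector (hypothetical, not a real tool)."""
    # Fragments shorter than min_tokens are ignored in this run.
    eligible = {name: toks for name, toks in fragments.items() if len(toks) >= min_tokens}
    return {
        frozenset((x, y))
        for x, y in combinations(sorted(eligible), 2)
        if overlap_similarity(eligible[x], eligible[y]) >= threshold
    }


def merged_search(fragments, configurations):
    """Run the detector once per configuration and take the union of the clone pairs."""
    merged = set()
    for min_tokens, threshold in configurations:
        merged |= detect_clones(fragments, min_tokens, threshold)
    return merged


# Illustrative data: two short fragments and two longer ones, already tokenized.
fragments = {
    "a": "int x = 0 ;".split(),
    "b": "int y = 0 ;".split(),
    "c": "for ( int i = 0 ; i < n ; i ++ ) sum += i ;".split(),
    "d": "for ( int j = 0 ; j < n ; j ++ ) total += j ;".split(),
}

# Assumed configurations: a strict threshold that admits short fragments,
# and a looser threshold restricted to long fragments.
configurations = [(3, 0.75), (10, 0.70)]

merged = merged_search(fragments, configurations)
for pair in sorted(sorted(p) for p in merged):
    print(pair)
```

In this toy setup each configuration finds only one of the two pairs: the strict run reports the short pair `a`/`b` but rejects `c`/`d`, while the looser run reports `c`/`d` but skips the short fragments altogether. The union contains both, which is the effect the merging step is meant to achieve.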
A modified version of SourcererCC is available on GitHub.
The developed approach is also employed in a comprehensive study of plagiarism in Java code on GitHub.
Multi-Objective Optimization for Token-Based Clone Detection
Yaroslav Golubev, Viktor Poletansky, Nikita Povarov, Timofey Bryksin