Research group

Machine Learning Methods in Software Engineering

The Dynamics of Topics in Code

Project supervisor: Timofey Bryksin
Status: Active

In this project, we study how topics in code change over time. The general idea is as follows: we gather a large corpus of code and take "temporal slices" of it, that is, we extract its state at various points in the past. We then perform topic modeling on this data, which allows us not only to identify the topics in code but also to track how they change over time. Additionally, we plan to use information about contributors to study topic dynamics from the developers' standpoint: for example, if a developer is active in topic A, are they likely to shift to topic B?
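
To make the idea concrete, here is a minimal Python sketch of the two analyses described above: topic shares per temporal slice, and developer transitions between topics. It is an illustration under stated assumptions, not the project's actual pipeline. The record format, the sample data, and the function names (topic_shares, dominant_topics, topic_transitions) are all hypothetical, and the topic labels are assumed to have already been produced by some topic model.

```python
from collections import Counter, defaultdict

# Hypothetical input: one record per contribution, as (slice, developer, topic).
# The topic labels are assumed to come from a topic model fitted beforehand.
RECORDS = [
    ("2019", "alice", "testing"),
    ("2019", "alice", "testing"),
    ("2019", "bob", "networking"),
    ("2020", "alice", "networking"),
    ("2020", "bob", "networking"),
    ("2020", "bob", "ui"),
    ("2020", "bob", "ui"),
    ("2021", "alice", "networking"),
    ("2021", "bob", "ui"),
]


def topic_shares(records):
    """Return {slice: {topic: fraction of that slice's contributions}}."""
    counts = defaultdict(Counter)
    for time_slice, _developer, topic in records:
        counts[time_slice][topic] += 1
    return {
        time_slice: {t: n / sum(c.values()) for t, n in c.items()}
        for time_slice, c in counts.items()
    }


def dominant_topics(records):
    """Return {developer: {slice: most frequent topic in that slice}}."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for time_slice, developer, topic in records:
        counts[developer][time_slice][topic] += 1
    return {
        developer: {s: c.most_common(1)[0][0] for s, c in by_slice.items()}
        for developer, by_slice in counts.items()
    }


def topic_transitions(records):
    """Count (topic, next topic) pairs across each developer's consecutive slices.

    "Consecutive" here means the next slice in which the developer is active;
    slice labels are assumed to sort chronologically as strings.
    """
    transitions = Counter()
    for by_slice in dominant_topics(records).values():
        ordered = [by_slice[s] for s in sorted(by_slice)]
        transitions.update(zip(ordered, ordered[1:]))
    return transitions
```

Calling topic_shares(RECORDS) gives the share of each topic in every slice, and topic_transitions(RECORDS) counts how often a developer's dominant topic moves from one topic to another between slices. Reducing each developer to a single dominant topic per slice is a simplification chosen for brevity; a real study would more likely keep the full per-developer topic distribution.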

The project repository is available on GitHub.