The Dynamics of Topics in Code
In this project, we study the dynamics of topics in code, that is, how they change over time. The general idea is as follows: we gather a large corpus of code and take "temporal slices" of it, meaning that we extract its state at various points in the past. We then perform topic modeling on this data, which allows us not only to identify the topics in the code but also to track how they change over time. Additionally, we plan to use information about contributors to study the dynamics of topics from their perspective: for example, are there correlations suggesting that a developer who is active in topic A is likely to shift to topic B?
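The contributor question can be made concrete with a small sketch. It assumes that topic modeling has already assigned each developer one dominant topic per temporal slice; the function name `topic_transitions`, the developer names, and the topic labels are all hypothetical and the data is made up for illustration. The sketch only counts how often a developer's dominant topic in one slice is followed by another topic in the next slice, which is one simple way to look for A-to-B shifts, not necessarily the method used in the project.

```python
from collections import Counter, defaultdict


def topic_transitions(dominant_topics):
    """Estimate how often developers move from one topic to another.

    dominant_topics maps a developer name to the list of their dominant
    topic in each consecutive temporal slice (oldest first).
    Returns {source_topic: {target_topic: probability}}.
    """
    counts = defaultdict(Counter)
    for topics in dominant_topics.values():
        # Pair each slice with the next one: (t0, t1), (t1, t2), ...
        for source, target in zip(topics, topics[1:]):
            counts[source][target] += 1

    probabilities = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        # Normalize counts so each source topic's row sums to 1.
        probabilities[source] = {t: n / total for t, n in targets.items()}
    return probabilities


# Toy, made-up data (an assumption for illustration only):
# three developers observed over four temporal slices.
example = {
    "dev_1": ["parsing", "parsing", "testing", "testing"],
    "dev_2": ["parsing", "testing", "testing", "networking"],
    "dev_3": ["networking", "networking", "parsing", "testing"],
}

transitions = topic_transitions(example)
```

Here `transitions["parsing"]["testing"]` is the fraction of slices in which a developer whose dominant topic was "parsing" had "testing" as the dominant topic in the following slice. On real data, such estimates would need enough observations per topic to be meaningful.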
The project's repository is available on GitHub.