The projects were retrieved through the GitHub API using the search qualifier
key = full GitHub project name (user ID/project ID)
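
As an illustration only, the retrieval step might look like the sketch below. It assumes the repository search endpoint of the GitHub REST API and assumes that the key above corresponds to the `full_name` field of each returned item. The qualifier `language:python` is a placeholder, not the qualifier used to build this dataset, and the helper names `build_search_url` and `project_keys` are hypothetical.

```python
from urllib.parse import urlencode

# Repository search endpoint of the GitHub REST API.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(qualifier, page=1, per_page=100):
    """Return a repository-search URL for the given qualifier string."""
    params = {"q": qualifier, "page": page, "per_page": per_page}
    return f"{SEARCH_URL}?{urlencode(params)}"


def project_keys(response_json):
    """Extract 'user ID/project ID' keys from a decoded search response."""
    return [item["full_name"] for item in response_json.get("items", [])]


# Placeholder qualifier (an assumption, not the one used for the dataset).
url = build_search_url("language:python")

# Hand-written sample in the shape of a search response, so that the
# sketch runs without network access.
sample = {"items": [{"full_name": "octocat/Hello-World"}]}
keys = project_keys(sample)
```

In practice the URL would be fetched with an HTTP client, the body decoded as JSON, and `page` incremented until no further items are returned.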
If you use our dataset, please cite the following article (accepted for publication):
Zalán Bodó, Bipin Indurkhya. Software Categorization Using Low-Level Distributional Features. Accepted to SOMET 2017.