Analyzes the git structure of a project to find the best configuration for co-change detection.
Also analyzes output from Market Basket Analysis, Dynamic Time Warping and CoCo.
Utilizes smells reported by ASTracker (https://github.com/darius-sas/astracker).
Related to CoCo (https://github.com/RonaldKruizinga/CoSmellingChanges).
The requirements to use this script are:
- Python 3.7
- Pip
The source code for this project can be cloned using Git or downloaded from GitHub.
The required packages can be installed via the included Pipfile.
The analysis can be run in two ways. The first is by running Main.py. This file references a few different methods for various purposes:
- hyper_param_analysis() and threshold_distribution() are used to generate hyperparameters for CoCo. threshold_distribution() requires CoCo to have been run with a threshold of 1 first, so that it can calculate the distribution.
- generate_analysis() is used to run the market basket analysis and dynamic time warping.
- run_exploration() runs a data exploration on all three algorithms, also using the output from ASTracker. For large projects, this should be done on a server cluster with a significant amount of memory. Example scripts for this can be found in the execution_scripts folder. These examples are used to run the project on the Peregrine Cluster of the University of Groningen.
- results_analysis() analyses the results and provides some interesting plots.
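To illustrate why a run with a threshold of 1 is needed before a distribution can be calculated: at the lowest threshold every pair of files that changed together at least once is reported, so the full spread of co-change counts is visible. The sketch below is not the project's threshold_distribution(); the function name survival_by_threshold and the input format (a mapping from file pair to co-change count) are assumptions made for illustration only.

```python
from collections import Counter


def survival_by_threshold(pair_counts):
    """Map each candidate threshold t to the fraction of pairs with count >= t.

    pair_counts is a hypothetical input: {(file_a, file_b): co_change_count},
    as it might be derived from a run with a threshold of 1.
    """
    if not pair_counts:
        return {}
    total = len(pair_counts)
    # How many pairs have exactly each count.
    histogram = Counter(pair_counts.values())
    distribution = {}
    remaining = total
    for threshold in range(1, max(histogram) + 1):
        # 'remaining' is the number of pairs whose count is >= threshold.
        distribution[threshold] = remaining / total
        remaining -= histogram.get(threshold, 0)
    return distribution


example_counts = {
    ("A.java", "B.java"): 1,
    ("A.java", "C.java"): 1,
    ("B.java", "C.java"): 2,
    ("C.java", "D.java"): 4,
}
example_distribution = survival_by_threshold(example_counts)
```

A table like this can help when choosing a threshold, since it shows how many candidate pairs each threshold value would keep.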
In order to use the application to analyse hyperparameters for a project, certain properties in the config.py, located in the main directory, need to be set.
An example is left in the config for the Sonarlint-IntelliJ project.
- A GitHub API key is required in order to recover the dates of analyzed commits.
- A start and end date of the analysis are required as a time range.
- Project information, such as name, url, branch and owner, is required.
- Directories, such as input, output and cloning directories, can be configured.
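As a rough sketch of the kind of settings listed above, the snippet below groups them in a dictionary and checks them for obvious mistakes. All keys, values and the check_config helper are hypothetical placeholders; the actual property names are the ones found in config.py.

```python
from datetime import date

# Illustrative placeholders only; these are NOT the names used in config.py.
example_config = {
    "github_api_key": "<your-github-api-key>",
    # Time range of the analysis.
    "start_date": date(2020, 1, 1),
    "end_date": date(2020, 12, 31),
    # Project information.
    "project_name": "<project-name>",
    "project_owner": "<project-owner>",
    "project_url": "https://github.com/<project-owner>/<project-name>",
    "project_branch": "<branch>",
    # Directories.
    "input_directory": "input",
    "output_directory": "output",
    "cloning_directory": "clones",
}


def check_config(config):
    """Return a list of problems found in the configuration (empty if none)."""
    problems = []
    for key, value in config.items():
        if value in ("", None):
            problems.append(f"{key} is not set")
    if config["start_date"] >= config["end_date"]:
        problems.append("start_date must be before end_date")
    return problems
```

Running a check like this before starting a long analysis can catch a missing key or a reversed date range early.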
As input, the application may require the output from CoCo and ASTracker.
The system outputs the co-changes detected using the MBA and DTW algorithms, as well as statistics regarding the overlap of smells with co-changes.
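The idea behind this output can be sketched in a simplified form: count how often two files change in the same commit, keep the pairs that reach a minimum support, and measure how many of those pairs also share a smell. This is a minimal illustration in the spirit of market basket analysis, not the implementation used in this project; the function names, the input format and the example data are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_change_pairs(commits, min_support=2):
    """Count how often each pair of files changes in the same commit.

    commits: iterable of collections of file paths, one per commit.
    Returns {(file_a, file_b): count} for pairs seen at least min_support times.
    """
    pair_counts = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def smell_overlap(co_changes, smelly_pairs):
    """Fraction of co-changing pairs that also appear in smelly_pairs."""
    if not co_changes:
        return 0.0
    normalized = {tuple(sorted(pair)) for pair in smelly_pairs}
    matches = sum(1 for pair in co_changes if pair in normalized)
    return matches / len(co_changes)


# Made-up commit history for illustration.
example_commits = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["B.java", "C.java"],
    ["A.java"],
]
example_pairs = co_change_pairs(example_commits, min_support=2)
example_overlap = smell_overlap(example_pairs, [("B.java", "A.java")])
```

Here the pairs (A.java, B.java) and (B.java, C.java) each change together twice, and one of the two is marked as smelly, so the overlap is one half.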
- helper_scripts provides utility functions
- hyperparameter_analysis provides all necessary functions for that purpose
- model contains the classes defined
- results_analysis contains code for plotting interesting plots.