CISA is an LLVM-based IR static analysis framework that supports incremental analysis over
the git commit history.
The basic philosophy is to do costly static analyses (e.g., indirect call graph analysis) incrementally while scanning through the commit history. Each analysis is recomputed only for the parts a commit modifies (hence incremental) and, like LLVM IR passes, can refer to the results of other analyses.
It is still in its infancy and supports only a limited feature set (e.g., analyses can refer only to the call graph analysis, not to other custom ones). If you are reading this, I welcome any contribution.
As the introduction mentions, CISA aims to analyze only the parts changed by commits. To do so, CISA scans the commit history within a given range in chronological order. For each entity X changed by the current commit (e.g., a changed function or module), it first updates the analysis inside the changed part and then aggregates the up-to-date analysis results. For this, a custom analyzer must implement the following two callbacks: Update(X) and Aggregate(X).
- Update(X): update the analysis for the changed entity X. This only updates the analysis inside X.
- Aggregate(X): aggregate the up-to-date analysis result for the changed entity X. This assembles the analysis done by Update and produces the final analysis result. Aggregate is always called after every possible Update has been called, so it's safe to assume all entities in the source code have up-to-date analysis states.
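To make the split between the two callbacks concrete, here is a minimal, self-contained sketch. It does not use the real CISA interface: `Entity`, `ReachableCalleeAnalyzer`, and their members are hypothetical names made up for illustration, and the entity is modeled as a function name plus its direct callees. Update only records what is inside X, while Aggregate combines those per-entity records into a final result (here, the number of functions reachable from X).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a changed entity (a function); not a CISA type.
struct Entity {
    std::string name;
    std::vector<std::string> callees;  // direct callees found in the body
};

// Hypothetical analyzer: how many functions are reachable from X?
class ReachableCalleeAnalyzer {
public:
    // Update(X): refresh only the state that depends on X's own body.
    void Update(const Entity &x) { local_[x.name] = x.callees; }

    // Aggregate(X): assemble the per-entity states into X's final result.
    std::size_t Aggregate(const Entity &x) const {
        // Relies on the guarantee that Update has already run for X.
        assert(local_.count(x.name) == 1 && "Update(X) must precede Aggregate(X)");
        std::set<std::string> seen;
        std::vector<std::string> work{x.name};
        while (!work.empty()) {
            const std::string cur = work.back();
            work.pop_back();
            const auto it = local_.find(cur);
            if (it == local_.end()) continue;  // no body known (e.g., external)
            for (const std::string &callee : it->second) {
                if (seen.insert(callee).second) work.push_back(callee);
            }
        }
        return seen.size();
    }

private:
    std::map<std::string, std::vector<std::string>> local_;  // name -> callees
};
```

When a later commit changes only one function, only that function needs another Update call; Aggregate then reuses the stored states of all untouched functions.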
The following is what developing and using a custom analyzer would look like.
- Write a custom analyzer (in src/analyzer) that implements Update and Aggregate.
- Build again ($ make).
- Run the CISA front-end ($ ./cisa <repo_path> -o <out_path>).
  - For each commit from the beginning to the end, CISA calls Update with all changed entities first and calls Aggregate next.
- Inspect the printed analysis result in <out_path>.
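The per-commit call order described in the steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions, not CISA's actual front-end or back-end code: `Commit`, `Callback`, and `ReplayHistory` are hypothetical names, and a commit is reduced to the list of entity names it changed.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of a commit: an id plus the names of the changed entities.
struct Commit {
    std::string id;
    std::vector<std::string> changed;
};

// Hypothetical callback type standing in for an analyzer's Update/Aggregate.
using Callback = std::function<void(const std::string &)>;

// Replays the history oldest-first. Within each commit, every Update call
// finishes before the first Aggregate call starts.
void ReplayHistory(const std::vector<Commit> &history,
                   const Callback &update,
                   const Callback &aggregate) {
    for (const Commit &commit : history) {
        for (const std::string &entity : commit.changed) update(entity);
        for (const std::string &entity : commit.changed) aggregate(entity);
    }
}
```

Running the two inner loops separately, rather than calling Update and Aggregate back to back per entity, is what lets Aggregate assume that every changed entity already has an up-to-date state.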
- Integrated call graph analysis [MLTA, CCS'19]
- Nice C++ interface for custom function-level analyses
- LLVM 15.0.5
- Python 3.8.0+
- CMake 3.16.3+
- Python packages: gitpython, termcolor, alive_progress
- Install prerequisites. (assuming Ubuntu 20.04+)
  - Make sure that python is python3 and pip is pip3.
$ sudo apt install python3 python3-pip python-is-python3 cmake
$ sudo pip install gitpython termcolor alive_progress
- Decompress the prebuilt LLVM 15 binary to llvm at the root.
  - Or you can create a symlink llvm to the LLVM install directory (if you built LLVM on your own).
$ # Example (assuming Ubuntu 20.04+); run at the root directory.
$ wget https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.5/clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ tar -xvf clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ rm clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ mv clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04 llvm
- Make.
$ make # at the root directory.
See this page for a dockerized setting.
- script: CISA front-end scripts (Python)
- src: CISA back-end code (C++)
  - analyzer: where custom analyzers reside
  - callgraph: incremental call graph analysis (MLTA)
- extern: external dependencies
- Supporting references to LLVM objects (e.g., Function) in custom analyses
- Supporting custom module-level analyses
- Converting the integrated call graph analysis to a custom module-level analysis
- Supporting custom analysis inter-operability
- Improving initial checkout delay