JSIR is a next-generation JavaScript analysis tool. At its core is an MLIR-based high-level intermediate representation, which supports both dataflow analysis and lossless conversion back to source. This unique design makes it suitable for source-to-source transformation.
JSIR is used at Google for code analysis and transformation use cases. For example:
- Decompilation: JSIR is used to decompile Hermes bytecode all the way to JavaScript code, relying on its ability to be fully lifted back to source code.
- Deobfuscation: JSIR is used to deobfuscate JavaScript, relying on its source-to-source transformation capability. See our latest paper on how we combine the Gemini LLM and JSIR for deobfuscation.
Driven by the diverse use cases of JavaScript analyses and transformations, JSIR needs to achieve two seemingly conflicting goals:
- It needs to be high-level enough to be lifted back to the AST, in order to support source-to-source transformation and decompilation.
- It needs to be low-level enough to facilitate dataflow analysis, in order to support taint analysis, constant propagation, etc.
To achieve these goals, JSIR defines a high-level IR that uses MLIR regions to accurately model control flow structures.
See intermediate_representation_design.md for details.
The easiest way to get started with JSIR is using Docker:
# Build the Docker image
docker build -t jsir:latest .
# Run jsir_gen
docker run --rm jsir:latest jsir_gen --help
# Analyze a JavaScript file
docker run --rm -v $(pwd):/workspace jsir:latest jsir_gen --input_file=/workspace/yourfile.js
We have only tested clang on Linux:
# Install clang:
sudo apt update
sudo apt install clang
We use the Bazel build system. It is recommended to use Bazelisk to manage Bazel versions:
# Install Bazelisk through npm:
sudo apt install npm
sudo npm install -g @bazel/bazelisk
Note: The build takes a lot of storage space. If you run out of space, Bazel will return a cryptic error.
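Because an out-of-space failure is hard to recognize from Bazel's error message, it can help to check free disk space before starting a build. The following is a minimal sketch, not part of JSIR: the helper name `free_gib` and the `MIN_FREE_GIB` variable are hypothetical, and the default threshold is an arbitrary placeholder, since this document does not state how much space the build needs. It also assumes Bazel writes its outputs under the home directory (the default on Linux is `~/.cache/bazel`), so it checks the filesystem that holds `$HOME`.

```shell
#!/usr/bin/env bash
# free_gib DIR: print the free space, in whole GiB, on the filesystem holding DIR.
# Hypothetical helper for illustration; not part of JSIR or Bazel.
free_gib() {
  # df -P prints one POSIX-format line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Placeholder threshold in GiB (an assumption, not a documented requirement).
MIN_FREE_GIB="${MIN_FREE_GIB:-50}"

available="$(free_gib "${HOME:-/}")"
if [ "$available" -lt "$MIN_FREE_GIB" ]; then
  echo "Warning: only ${available} GiB free; the build may fail with an unclear error." >&2
fi
```

If your Bazel output directory lives elsewhere, pass that directory to `free_gib` instead of `$HOME`.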
LLVM takes a long time to fetch and build. We can test if LLVM is properly included by building a part of it:
# This will fetch LLVM and build its support library:
bazelisk build @llvm-project//llvm:Support
To build JSIR:
# Build everything:
bazelisk build //...
# Or, build a single target:
bazelisk build //maldoca/js/ir:jsir_gen
# Or, build all targets in a directory:
bazelisk build //maldoca/js/ir/...
To run test cases:
# Run all tests:
bazelisk test //...
# Or, run a specific test:
bazelisk test //maldoca/js/quickjs:quickjs_test
# Or, run all tests under a directory:
bazelisk test //maldoca/js/ir/conversion/...
Convert a JavaScript source file to JSHIR:
bazelisk run //maldoca/js/ir:jsir_gen -- \
  --input_file=$(pwd)/maldoca/js/ir/conversion/tests/if_statement/input.js \
  --passes=source2ast,ast2hir
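To run the same conversion on a file of your own, the command above can be wrapped in a small function. This is a sketch only: the function name `jsir_gen_cmd` is hypothetical, and it reuses just the target, `--input_file`, and `--passes` values shown above. It prints the command instead of running it, so you can inspect it first. Relative paths are made absolute because `bazelisk run` does not execute the binary from your current directory, which is presumably why the example above prefixes the path with `$(pwd)`.

```shell
#!/usr/bin/env bash
# jsir_gen_cmd FILE: print a bazelisk command that converts FILE to JSHIR.
# Hypothetical helper; it only reuses the target and flags shown above.
# The body runs in a subshell ( ... ) so the variable does not leak to the caller.
jsir_gen_cmd() (
  input="$1"
  # Make relative paths absolute, mirroring the $(pwd) prefix in the example.
  case "$input" in
    /*) ;;
    *) input="$(pwd)/$input" ;;
  esac
  printf 'bazelisk run //maldoca/js/ir:jsir_gen -- --input_file="%s" --passes=source2ast,ast2hir\n' "$input"
)

# Print the command for the test input used above.
jsir_gen_cmd maldoca/js/ir/conversion/tests/if_statement/input.js
```

Run the printed line from the repository root to perform the conversion.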
- Adversarial JavaScript Analysis with MLIR
  Talk at LLVM Developers' Meeting 2024
- CASCADE: LLM-Powered JavaScript Deobfuscator at Google
  Paper about combining LLM + JSIR for JavaScript deobfuscation
This is not an official Google product.