LLVM Crash Course
Classical Compiler Approach
Lexing
Parsing
Type Checking
Time/Space Improvements
Instruction Selection
Register Allocation
Instruction Scheduling
C, C++, Obj-C: Source Code -> Frontend -> Language-Specific AST -> Optimizer -> Backend -> Machine Code
Fortran: Source Code -> Frontend -> Language-Specific AST -> Optimizer -> Backend -> Machine Code
LLVM Approach
Frontends (each emits the Common Intermediate Representation):
  Your Code (C, C++, Obj-C) -> Frontend (Clang)
  Your Code (Python) -> Frontend (Python)
  Your Code (Fortran) -> Frontend (Fortran)
Common Intermediate Representation -> LLVM Optimizer
Backends (each consumes the optimized IR):
  LLVM Backend (ARM) -> Machine Code (ARM)
  LLVM Backend (x86) -> Machine Code (x86)
  LLVM Backend (PowerPC) -> Machine Code (PowerPC)
Intermediate representation (IR)
What uses IR?
LLVM compiles IR into native code (e.g., x86 assembly)
Compiler front-ends generate IR
You can write your own IR code
Why use IR?
Common representation for many programming languages
Good for optimizations!
C:
int main() {
  int a = 10;
  int b = a + 5;
  return 0;
}
IR:
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %a = alloca i32, align 4
  %b = alloca i32, align 4
  store i32 0, i32* %retval
  store i32 10, i32* %a, align 4
  %0 = load i32* %a, align 4
  %add = add nsw i32 %0, 5
  store i32 %add, i32* %b, align 4
  ret i32 0
}
C:
#include <stdio.h>
int main() {
  printf("Hello world!\n");
  return 42;
}
IR:
@.str = private unnamed_addr constant [14 x i8] c"Hello world!\0A\00", align 1
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  store i32 0, i32* %retval
  %call = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([14 x i8]* @.str, i32 0, i32 0))
  ret i32 42
}
Passes
Your Code (C, C++, Obj-C) -> Front End (Clang) -> Common Intermediate Representation -> LLVM Optimizer -> Back End (LLVM) -> Machine Code (x86, ARM)
Passes are applied to IR
Passes
An LLVM pass is an operation on a piece of IR
Used on modules, functions, instructions, etc.
What can LLVM passes do?
Mutate/transform IR
Compute higher-order information about IR
Used for optimizations and analysis
Can depend on other passes
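A minimal sketch of the ideas above, in plain C++ with no LLVM dependency. ToyInstruction, ToyFunction, countInstructions, and removeNops are hypothetical names invented for illustration; they are not LLVM classes or functions. The sketch shows one routine that only computes information, and one that mutates the toy IR and depends on the first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for IR. These are NOT LLVM types.
struct ToyInstruction {
  std::string opcode;  // e.g. "add", "store", "ret", "nop"
};

struct ToyFunction {
  std::string name;
  std::vector<ToyInstruction> body;
};

// Analysis-style routine: computes information, leaves the function untouched.
std::size_t countInstructions(const ToyFunction &f) {
  return f.body.size();
}

// Transformation-style routine: mutates the function.
// It depends on the analysis above to decide whether anything changed,
// and returns true only if the function was modified.
bool removeNops(ToyFunction &f) {
  const std::size_t before = countInstructions(f);
  std::vector<ToyInstruction> kept;
  for (const ToyInstruction &inst : f.body) {
    if (inst.opcode != "nop") {
      kept.push_back(inst);
    }
  }
  f.body = kept;
  return countInstructions(f) != before;
}
```

Calling removeNops on a function whose body is nop, add, nop, ret leaves add, ret and reports a change; a second call reports no change.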
Transformation Passes
Passes mutate the IR directly
Clean up and canonicalize code from front-end
Optimize code
Examples:
Dead code elimination
Merging functions
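To make dead code elimination concrete, here is a rough sketch on a toy, SSA-like instruction list. It is plain C++ and does not use LLVM; ToyInst and eliminateDeadCode are hypothetical names, and the rule used (drop any instruction with no side effects whose result is never used, repeat until nothing changes) is a simplification of what a real pass does.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy instruction: NOT an LLVM type.
struct ToyInst {
  std::string result;                 // value name it defines, "" if none
  std::vector<std::string> operands;  // value names it reads
  bool hasSideEffects;                // e.g. store, call, ret
};

// Removes instructions that have no side effects and whose result is unused.
// Repeats until a fixed point, because deleting one instruction can make
// the instructions feeding it dead as well. Returns true if anything changed.
bool eliminateDeadCode(std::vector<ToyInst> &body) {
  bool changedAny = false;
  bool changed = true;
  while (changed) {
    changed = false;

    // Collect every value name that some instruction still reads.
    std::set<std::string> used;
    for (const ToyInst &inst : body) {
      for (const std::string &op : inst.operands) {
        used.insert(op);
      }
    }

    // Keep only instructions that are observable or still used.
    std::vector<ToyInst> kept;
    for (const ToyInst &inst : body) {
      const bool dead = !inst.hasSideEffects && used.count(inst.result) == 0;
      if (dead) {
        changed = true;
      } else {
        kept.push_back(inst);
      }
    }
    body = kept;
    changedAny = changedAny || changed;
  }
  return changedAny;
}
```

For a body holding %a (a constant), %add (which reads %a), and a ret, the first sweep removes %add because nothing reads it, the second removes %a, and only the ret remains.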
Analysis Passes
Passes do not mutate the IR
Calculate information about the IR
Examples:
Instruction count
Alias analysis!
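An instruction count can be sketched without LLVM as a histogram over opcode names. countOpcodes is a hypothetical helper written for illustration, and representing a function body as a list of opcode strings is an assumption made to keep the example self-contained; a real pass would walk LLVM's own instruction objects instead.

```cpp
#include <map>
#include <string>
#include <vector>

// Analysis-style routine: reads the opcode list and returns how often each
// opcode occurs. The input is taken by const reference and never modified.
std::map<std::string, int> countOpcodes(const std::vector<std::string> &opcodes) {
  std::map<std::string, int> histogram;
  for (const std::string &op : opcodes) {
    ++histogram[op];  // operator[] value-initializes missing keys to 0
  }
  return histogram;
}
```

Fed the opcodes of the first IR example in these slides (three alloca, three store, one load, one add, one ret), the histogram holds five entries that sum to nine instructions.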
A Simple LLVM Analysis Pass
// LLVM is a collection of libraries.
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
// Hello is a subclass of FunctionPass. It will operate on each function
// in the source file.
struct Hello : public FunctionPass {
  // Create a unique ID for this pass.
  static char ID;
  Hello() : FunctionPass(ID) {}

  // runOnFunction is a pure virtual method in FunctionPass.
  // This is where the analysis code goes.
  bool runOnFunction(Function &F) override {
    errs() << "Hello: ";
    errs().write_escaped(F.getName()) << '\n';
    return false;  // false: the IR was not modified
  }
};
}

// Initialization of the pass ID. The value does not matter.
char Hello::ID = 0;

// Register the pass. It can be invoked on the command line via opt with -hello.
static RegisterPass<Hello> X("hello", "Hello World Pass", false, false);

Reference: http://llvm.org/docs/WritingAnLLVMPass.html
Compiling and Running the Pass
Can be built either in the LLVM source tree (via Makefile) or against compiled
LLVM binaries (via CMake)
Compiles to a shared object
Invoked via the opt command
opt -load hello.so -hello < someCode.bc > /dev/null
  -load hello.so: compiled pass
  -hello: pass to run (chosen in registration code)
  < someCode.bc: LLVM IR file to analyze
  > /dev/null: drop output, since this is an analysis pass
Lessons Learned Installing LLVM/Clang
Install Clang first, even if you don't plan to use it
CMake expects it and its dependencies to be present when building LLVM
Incompatibilities in Clang and the standard library
You may encounter a strange error: "no member named 'gets' in the global namespace". Not all versions of GCC's standard library
implementation are compatible with all versions of Clang.
Always use CMake
Makefile support is minimal and documentation is often wrong
Always build from source
Repository binaries often lack dependencies needed when building a pass.