
Commit a29f0dd

[llubi] Add initial support for llubi (#180022)
This patch implements the initial support for upstreaming [llubi](https://github.com/dtcxzyw/llvm-ub-aware-interpreter). It only provides the minimal functionality to run a simple main function. I hope we can focus on the interface design in this PR, rather than trivial implementations for each instruction.

RFC link: https://discourse.llvm.org/t/rfc-upstreaming-llvm-ub-aware-interpreter/89645

Excluding the driver `llubi.cpp`, this patch contains three components for better decoupling:
+ `Value.h/cpp`: Value representation
+ `Context.h/cpp`: Global state management (e.g., memory) and interpreter configuration
+ `Interpreter.cpp`: The main interpreter loop

Compared to the out-of-tree version, the major differences are listed below:
+ The interpreter logic always returns control to its caller, i.e., it never calls `exit/abort` when immediate UBs are triggered.
+ `EventHandler` provides an interface to dump the trace. It also allows callers to inspect the actual value and verify the correctness of analysis passes (e.g., KnownBits/SCEV).
+ The context is designed to be reentrant. That is, you can call `runFunction` multiple times. But its usefulness remains in doubt due to side effects made by previous calls.
+ `runFunction` handles function calls with a loop, instead of calling itself recursively. This makes it no longer bounded by the stack depth.
+ Uninitialized memory is planned to be approximated by returning random values each time an uninitialized byte is loaded.
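The point about `runFunction` handling calls with a loop can be illustrated with a toy example. The following is only a minimal sketch, not code from this patch: `Frame` and `runToy` are hypothetical names, and the "program" is a single hard-coded function `depth(n) = n == 0 ? 0 : 1 + depth(n - 1)` rather than LLVM IR. It shows the general technique of keeping activation records in a heap-allocated vector so that deep call chains do not consume the host stack. The depth check only mimics the idea behind `-max-stack-depth`; its exact semantics in llubi are an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical activation record for the toy function depth(n).
struct Frame {
  uint64_t Arg; // argument of this activation
  int PC;       // 0: before the nested call, 1: resuming after the callee returned
};

// Evaluates depth(N) with an explicit call stack instead of native recursion.
// Returns std::nullopt if more than MaxStackDepth frames would be needed
// (MaxStackDepth == 0 means "no limit" in this sketch).
std::optional<uint64_t> runToy(uint64_t N, size_t MaxStackDepth) {
  std::vector<Frame> CallStack;
  CallStack.push_back(Frame{N, 0});
  uint64_t RetVal = 0; // value returned by the most recently finished frame

  while (!CallStack.empty()) {
    Frame &Top = CallStack.back();
    if (Top.PC == 0) {
      if (Top.Arg == 0) { // base case: return 0 to the caller frame
        RetVal = 0;
        CallStack.pop_back();
        continue;
      }
      if (MaxStackDepth != 0 && CallStack.size() >= MaxStackDepth)
        return std::nullopt; // limit reached: hand control back to the caller
      Top.PC = 1;            // remember where to resume
      uint64_t CalleeArg = Top.Arg - 1;
      CallStack.push_back(Frame{CalleeArg, 0}); // "call"; Top is now invalid
      continue;
    }
    assert(Top.PC == 1 && "toy frames only have two resume points");
    RetVal += 1; // 1 + result of the callee
    CallStack.pop_back();
  }
  return RetVal;
}
```

With native recursion, a call such as `runToy(1000000, 0)` would risk overflowing the host stack; here the only bound is available heap memory or the explicit limit.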
1 parent 77cb666 commit a29f0dd

15 files changed

Lines changed: 1279 additions & 0 deletions


llvm/docs/CommandGuide/index.rst

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ Basic Commands
 dsymutil
 llc
 lli
+llubi
 llvm-as
 llvm-cgdata
 llvm-config

llvm/docs/CommandGuide/llubi.rst

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+llubi - LLVM UB-aware Interpreter
+=================================
+
+.. program:: llubi
+
+SYNOPSIS
+--------
+
+:program:`llubi` [*options*] [*filename*] [*program args*]
+
+DESCRIPTION
+-----------
+
+:program:`llubi` directly executes programs in LLVM bitcode format and tracks values in LLVM IR semantics.
+Unlike :program:`lli`, :program:`llubi` is designed to be aware of undefined behaviors during execution.
+It detects immediate undefined behaviors such as integer division by zero, and respects poison generating flags
+like `nsw` and `nuw`. As it captures most of the guardable undefined behaviors, it is highly suitable for
+constructing an interesting-ness test for miscompilation bugs.
+
+If `filename` is not specified, then :program:`llubi` reads the LLVM bitcode for the
+program from standard input.
+
+The optional *args* specified on the command line are passed to the program as
+arguments.
+
+GENERAL OPTIONS
+---------------
+
+.. option:: -fake-argv0=executable
+
+ Override the ``argv[0]`` value passed into the executing program.
+
+.. option:: -entry-function=function
+
+ Specify the name of the function to execute as the program's entry point.
+ By default, :program:`llubi` uses the function named ``main``.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -verbose
+
+ Print results for each instruction executed.
+
+.. option:: -version
+
+ Print out the version of :program:`llubi` and exit without doing anything else.
+
+INTERPRETER OPTIONS
+-------------------
+
+.. option:: -max-mem=N
+
+ Limit the amount of memory (in bytes) that can be allocated by the program, including
+ stack, heap, and global variables. If the limit is exceeded, execution will be terminated.
+ By default, there is no limit (N = 0).
+
+.. option:: -max-stack-depth=N
+
+ Limit the maximum stack depth to N. If the limit is exceeded, execution will be terminated.
+ The default limit is 256. Set N to 0 to disable the limit.
+
+.. option:: -max-steps=N
+
+ Limit the number of instructions executed to N. If the limit is reached, execution will
+ be terminated. By default, there is no limit (N = 0).
+
+.. option:: -vscale=N
+
+ Set the value of `llvm.vscale` to N. The default value is 4.
+
+EXIT STATUS
+-----------
+
+If :program:`llubi` fails to load the program, or an error occurs during execution (e.g, an immediate undefined
+behavior is triggered), it will exit with an exit code of 1.
+If the return type of entry function is not an integer type, it will return 0.
+Otherwise, it will return the exit code of the program.
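The three EXIT STATUS rules can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the driver code from `llubi.cpp` (which is not shown on this page): `RunOutcome` and `toExitStatus` are hypothetical names, and the plain cast used for integer results is a simplification, since the documentation does not say how results wider than the host `int` are converted.

```cpp
#include <cassert> // used only by the usage checks
#include <cstdint>
#include <optional>

// Hypothetical summary of one interpreter run; not a type from llubi.
struct RunOutcome {
  bool Failed;                      // load failure or an error during execution
  std::optional<int64_t> IntResult; // set only if the entry function returned an integer
};

// Maps a run outcome to a process exit status following the documented rules.
int toExitStatus(const RunOutcome &O) {
  if (O.Failed)
    return 1; // rule 1: any failure yields exit code 1
  if (!O.IntResult)
    return 0; // rule 2: non-integer return type yields 0
  // rule 3: the program's own exit code (conversion is an assumption here)
  return static_cast<int>(*O.IntResult);
}
```

Note that failure takes priority in this sketch: a run that errors out reports 1 regardless of any value it may have computed.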

llvm/test/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@ set(LLVM_TEST_DEPENDS
 llc
 lli
 lli-child-target
+llubi
 llvm-addr2line
 llvm-ar
 llvm-as

llvm/test/lit.cfg.py

Lines changed: 1 addition & 0 deletions
@@ -235,6 +235,7 @@ def get_asan_rtlib():
 "dsymutil",
 "lli",
 "lli-child-target",
+"llubi",
 "llvm-ar",
 "llvm-as",
 "llvm-addr2line",

llvm/test/tools/llubi/main.ll

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+; RUN: llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main(i32 %argc, ptr %argv) {
+  ret i32 0
+}
+
+; CHECK: Entering function: main
+; CHECK: i32 %argc = i32 1
+; CHECK: ptr %argv = ptr 0x8 [argv]
+; CHECK: ret i32 0
+; CHECK: Exiting function: main

llvm/test/tools/llubi/main2.ll

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+; RUN: llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main() {
+  ret i32 0
+}
+
+; CHECK: Entering function: main
+; CHECK: ret i32 0
+; CHECK: Exiting function: main

llvm/test/tools/llubi/poison.ll

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+; RUN: not llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main(i32 %argc, ptr %argv) {
+  ret i32 poison
+}
+; CHECK: Entering function: main
+; CHECK: i32 %argc = i32 1
+; CHECK: ptr %argv = ptr 0x8 [argv]
+; CHECK: ret i32 poison
+; CHECK: Exiting function: main
+; CHECK: error: Execution of function 'main' resulted in poison return value.
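The test above returns a literal `poison`, but in practice poison usually arises from a flagged instruction such as `add nsw`. The following is a minimal sketch of that idea, not the interpreter's implementation: `ToyI32`, `addNSW`, and `isPoisonReturn` are hypothetical names, and an empty `std::optional` stands in for a poison value. It follows the LLVM IR rules that `add nsw` yields poison on signed overflow and that a poison operand makes the result poison.

```cpp
#include <cassert> // used only by the usage checks
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical stand-in for an i32 value: std::nullopt represents poison.
using ToyI32 = std::optional<int32_t>;

// Models `add nsw i32`: signed overflow produces poison instead of wrapping.
ToyI32 addNSW(ToyI32 LHS, ToyI32 RHS) {
  if (!LHS || !RHS)
    return std::nullopt; // poison operands propagate to the result
  // Compute in 64 bits so the overflow check itself cannot overflow.
  int64_t Wide = static_cast<int64_t>(*LHS) + static_cast<int64_t>(*RHS);
  if (Wide < std::numeric_limits<int32_t>::min() ||
      Wide > std::numeric_limits<int32_t>::max())
    return std::nullopt; // the nsw flag was violated
  return static_cast<int32_t>(Wide);
}

// Mirrors the condition reported by the test's final CHECK line: the entry
// function finished, but its return value is poison.
bool isPoisonReturn(const ToyI32 &RetVal) { return !RetVal.has_value(); }
```

For example, `addNSW(INT32_MAX, 1)` is poison in this model, and returning it from the entry function would be reported as an error rather than as a wrapped integer.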

llvm/tools/llubi/CMakeLists.txt

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+set(LLVM_LINK_COMPONENTS
+  Analysis
+  Core
+  IRPrinter
+  IRReader
+  Support
+  )
+
+add_llvm_tool(llubi
+  llubi.cpp
+
+  DEPENDS
+  intrinsics_gen
+  )
+
+add_subdirectory(lib)
+target_link_libraries(llubi PRIVATE LLVMUBAwareInterpreter)
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+set(LLVM_LINK_COMPONENTS
+  Analysis
+  Core
+  Support
+  )
+
+add_llvm_library(LLVMUBAwareInterpreter
+  STATIC
+  Context.cpp
+  Interpreter.cpp
+  Value.cpp
+  )

llvm/tools/llubi/lib/Context.cpp

Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
//===- Context.cpp - State Tracking for llubi -----------------------------===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
//
9+
// This file tracks the global states (e.g., memory) of the interpreter.
10+
//
11+
//===----------------------------------------------------------------------===//
12+
13+
#include "Context.h"
14+
#include "llvm/Support/MathExtras.h"
15+
16+
namespace llvm::ubi {
17+
18+
Context::Context(Module &M)
19+
: Ctx(M.getContext()), M(M), DL(M.getDataLayout()),
20+
TLIImpl(M.getTargetTriple()) {}
21+
22+
Context::~Context() = default;
23+
24+
AnyValue Context::getConstantValueImpl(Constant *C) {
25+
if (isa<PoisonValue>(C))
26+
return AnyValue::getPoisonValue(*this, C->getType());
27+
28+
// TODO: Handle ConstantInt vector.
29+
if (auto *CI = dyn_cast<ConstantInt>(C))
30+
return CI->getValue();
31+
32+
llvm_unreachable("Unrecognized constant");
33+
}
34+
35+
const AnyValue &Context::getConstantValue(Constant *C) {
36+
auto It = ConstCache.find(C);
37+
if (It != ConstCache.end())
38+
return It->second;
39+
40+
return ConstCache.emplace(C, getConstantValueImpl(C)).first->second;
41+
}
42+
43+
MemoryObject::~MemoryObject() = default;
44+
MemoryObject::MemoryObject(uint64_t Addr, uint64_t Size, StringRef Name,
45+
unsigned AS, MemInitKind InitKind)
46+
: Address(Addr), Size(Size), Name(Name), AS(AS),
47+
State(InitKind != MemInitKind::Poisoned ? MemoryObjectState::Alive
48+
: MemoryObjectState::Dead) {
49+
switch (InitKind) {
50+
case MemInitKind::Zeroed:
51+
Bytes.resize(Size, Byte{0, ByteKind::Concrete});
52+
break;
53+
case MemInitKind::Uninitialized:
54+
Bytes.resize(Size, Byte{0, ByteKind::Undef});
55+
break;
56+
case MemInitKind::Poisoned:
57+
Bytes.resize(Size, Byte{0, ByteKind::Poison});
58+
break;
59+
}
60+
}
61+
62+
IntrusiveRefCntPtr<MemoryObject> Context::allocate(uint64_t Size,
63+
uint64_t Align,
64+
StringRef Name, unsigned AS,
65+
MemInitKind InitKind) {
66+
if (MaxMem != 0 && SaturatingAdd(UsedMem, Size) >= MaxMem)
67+
return nullptr;
68+
uint64_t AlignedAddr = alignTo(AllocationBase, Align);
69+
auto MemObj =
70+
makeIntrusiveRefCnt<MemoryObject>(AlignedAddr, Size, Name, AS, InitKind);
71+
MemoryObjects[AlignedAddr] = MemObj;
72+
AllocationBase = AlignedAddr + Size;
73+
UsedMem += Size;
74+
return MemObj;
75+
}
76+
77+
bool Context::free(uint64_t Address) {
78+
auto It = MemoryObjects.find(Address);
79+
if (It == MemoryObjects.end())
80+
return false;
81+
UsedMem -= It->second->getSize();
82+
It->second->markAsFreed();
83+
MemoryObjects.erase(It);
84+
return true;
85+
}
86+
87+
Pointer Context::deriveFromMemoryObject(IntrusiveRefCntPtr<MemoryObject> Obj) {
88+
assert(Obj && "Cannot determine the address space of a null memory object");
89+
return Pointer(
90+
Obj,
91+
APInt(DL.getPointerSizeInBits(Obj->getAddressSpace()), Obj->getAddress()),
92+
/*Offset=*/0);
93+
}
94+
95+
void MemoryObject::markAsFreed() {
96+
State = MemoryObjectState::Freed;
97+
Bytes.clear();
98+
}
99+
100+
void MemoryObject::writeRawBytes(uint64_t Offset, const void *Data,
101+
uint64_t Length) {
102+
assert(SaturatingAdd(Offset, Length) <= Size && "Write out of bounds");
103+
const uint8_t *ByteData = static_cast<const uint8_t *>(Data);
104+
for (uint64_t I = 0; I < Length; ++I)
105+
Bytes[Offset + I].set(ByteData[I]);
106+
}
107+
108+
void MemoryObject::writeInteger(uint64_t Offset, const APInt &Int,
109+
const DataLayout &DL) {
110+
uint64_t BitWidth = Int.getBitWidth();
111+
uint64_t IntSize = divideCeil(BitWidth, 8);
112+
assert(SaturatingAdd(Offset, IntSize) <= Size && "Write out of bounds");
113+
for (uint64_t I = 0; I < IntSize; ++I) {
114+
uint64_t ByteIndex = DL.isLittleEndian() ? I : (IntSize - 1 - I);
115+
uint64_t Bits = std::min(BitWidth - ByteIndex * 8, uint64_t(8));
116+
Bytes[Offset + I].set(Int.extractBitsAsZExtValue(Bits, ByteIndex * 8));
117+
}
118+
}
119+
void MemoryObject::writeFloat(uint64_t Offset, const APFloat &Float,
120+
const DataLayout &DL) {
121+
writeInteger(Offset, Float.bitcastToAPInt(), DL);
122+
}
123+
void MemoryObject::writePointer(uint64_t Offset, const Pointer &Ptr,
124+
const DataLayout &DL) {
125+
writeInteger(Offset, Ptr.address(), DL);
126+
// TODO: provenance
127+
}
128+
129+
} // namespace llvm::ubi
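The byte-splitting loop in `MemoryObject::writeInteger` handles both endianness and bit widths that are not a multiple of 8. The following standalone sketch reproduces that index arithmetic without any LLVM dependency so it can be tried in isolation. It is an illustration only: `toMemoryBytes` is a hypothetical helper, a plain `uint64_t` replaces `APInt` (so widths are limited to 64 bits), and the result is returned as a vector instead of being stored into a `MemoryObject`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Splits the low BitWidth bits of Value into bytes in memory order.
// ByteIndex counts bytes from the least significant end of the integer;
// I is the position in memory. Little-endian stores byte 0 first,
// big-endian stores it last.
std::vector<uint8_t> toMemoryBytes(uint64_t Value, unsigned BitWidth,
                                   bool LittleEndian) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "sketch supports 1..64 bits only");
  unsigned NumBytes = (BitWidth + 7) / 8; // same rounding as divideCeil(BitWidth, 8)
  std::vector<uint8_t> Bytes(NumBytes);
  for (unsigned I = 0; I < NumBytes; ++I) {
    unsigned ByteIndex = LittleEndian ? I : NumBytes - 1 - I;
    // The most significant byte may hold fewer than 8 valid bits.
    unsigned Bits = std::min(BitWidth - ByteIndex * 8, 8u);
    uint64_t Mask = (uint64_t(1) << Bits) - 1;
    Bytes[I] = static_cast<uint8_t>((Value >> (ByteIndex * 8)) & Mask);
  }
  return Bytes;
}
```

For a 12-bit value such as `0xABC`, the top byte carries only four valid bits, so the little-endian layout is `BC 0A` and the big-endian layout is `0A BC`.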
