Debugging
Issa Aboudi
03/19/23
Lecture-12, Lecture-13, Lecture-14
#UCLA #Spring2023 #ANotes #CS35L
More than just GDB
Alternative ways to debug than using debugger
• print statements
• time shell command
• op tools like:
– ps -ef (everything running on system)
– ps -ejft (in a tree)
– top (list processes that are top consumers)
– kill (kill off processes)
• strace (watch all system calls and log them)
• valgrind
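As a rough illustration of the print-statement approach, here is a minimal sketch. The TRACE macro and the sum_array function are hypothetical names made up for this example, not something from the lecture. The macro writes to stderr so that debugging output does not mix with the program's normal output.

```c
#include <stdio.h>

/* Hypothetical helper (an assumption for this sketch): prints the file,
   the line, and a formatted message to stderr. */
#define TRACE(fmt, ...) \
    fprintf(stderr, "%s:%d: " fmt "\n", __FILE__, __LINE__, __VA_ARGS__)

/* Example function instrumented with a print statement per iteration. */
int sum_array(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i];
        TRACE("i=%d a[i]=%d total=%d", i, a[i], total);
    }
    return total;
}
```

Calling sum_array on a small array prints one trace line per element on stderr, and the return value is unaffected by the tracing.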
Using compiler to debug
The -f options tell gcc to generate particular kinds of code or to apply particular optimizations.
For example, -fstack-protector generates code that guards against stack-based buffer overflow
attacks, and -fprofile-generate generates code that collects profiling information for
later optimization.
Other commonly used -f options in gcc include:
• -finline-functions: attempts to inline small functions for better performance.
• -fomit-frame-pointer: omits the frame pointer in functions that don’t need one, which frees a register and skips the setup code, but can make stack traces harder to produce.
• -fPIC: generates position-independent code, which can be loaded anywhere in memory.
• -fno-strict-aliasing: disables strict aliasing optimization, which can cause issues with certain
types of pointer casting.
• -fsanitize=address: enables address sanitizer, which detects memory errors at runtime.
There are many more -f options available in gcc, each with its own specific purpose and use case. It’s
important to carefully consider which options are appropriate for your code and environment to ensure
optimal performance and security.
Optimization levels
We can use the -O flag to set the optimization level. In compilers such as GCC, “optimize” does not
mean “find the best/fastest solution”; it means “generate better code”.
There is a tradeoff between better code (a faster executable) and quick compilation, and the more
optimized the code is, the harder it is to debug. gcc -O0 asks for unoptimized machine code.
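To make the tradeoff concrete, here is a small sketch (sum_to is a hypothetical function invented for this example). At -O0 the compiler emits the loop more or less literally, which is easy to step through in a debugger. At higher levels it is free to rewrite the loop in any way that gives the same result.

```c
/* Sums 1 + 2 + ... + n with a plain loop. */
unsigned long sum_to(unsigned n)
{
    unsigned long total = 0;
    /* Count down so the loop also terminates when n is the largest unsigned value. */
    for (unsigned i = n; i > 0; i--)
        total += i;
    return total;
}
```

One way to see the difference is to compare the assembly from gcc -O0 -S and gcc -O2 -S on this file. What the optimized version looks like depends on the compiler version and target.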
Link time optimization
gcc -flto enables link-time optimization. With it, gcc puts into each .o file what it normally
would, plus what amounts to a copy of the source code (in practice, the compiler’s internal representation of it).
Now when you run gcc -flto *.o, gcc can see the code of the whole program and optimize across files. For example,
if it sees a call to a function defined in another file, it can replace the call with the body of that function (inlining).
This can result in better performance and smaller executable files, but it also increases compilation
time and requires more memory during linking.
Other ways to optimize
A lot of this stuff is gcc specific but other compilers have similar things:
Unreachable
__builtin_unreachable() is a compiler builtin in GCC and Clang that informs the compiler
that execution will never reach the point where it is called. It is a hint that lets
the compiler optimize by removing code that could only run on that path.
The builtin takes no arguments and returns no value. It does not terminate the program: if
execution actually reaches it, the behavior is undefined. It is typically placed where the
programmer knows control cannot arrive, for example after a call to a function that never
returns, or in a branch that the program’s invariants rule out.
It can also silence warnings that depend on control flow, because it tells the compiler that
the path is intentionally unreachable.
Here’s an example of how __builtin_unreachable() can be used:
void my_function(int x) {
    if (x < 0) {
        // The caller guarantees x >= 0, so this branch never runs
        __builtin_unreachable(); // Tell the compiler this point is unreachable
    }
    // Normal processing
}
In this example, __builtin_unreachable() promises the compiler that x is never negative at this
point, so the compiler may drop the test and optimize the rest of the function on the assumption
that x >= 0. If x is negative anyway, nothing terminates the program; the behavior is undefined.
In C23 there will be an unreachable() macro (in <stddef.h>) that does basically the above.
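A common place for this hint is the default branch of a switch whose cases already cover every value the callers can pass. The sketch below is an assumption-laden illustration: apply and the UNREACHABLE macro are hypothetical names, and the fallback to abort() on other compilers is this example's own choice.

```c
#include <stdlib.h>

/* Hypothetical wrapper: use the builtin where available, otherwise stop
   the program so the function still never returns from this path. */
#ifdef __GNUC__
# define UNREACHABLE() __builtin_unreachable()
#else
# define UNREACHABLE() abort()
#endif

/* Precondition (assumed for this sketch): op is always 0, 1, or 2. */
int apply(int op, int a, int b)
{
    switch (op) {
    case 0: return a + b;
    case 1: return a - b;
    case 2: return a * b;
    default:
        /* Promise to the compiler that this path never runs. If it ever
           does under GCC or Clang, the behavior is undefined. */
        UNREACHABLE();
    }
}
```

The hint is only safe when every caller really does respect the precondition. It is not a substitute for error handling.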
Attributes
GCC has extensions to the C language that let you give the compiler more information. The compiler
is free to ignore this information, but it serves as performance or correctness advice.
If you are on a non-GCC compiler, just add the following to your header:
#ifndef __GNUC__
#define __attribute__(x)
#endif
One use of attributes is to make sure the address of an array is aligned (here, a multiple of 8): char bto[1000]
__attribute__((aligned(8))); The idea is that memory is moved into the cache in fixed-size lines, so
an object that does not straddle a cache-line boundary can be pulled in with one fetch, while an object
that crosses a boundary needs an extra fetch from RAM.
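Here is a minimal sketch of an aligned declaration, with the attribute written after the declarator. The helper bto_is_aligned is a hypothetical name added only to show that the request took effect. On a compiler that ignores the attribute, the check could fail.

```c
#include <stdint.h>

#ifndef __GNUC__
# define __attribute__(x)   /* ignore attributes on other compilers */
#endif

/* Ask for the array's starting address to be a multiple of 8. */
static char bto[1000] __attribute__((aligned(8)));

/* Returns 1 if bto starts at a multiple of 8, else 0. */
int bto_is_aligned(void)
{
    return (uintptr_t)bto % 8 == 0;
}
```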
Another attribute is void foo(void) __attribute__((cold)); This tells the compiler that the
function is rarely called (hot = often called, cold = rarely called).
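Error-reporting paths are a typical candidate for cold. In the sketch below, report_bad_digit and digit_value are hypothetical names chosen for this example. The attribute is only a hint, so the program behaves the same with or without it.

```c
#include <stdio.h>

#ifndef __GNUC__
# define __attribute__(x)   /* ignore attributes on other compilers */
#endif

/* Hypothetical error path, marked cold because it should rarely run. */
static void report_bad_digit(int c) __attribute__((cold));

static void report_bad_digit(int c)
{
    fprintf(stderr, "not a digit: %d\n", c);
}

/* Returns the value of a decimal digit character, or -1 on bad input. */
int digit_value(int c)
{
    if (c < '0' || c > '9') {
        report_bad_digit(c);
        return -1;
    }
    return c - '0';
}
```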
We have to mark these functions ourselves, but that’s tedious. Instead, let’s profile our program to figure
out how often each function gets called.
gcc --coverage instruments the program so that running it records how often each part of the code executes, which gives a kind of “temperature graph” of function calls. Profiles tend to be input
dependent, so we need good test cases that exercise performance and show how the program scales.
Once we’ve done that, we can give gcc another flag (for example -fprofile-use) that reads the profiles and automatically treats
the appropriate functions as hot or cold.
Static Checking
static_assert(0 < n); checks an expression at compile time, as long as it is a constant expression.
Static assertions are useful in larger, more complicated programs where you want to document the
assumptions the code makes. For example, you might write:
static_assert(INT_MAX < LONG_MAX);
The code might assume that long is wider than int. If that’s an assumption you want to make, write
it in the code. That way, if someone builds on a machine where INT_MAX and LONG_MAX
are equal (which is realistic), the compiler will tell them that this program operates under that assumption.
C also has assert from <assert.h>, which checks at run time and aborts if the condition is false.
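The two kinds of check can sit side by side. In this sketch, add_wide and element_at are hypothetical functions made up for illustration. The message string is included so that the static assertion also compiles under C11, and the file will not compile on a platform where int and long have the same range.

```c
#include <assert.h>
#include <limits.h>

/* Compile-time check: the build fails if long is not wider than int. */
static_assert(INT_MAX < LONG_MAX, "this code assumes long is wider than int");

/* Adds two ints in long arithmetic; relies on the assumption above so
   that the sum always fits. */
long add_wide(int a, int b)
{
    return (long)a + b;
}

/* Returns a[i]; the run-time check aborts if i is out of range. */
int element_at(const int *a, int n, int i)
{
    assert(0 <= i && i < n);
    return a[i];
}
```

Compiling with -DNDEBUG turns the run-time assert into a no-op, while the static assertion is always checked.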
GCC Flags
gcc -W... enables or disables warnings; there are hundreds of kinds. We care about a few arguments
to this. -Wall doesn’t actually mean all warnings: it enables all the warnings that the gcc developers think
you want (all the “useful” ones). They’re helpful warnings, and SEASnet might turn them on for you
already. Good to know.
Example warning:
/*a bad
exit(27)
/* comment */
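This is a bad comment because the first comment is never closed on its own line, so everything up to the
next */, including exit(27), is treated as a comment and ignored. The compiler will warn you about
this mistake. The specific warning is -Wcomment (implied by -Wall).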
Another warning, -Wparentheses, is about code like:
return a<<b+c;
People don’t always know that + has higher precedence than left shift, so this parses as a << (b + c).
The warning suggests adding parentheses so that readers aren’t confused.
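The two readings of a<<b+c give different answers, which is why explicit parentheses help. The function names below are hypothetical and chosen only for this sketch.

```c
/* What the compiler computes for a << b + c: the addition happens first. */
int shift_by_sum(int a, int b, int c)
{
    return a << (b + c);
}

/* What a reader might have expected instead: shift first, then add. */
int shift_then_add(int a, int b, int c)
{
    return (a << b) + c;
}
```

With a = 1, b = 2, c = 3 the first function returns 32 and the second returns 7.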
-Waddress warns about uses of addresses that are probably wrong:
char* p = f(x);
if (p == "abc")
return 27;
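The == compares the pointer p with the address of the string literal "abc", not the characters they point to. That is almost never what was intended, and the comparison will almost certainly be false even when f returns a string containing abc.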
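To compare the characters rather than the addresses, use strcmp from <string.h>. This is a minimal sketch: the function name check and the NULL guard are assumptions added for the example.

```c
#include <string.h>

/* Returns 27 if p points to the text "abc", else 0. */
int check(const char *p)
{
    /* strcmp compares contents; == would only compare addresses. */
    if (p != NULL && strcmp(p, "abc") == 0)
        return 27;
    return 0;
}
```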
-Wstrict-aliasing is controversial in the Linux kernel. Suppose we have a long value l:
long l = -27;
int* p = (int*)&l;
*p = 0;
return l;
We’re taking a pointer to a long and treating it as a pointer to an int, then overwriting half of
the long with the value 0. Now we have a weird number with magnitude around 2^32. But a C/C++ compiler is
allowed to cache the value of l in a register and ignore the store through *p, so depending on the compiler
the function returns either the overwritten value or -27. Both are correct from the compiler’s perspective,
because the code breaks the aliasing rules and either answer is allowed.
It is controversial because Linus Torvalds loved to do this kind of thing, since it makes the kernel go faster.
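One well-defined way to overwrite part of an object's bytes is memcpy, which avoids accessing the long through an int pointer. This is a sketch of that alternative, not something from the lecture. The function names are hypothetical, and which half of the long the first sizeof(int) bytes correspond to depends on the machine's byte order.

```c
#include <string.h>

/* Overwrites the first sizeof(int) bytes of l with zero bytes, using
   memcpy instead of an int* cast, so no aliasing rule is broken. */
long clear_first_int(long l)
{
    int zero = 0;
    memcpy(&l, &zero, sizeof zero);
    return l;
}

/* Reads the first sizeof(int) bytes of l back as an int. */
int first_int_of(long l)
{
    int v;
    memcpy(&v, &l, sizeof v);
    return v;
}
```

Compilers typically turn small fixed-size memcpy calls like these into plain loads and stores, so this form need not be slower than the cast.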
-Wmaybe-uninitialized: here the idea is that the compiler warns you when you use a
variable that might be uninitialized on some path through the code.
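A typical trigger is a variable that is assigned in some branches but not in others. The sketch below shows the fixed form (sign_of is a hypothetical function). Without the = 0 initializer, result would have no value on the x == 0 path, which is the kind of code the warning is aimed at.

```c
/* Returns -1, 0, or 1 according to the sign of x. */
int sign_of(int x)
{
    int result = 0;   /* initialized up front: covers the x == 0 path */
    if (x < 0)
        result = -1;
    else if (x > 0)
        result = 1;
    return result;
}
```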
We would like the program to fail reliably when it runs into undefined behavior, but by default nothing guarantees that. GCC has a way of getting this.
gcc -fsanitize=undefined sanitizes undefined behavior by inserting run-time checks: for example, every time the program adds
to an integer, it checks whether the result overflows and reports a run-time error if it does. You don’t have to modify your code.
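For comparison, this is roughly the kind of check you would otherwise write by hand around a signed addition. It is a sketch only: checked_add is a hypothetical helper, not what the sanitizer actually generates.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true, or returns false without
   adding if the sum would overflow int. */
bool checked_add(int a, int b, int *out)
{
    /* Test against the limits first, so the overflowing addition is
       never actually performed. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The sanitizer's advantage is that it inserts such checks automatically at every addition, without changes to the source.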
The C and C++ standards list many kinds of undefined behavior. Some
happen so often that they have separate flags.
gcc -fsanitize=address catches bad pointers and accesses to bad addresses. These flags make programs slower,
so you don’t want them on all the time, but they are good for debugging: if the program stops
with a sanitizer report, the report tells you where to look.
gcc -fsanitize=leak - For memory leaks where you allocate memory but forget to free it.
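Here is a small sketch of the kind of code these two sanitizers are aimed at (copy_string is a hypothetical function). Forgetting the + 1 would make the copy write one byte past the allocation, which is an error of the sort -fsanitize=address looks for. A caller that never frees the result has the kind of leak -fsanitize=leak looks for.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of s, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```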
gcc -fsanitize=thread is the fanciest form of sanitization: it tries to catch race conditions
between threads.
Common-sense advice you probably already know:
• Don’t guess at random where the bug is. Guess non-randomly.
1) Stabilize the failure. Sometimes a bug occurs on some runs and not on others. You need to
find a test case on which the program reliably fails. Coming up with one
might not be obvious.
2) Locate the source of the failure. Look at the instructions leading up to the crash and figure out why
the crash occurred. You want a point of failure.
3) Put in print statements to find out more about why the failure
happened. Or you can use gdb on the unmodified program. It
helps to compile with the gcc -g option, which puts information for the debugger into the
executable, such as the names of local variables. The machine code runs the same way;
the only cost is a bigger executable.