C/C++ Compiler Optimizations
I have often seen obscure code written in a misguided attempt at “hand optimizing.” The irony is that the makefiles for the same code often showed that no optimization options were used when compiling it. What a waste of time, both human and computer.
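To make the idea of "hand optimizing" concrete, here is a small made-up pair of functions. The names times_ten_obscure and times_ten_plain are hypothetical and not taken from any real code base. Both compute the same value; the first hides the intent behind shifts, while the second simply states it and leaves the choice of instructions to the compiler.

```c
#include <stdint.h>

/* "Hand optimized": multiply by 10 using two shifts and an add.
   Hypothetical example for illustration only. */
uint32_t times_ten_obscure(uint32_t x)
{
    return (x << 3) + (x << 1);   /* 8x + 2x */
}

/* Plain version: says what it means. An optimizing compiler is free to
   choose shifts and adds on its own if they are cheaper on the target. */
uint32_t times_ten_plain(uint32_t x)
{
    return x * 10u;
}
```

Because both versions use unsigned arithmetic, they agree for every input, including values where the product wraps around.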
The default level of optimization for most C/C++ compilers is none. The compiler spits out machine code that conforms very closely to the code you have written. This is good for debugging but is not efficient when running the code.
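One way to find out whether a build was optimized at all is to ask the compiler. The sketch below relies on the __OPTIMIZE__ macro, which GCC predefines in optimizing compilations; other compilers may or may not provide it, so treat this as a GCC-oriented assumption. The function names built_with_optimization and report_build are made up for this example.

```c
#include <stdio.h>

/* Returns 1 if this file was compiled with optimization enabled, 0 otherwise.
   Assumes a compiler that predefines __OPTIMIZE__ when optimizing (GCC does). */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of how this file was built. */
void report_build(void)
{
    printf("optimization: %s\n", built_with_optimization() ? "on" : "off");
}
```

Calling report_build() at startup of a test build is a cheap way to catch a makefile that forgot its optimization options.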
Let’s take a look at some code for a couple of simple functions in C and the assembly language that GCC generates for the ARM processor. Here is the C code:
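#include <stdio.h>   /* printf */
#include <stdlib.h>  /* atoi */

int adder(int a, int b)
{
    int c;
    c = a + b;
    return c;
}

/* Expects two integer arguments on the command line. */
int main(int argc, char* argv[])
{
    int a, b, c;
    a = atoi(argv[1]);
    b = atoi(argv[2]);
    c = adder(a, b);
    printf("\nsum = %d\n", c);
    return 0;
}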
Now let’s look at the assembly generated for the function adder. The assembly was generated with the option -fverbose-asm, which tells the compiler to annotate the assembly language with comments about what it’s doing. The option is mostly used by compiler developers, but it’s also useful for our purpose: understanding how optimizations affect the generated code.
adder:
@ args = 0, pretend = 0, frame = 12
@ frame_needed = 1, uses_anonymous_args = 0
mov ip, sp @,
stmfd sp!, {fp, ip, lr, pc} @,
sub fp, ip, #4 @,,
sub sp, sp, #12 @,,
str r0, [fp, #-20] @ a, a
str r1, [fp, #-24] @ b, b
ldr r2, [fp, #-20] @ a, a
ldr r3, [fp, #-24] @ b, b
add r3, r2, r3 @ tmp105, a, b
str r3, [fp, #-16] @ tmp105, c
ldr r3, [fp, #-16] @ D.1727, c
mov r0, r3 @ (result), (result)
sub sp, fp, #12
ldmfd sp, {fp, sp, pc}
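The @ symbol begins a comment. The first four instructions save some state information and then create space on the stack for the function’s parameters and local variable. On ARM, the first few arguments arrive in registers (here a in R0 and b in R1) rather than on the stack, but unoptimized code still gives every parameter and local variable its own home in memory. The compiler sets up the frame pointer and makes room for three four-byte words (a, b, and c; note frame = 12 in the comments above) with the instructions: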
sub fp, ip, #4
sub sp, sp, #12
What does it do with this space? It takes the two parameters that are already loaded in registers and stores them in memory on the stack, a very literal reading of what the C code says to do. Then the same two values are loaded back into registers so we can do the real work of the function: adding a and b. The sum is stored to c on the stack, loaded back again, and finally moved to R0 to pass it back to the caller. Returning integer results in R0 is part of the ARM calling convention.
Why would the compiler generate code that wastes time moving things in and out of memory like this? It’s actually to make the code easier to debug. You might not have access to the processor’s registers due to the limitations of your debug tools, and writing the parameters out to memory might be your only chance to see what is being passed to a function.
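The store-then-reload pattern in the unoptimized listing can be modeled in C itself. The sketch below is only an approximation, not what the compiler literally does: it uses volatile to force every access to a, b, and c through memory, which roughly mirrors the str and ldr instructions above. The name adder_spilled is made up for this illustration.

```c
/* Rough C model of the unoptimized adder. volatile forces each read and
   write to go through memory, much like the -O0 listing does.
   Hypothetical illustration; the comments map lines to the listing loosely. */
int adder_spilled(int a_in, int b_in)
{
    volatile int a = a_in;   /* store parameter a to its stack slot */
    volatile int b = b_in;   /* store parameter b to its stack slot */
    volatile int c;

    c = a + b;               /* reload a and b, add, store the sum to c */
    return c;                /* reload c and hand it back as the result */
}
```

The function returns the same sum as adder; only the amount of memory traffic differs.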
So now let’s look at what the first level of optimization (-O1 for GCC) does to our function.
@ link register save eliminated.
@ lr needed for prologue
add r0, r0, r1 @ (result), a, b
mov pc, lr @
Quite a bit smaller, isn’t it? The first thing the compiler tells us is that it doesn’t need to save the link register. In fact, the generated code no longer saves a lot of the state information that it did in the previous case. Next, the parameters are left in the registers. The calling convention already delivers one of the parameters, a, in R0, and the compiler takes advantage of that by adding b directly into R0. Since R0 has to hold the result anyway, this saves using another register. So the first level of optimization saves about a dozen instructions without doing anything exotic.
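At the highest level of optimization, -O3 for GCC, the compiler goes further and inlines the function into its caller. One of the options involved in this kind of inlining is -finline-functions-called-once, which lets the compiler inline static functions that are called only once, even if they are not marked inline. (Which optimization level turns on each inlining option varies between GCC versions, so check the documentation for yours.) With adder inlined, the entire function shrinks to a single line of code in main():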
add r4, r4, r0 @ tmp109, a,
This is an important optimization to know about. Often you have a long function that should be broken up into several functions for readability, but you are reluctant to do so because you don’t want to incur the overhead of several function calls. With the -finline-functions-called-once optimization, you don’t have to worry about the function call overhead and can focus on writing readable code.
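As a sketch of that style, here is a small function split into helpers that are each called from exactly one place. Everything in it is hypothetical: the names clamp_reading, to_millivolts, and process_sample, the 10-bit range, and the 3300 mV scale are assumptions chosen only for illustration. The helpers are declared static so the compiler can see that there are no other callers. Whether they actually get inlined depends on the compiler version and flags, so inspect the generated assembly if it matters.

```c
/* Hypothetical example: names and constants are for illustration only. */

/* Limit a raw reading to the range 0..1023. */
static int clamp_reading(int raw)
{
    if (raw < 0)
        return 0;
    if (raw > 1023)
        return 1023;
    return raw;
}

/* Convert a 10-bit reading to millivolts on an assumed 3300 mV scale. */
static int to_millivolts(int reading)
{
    return (reading * 3300) / 1023;
}

/* Public entry point: each helper above has exactly one call site. */
int process_sample(int raw)
{
    int reading = clamp_reading(raw);
    return to_millivolts(reading);
}
```

The readable version costs nothing extra if the helpers are inlined, and very little if they are not.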
So when you’re done debugging your code, don’t forget to turn on the optimization options and recompile. Then test it one last time. Compilers are smart but not infallible.







