Fixed-Point Math and Other Optimizations
Embedded Systems 8-1
Fixed Point Math – Why and How
Floating point is too slow and integers truncate the data
– Floating point subroutines: slower than native, overhead of passing arguments, calling subroutines… simple fixed point routines can be in-lined
Basic Idea: put the radix point where it covers the range of numbers you need to represent
3.34375 in a fixed point binary representation
Bit      1     1     0     1     0     1     1
Weight   2^1   2^0   2^-1  2^-2  2^-3  2^-4  2^-5
Weight   2     1     ½     ¼     1/8   1/16  1/32
Radix point: between the 2^0 and 2^-1 bits
I.F Terminology
– I = number of integer bits
– F = number of fraction bits
Radix Point Locations
Bit Pattern      Integer      6.2             1.10
000 0000 0000    0/1 = 0      0/4 = 0         0/1024 = 0
000 0001 1100    28/1 = 28    28/4 = 7        28/1024 = 0.0273…
000 0110 0011    99/1 = 99    99/4 = 24.75    99/1024 = 0.0966…
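The radix point table can be checked with a short sketch. The function name `fixed_to_double` is hypothetical, and double-precision division is used only to inspect values on a host machine, not as something to run on the target.

```c
#include <stdint.h>

/* Interpret a raw bit pattern as an I.F fixed-point value:
 * value = raw / 2^frac_bits, where frac_bits is F. */
double fixed_to_double(uint16_t raw, unsigned frac_bits)
{
    return (double)raw / (double)(1u << frac_bits);
}
```

Calling `fixed_to_double(99, 2)` reads the pattern for 99 as a 6.2 number, and `fixed_to_double(99, 10)` reads the same bits as a 1.10 number.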
Rules for Fixed Point Math
Addition, Subtraction
– Radix point stays where it started
– …so we can treat fixed point numbers like integers
Multiplication
– Radix point moves left by F digits
– … so we need to normalize result afterwards, shifting the result right by F digits
6.2 Format
      10         001010.00
  *   1.25     * 000001.01
     12.5        00000000 1100.1000
(the raw product has 2F = 4 fraction bits until it is normalized)
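A minimal sketch of these rules for an unsigned 6.2 format, assuming the operands and results fit in 8 bits. The type name `fx6_2` and the function names are hypothetical.

```c
#include <stdint.h>

#define F_BITS 2            /* 6.2 format: 6 integer bits, 2 fraction bits */

typedef uint8_t fx6_2;      /* hypothetical type name for a raw 6.2 value */

/* Addition: radix point does not move, so this is plain integer addition. */
fx6_2 fx_add(fx6_2 a, fx6_2 b)
{
    return (fx6_2)(a + b);
}

/* Multiplication: the raw product has 2*F fraction bits (12.4 here),
 * so shift right by F to return to 6.2. */
fx6_2 fx_mul(fx6_2 a, fx6_2 b)
{
    uint16_t wide = (uint16_t)((uint16_t)a * (uint16_t)b);
    return (fx6_2)(wide >> F_BITS);
}
```

With 10 stored as raw 40 and 1.25 stored as raw 5, `fx_mul(40, 5)` yields raw 50, which is 12.5 in 6.2 format.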
Division
                   3.1 Format    3.2 Format
Dividend     7       111.0         111.00
÷ Divisor   ÷ 2    ÷ 010.0       ÷ 010.00
Quotient     3       0011          00011
Remainder    1       001.0         001.00
Division
– Quotient is integer, may want to convert back to fixed point by shifting
– Radix point doesn’t move in remainder
Division, Part II
                   3.1 Format        3.2 Format
Dividend     7     (7*2) 1110.0      (7*4) 11100.00
÷ Divisor   ÷ 2    ÷ 010.0           ÷ 010.00
Quotient     3.5     0011.1            00011.10
Division
– To make quotient have same format as dividend and divisor, multiply dividend by 2^F (shift left by F bits)
– Quotient is in fixed point format now
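The pre-scaled division can be sketched as follows for unsigned values. The function name `fx_div` is hypothetical, and the caller is assumed to guarantee a nonzero divisor and a quotient that fits in 16 bits.

```c
#include <stdint.h>

/* Divide two unsigned fixed-point values that share the same F.
 * The dividend is shifted left by F first, so the quotient comes out
 * in the same I.F format instead of as a bare integer. */
uint16_t fx_div(uint16_t a, uint16_t b, unsigned frac_bits)
{
    uint32_t scaled = (uint32_t)a << frac_bits;  /* dividend * 2^F */
    return (uint16_t)(scaled / b);
}
```

In 3.1 format, 7 is raw 14 and 2 is raw 4, so `fx_div(14, 4, 1)` gives raw 7, which reads as 3.5. Without the shift, 14 / 4 would give the integer 3.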
Example Code for 12.4 Fixed Point Math
Representation
Converting to and from fixed point representation
[code listing garbled in extraction; not recoverable]
Math
[code listing garbled in extraction; not recoverable]
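As an illustration only, a 12.4 macro set might look like the sketch below. Every name here is hypothetical and none is taken from the slide. The sketch assumes a signed 16-bit container, nonnegative values where `>>` is used (right-shifting a negative value is implementation-defined in C), and a nonzero divisor.

```c
#include <stdint.h>

typedef int16_t fx12_4;                 /* hypothetical type name */
#define FX_FRAC_BITS 4                  /* F = 4, so one unit is 1/16 */

/* Converting to and from fixed point */
#define FX_FROM_INT(i)   ((fx12_4)((i) * (1 << FX_FRAC_BITS)))
#define FX_FROM_FLOAT(f) ((fx12_4)((f) * (float)(1 << FX_FRAC_BITS)))
#define FX_TO_INT(x)     ((x) >> FX_FRAC_BITS)

/* Math: add and subtract work directly on the raw values; multiply
 * shifts the wide product right by F; divide pre-scales the dividend. */
#define FX_ADD(a, b) ((fx12_4)((a) + (b)))
#define FX_SUB(a, b) ((fx12_4)((a) - (b)))
#define FX_MUL(a, b) ((fx12_4)(((int32_t)(a) * (b)) >> FX_FRAC_BITS))
#define FX_DIV(a, b) ((fx12_4)(((int32_t)(a) * (1 << FX_FRAC_BITS)) / (b)))
```

For example, `FX_MUL(FX_FROM_INT(10), FX_FROM_FLOAT(1.5f))` produces raw 240, and `FX_TO_INT` of that value is 15.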
More Fixed Point Math Examples
8.4 Format
      10          0000 1010.0000
  +    1.5      + 0000 0001.1000
      11.5        0000 1011.1000
4.4 Format
       9.0625       1001.0001
  *    6.5        * 0110.1000
      58.90625      0011 1010.1110 1000
Static Revisited
Static variable
– A local variable which retains its value between function invocations
– Visible only within its module, so compiler/linker can allocate space more wisely (recall limited pointer offsets)
Static function
– Visible only within its module, so compiler knows who is calling the function
– Compiler/linker can locate function to optimize calls (short call)
– Compiler can also inline the function to reduce run-time and often code size
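A small sketch of both uses of static, with hypothetical names. Whether the helper is actually inlined depends on the compiler and its optimization settings.

```c
/* Static function: visible only in this source file, so the compiler
 * sees every caller and is free to inline it. */
static unsigned step(unsigned id)
{
    return id + 1u;
}

/* The static local keeps its value between calls to next_id(). */
unsigned next_id(void)
{
    static unsigned id = 0;
    id = step(id);
    return id;
}
```

Successive calls to `next_id()` return 1, then 2, and so on, because `id` is not reinitialized on each call.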
Volatile and Const
Volatile variable
– Value can be changed outside normal program flow
• Changed by an ISR, or the variable is actually a hardware register
– Compiler reloads the variable from memory each time it is used
Const variable
– const does not mean constant, but rather read-only
– consts are implemented as real variables (taking space in RAM) or in ROM, requiring loading operations (often requiring pointer manipulation)
• A #define value can be converted into an immediate operand, which is much faster
• So avoid const variables for values that a #define can supply
Const function parameters
– Allow compiler to optimize, as it knows a variable passed as a parameter hasn’t been changed by the function
More
Const Volatile Variables
– Yes, it’s possible
• Volatile: A memory location that can change unexpectedly
• Const: it is only read by the program
– Example: hardware status register
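A sketch of reading a status register through a const volatile pointer. The bit mask, function name, and the address in the comment are hypothetical. The register is passed in as a pointer so the code can also be exercised on a host with an ordinary variable.

```c
#include <stdint.h>

#define STATUS_RX_READY 0x01u   /* hypothetical "receive ready" bit */

/* On real hardware the register might be declared as, for example:
 *   #define UART_STATUS (*(const volatile uint8_t *)0x4000u)
 * (address is an assumption, not from any data sheet). */

/* volatile: the hardware can change the value at any time, so every
 * call performs a real read.  const: the program never writes it. */
int rx_ready(const volatile uint8_t *status)
{
    return (*status & STATUS_RX_READY) != 0;
}
```

Any attempt to assign through `status` inside `rx_ready` is rejected at compile time, while the read itself cannot be optimized away.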
Starting Points for Efficient Code
Write correct code first, optimize second.
Use a top-down approach.
Know your microprocessor’s architecture, compiler (features and also object model used), and programming language.
Leave assembly language for unported designs, interrupt service routines, and frequently used functions.
Floating Point Data Type Specifications
Use the smallest adequate data type, or else…
– Conversions without an FPU are very slow
– Extra space is used
– C standard allows compiler or preprocessor to convert
automatically, slowing down code more
Single-precision (SP) vs. double-precision (DP)
– ANSI/IEEE 754-1985, Standard for Binary Floating Point
Arithmetic
• Single precision: 32 bits
– 1 sign bit, 8 exponent bits, 23 fraction bits
• Double precision: 64 bits
– 1 sign bit, 11 exponent bits, 52 fraction bits
– Single-precision is likely all that is needed
– Use single-precision floating point specifier “f” on literal constants (e.g. 3.14f)
Floating-Point Specifier Example
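Two side-by-side listings: C source without the “f” specifier and with it, each followed by its assembler output.
[listings garbled in extraction; not recoverable]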
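The effect of the specifier can be sketched in C. The function names are hypothetical and no assembler output is shown, since that depends on the compiler and target.

```c
/* No "f": 3.14 is a double constant, so x is converted to double,
 * the multiply is done in double precision, and the result is
 * converted back to float on return. */
float scale_dp(float x)
{
    return x * 3.14;
}

/* With "f": 3.14f is a float constant, so the multiply can stay in
 * single precision with no conversions. */
float scale_sp(float x)
{
    return x * 3.14f;
}
```

On a target without an FPU, the first form typically pulls in conversion and double-precision library routines that the second form avoids.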
Automatic Promotions
Standard math routines usually accept double-precision
inputs and return double-precision outputs.
Again it is likely only single-precision is needed.
– Cast to single-precision if accuracy and overflow conditions are
satisfied
Automatic Promotions Example
Two side-by-side listings: automatic promotion (left) and casting to avoid promotion (right).
[listings garbled in extraction; not recoverable]
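One way promotion and an early cast can differ is sketched below. `lib_half` is a hypothetical stand-in for a library routine with a double-precision interface, and the other names are assumptions too. The cast is appropriate only when single precision meets the accuracy and range needs.

```c
/* Hypothetical stand-in for a math routine that takes and returns double. */
double lib_half(double x)
{
    return x * 0.5;
}

/* Automatic: the double result forces k to be promoted and the
 * multiply to run in double precision. */
float promoted(float x, float k)
{
    return lib_half(x) * k;
}

/* Cast: converting the result to float first keeps the multiply
 * in single precision. */
float cast_early(float x, float k)
{
    return (float)lib_half(x) * k;
}
```

Both functions return the same value for inputs that are exactly representable; the difference is in how much double-precision work is done along the way.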
Rewriting and Rearranging Expressions
Divides take much longer to execute than multiplies.
Branches taken are usually faster than those not taken.
Repeated evaluation of same expression is a waste of time.
Examples
Before/after pairs of rewritten expressions ("… is better written as: …").
[listings garbled in extraction; not recoverable]
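Two of these rewrites can be sketched in C with hypothetical function names: replacing a divide by a constant with a multiply, and evaluating a repeated subexpression once. Many compilers make such changes themselves at higher optimization levels, so measuring first is worthwhile.

```c
/* Divide by a constant ... */
float halve_div(float x)
{
    return x / 2.0f;
}

/* ... rewritten as a multiply by its reciprocal. */
float halve_mul(float x)
{
    return x * 0.5f;
}

/* a * b is evaluated twice here ... */
float f_repeated(float a, float b, float c)
{
    return (a * b) + c * (a * b);
}

/* ... and once here, held in a local. */
float f_hoisted(float a, float b, float c)
{
    float ab = a * b;
    return ab + c * ab;
}
```

The reciprocal rewrite is exact for powers of two such as 2.0f; for other constants the result can differ in the last bit.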
Algebraic Simplifications and the Laws of Exponents
Original Expression    Operations     Optimized Expression    Operations
a^2 – 3a + 2           2*, 1-, 1+     (a – 1) * (a – 2)       1*, 2-
(a - 1)*(a + 1)        1*, 1-, 1+     a^2 - 1                 1*, 1-
1/(1+a/b)              2/, 1+         b/(b+a)                 1/, 1+
a^m * a^n              2^, 1*         a^(m+n)                 1^, 1+
(a^m)^n                2^             a^(m*n)                 1^, 1*
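The first row of the table, written out as a sketch with hypothetical function names. Integer arithmetic is used so that the two forms agree exactly when no overflow occurs.

```c
/* a^2 - 3a + 2 as written: two multiplies, one subtract, one add. */
int poly_direct(int a)
{
    return a * a - 3 * a + 2;
}

/* Factored form (a - 1)(a - 2): one multiply, two subtracts. */
int poly_factored(int a)
{
    return (a - 1) * (a - 2);
}
```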
Literal Definitions
#defines are prone to error
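[first definition and its use: listing garbled in extraction]
– r = c/2*3.14 is wrong! Evaluates to r = (c/2) * 3.14
[second definition and its use: listing garbled in extraction]
– Avoids a divide error
– However, 2*3.14 is loaded as DP leading to extra operations
• convert c to DP
• DP divide
• convert quotient to SP
[third definition and its use: listing garbled in extraction]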
– Avoids extra operations
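The three stages described above can be sketched as follows. The macro and function names are hypothetical, and the exact form of the final definition is an assumption (a cast is used here; an "f" suffix on the constants would serve the same purpose).

```c
#define TWO_PI_BAD 2*3.14              /* no parentheses */
#define TWO_PI_DP  (2*3.14)            /* parenthesized, still double */
#define TWO_PI_SP  ((float)(2*3.14))   /* parenthesized, single precision */

/* Expands to c / 2*3.14, which is (c / 2) * 3.14: wrong. */
float radius_bad(float c)
{
    return c / TWO_PI_BAD;
}

/* Correct value, but c is converted to double, divided in double
 * precision, and the quotient converted back to float. */
float radius_dp(float c)
{
    return c / TWO_PI_DP;
}

/* Correct value with a single-precision divide. */
float radius_sp(float c)
{
    return c / TWO_PI_SP;
}
```

For c = 6.28, the first version returns roughly 9.86 instead of roughly 1.0.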
The Standard Math Library Revisited
Double precision is likely expected by the standard math library.
Look for workarounds:
– abs()
[listing garbled in extraction; not recoverable]
could be written as:
[listing garbled in extraction; not recoverable]
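One possible workaround, assuming the intent is to avoid a double-precision library call such as `fabs`, is a compare and negate in single precision. The function name is hypothetical and this is not necessarily the code from the slide.

```c
/* Absolute value without a library call: one single-precision compare
 * and, for negative inputs, one negate. */
float abs_sp(float x)
{
    return (x < 0.0f) ? -x : x;
}
```

C99 also provides `fabsf` for float arguments where the toolchain supports it.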
Functions: Parameters and Variables
Consider a function which uses a global variable and calls
other functions
– Compiler needs to save the variable before the call, in case it is
used by the called function, reducing performance
– By loading the global into a local variable, it may be promoted into
a register by the register allocator, resulting in better performance
• Can only do this if the called functions don’t use the global
Taking the address of a local variable makes it more difficult
or impossible to promote it into a register
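A sketch of copying a global into a local before a loop that calls another function. All names are hypothetical, and the rewrite is valid only because the called helper never reads or writes the global.

```c
#include <stddef.h>

int g_scale = 3;    /* hypothetical global */

/* Hypothetical helper; it does not touch g_scale. */
int clamp_100(int v)
{
    return (v > 100) ? 100 : v;
}

void scale_all(int *data, size_t n)
{
    int scale = g_scale;    /* local copy: a candidate for a register */
    for (size_t i = 0; i < n; i++) {
        /* With g_scale used directly, the compiler may have to reload
         * it after every call in case clamp_100 changed it. */
        data[i] = clamp_100(data[i] * scale);
    }
}
```

The address of `scale` is never taken, which keeps it eligible for register allocation.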
More about Functions
Prototype your functions, or else the compiler may promote
all arguments to ints or doubles!
Group function calls together, since they force global
variables to be written back to memory
Make local functions static, keep in same module (source
file)
– Allows more aggressive function inlining
– Only functions in same module can be inlined