Detecting Integer Overflows — A Hardware Approach

· Nov 24, 2016 ·

There are few bugs in programming as pernicious as the integer overflow. It can be difficult to detect your program has an integer overflow issue, many times it will silently corrupt your memory and possibly open you up to a buffer overflow. If you get lucky, your program will terminate shortly after it overflows with a segmentation fault. So how can we tell when an integer overflow has occurred? It turns out the hardware can help.

Before we can find a solution, it is important to look at how an overflow manifests differently between unsigned and signed integers.

TL;DR - We can use the RFLAGS register on x86-64 processors to detect if we have an overflow, but compiler optimizations can make it difficult to use.

8-Bit Unsigned Integer Overflow 🔗

Unsigned integers store only the absolute value of a number and not the sign. An 8-bit unsigned integer would thus be able to represent the values 0 to 255.

When we add two numbers in decimal like, 5 + 7, we know that we write the 2 in the one’s column and carry the 1 to the ten’s column to get 12. The same carry operation happens in binary – but what if we need 9 bits to store the result of adding two 8-bit numbers?

  11111111 (255)
+ 00000010   (2)
----------
1|00000001   (1)

It turns out the most significant 1 is simply truncated, so the resulting value is 1 and not the 257 we were expecting. More broadly, the result of an unsigned integer will be result % (2^n), where n is the size of the original integer in bits. This overflow behavior is defined in the C specification so you can rely on getting this result.

8-Bit Signed Integer Overflow 🔗

Most modern computer architectures represent signed integers with two’s complement. This means the value ranges for a signed 8-bit integer are -128 to 127.

Two’s complement: if the number is positive represent it in binary as we normally would. If it is negative, flip the bits and add 1.

Let’s take a look at an integer overflow in action by adding two 8-bit signed integers, in this case holding the values 127 and 1.

  01111111  (127)
+ 00000001    (1)
----------
  10000000 (-128)

As you can guess, we were expecting positive 128, not negative 128. The same sign flip would happen in reverse - if we subtracted 1 from -128 we would get positive 127.

It should be noted this is an undefined behavior in C. While it is most likely the result you will get, it is not guaranteed and can change in the future. The reason this behavior is undefined is because the C standard does not require signed arithmetic to use two’s complement. A different representation of signed integers may overflow differently.

Thus we have seen two different types of integer overflows: unsigned integers get truncated, and signed integers ‘wrap around’.

A Software Approach 🔗

Before diving into the hardware based solution we should give a nod to the software solutions. There are quite a few Stack Overflow posts detailing methods for detecting overflows, from checking for the same result when the value is saved to a larger data type, to the typical bit twiddling techniques that are common in C.

A better choice would be to stick with using the builtin compiler functions. Both GCC and Clang provide fast, accurate, and portable functions that test for overflows.

The only wrinkle with the builtin solutions is they require different function calls depending on the type of mathematics operation being performed. If we want a more generic solution we have to look to the hardware for help. As it turns out the CPU detects both types of integer overflows automatically – we just need to know where to look.

Let The Hardware Do The Work 🔗

While the 16 general purpose registers on the x86-64 platform (e.g. eax, edx, …) get most of the attention, there are additional registers dedicated to supporting SIMD operations, holding the next instruction to be executed, and the status of the CPU. This latter register, called RFLAGS on 64-bit systems(x86-64) and ELFAGS on 32-bit systems(x86), will help us detect overflows.

We are interested in just two bits in the 64-bit RFLAGS register. The first, at offset zero, is the carry flag – this flag is set to 1 when the CPU encounters an overflow (carry) condition while performing a mathematical operation on unsigned integers and set to 0 if a carry condition was not encountered.

The second bit we are interested in is located at bit offset 11 which contains the overflow flag. It operates like the carry flag but represents an overflow condition for signed integers.

Below we can see a layout of the RFLAGS register; bits 32-63 are not shown. Again, we are interested in bits 0 and 11.

Reading RFLAGS in C 🔗

Unfortunately there is no function in C to access the RFLAGS register directly, so we will need to use a bit of simple inline assembly to read the status flags and determine whether the CPU encountered an overflow.

inline int overflowed() {
  uint64_t rflags;       // where we will store the RFLAGS value

  __asm__ (   
    "pushfq;"            // pushes RFLAGS onto the stack
    "popq %%rax;"        // pops the stack into the rax register
    "movq %%rax, %0;"    // copies rax into our RFLAGS variable
    : "=r" (rflags)      // output variables
    :                    // input variables
    : "%rax"             // clobbered/modified registers
  );

  return rflags & 0x801; // test for the overflow and carry bits
}

Now we can simply call the overflowed() function after we perform arithmetic on two integers to check for either an overflow or a carry. If the previous arithmetic operation overflowed or had a carry this function will return true.

#include <stdint.h>
#include <stdio.h>

#include "overflow.h"

int main() {
  int64_t max = INT64_MAX;  // defined in stdint.h
  int64_t result = 0;

  result = max + 1;

  if (overflowed()) {
    printf("\nPass - Integer Overflow");
  } else {
    printf("\nFail - Integer Overflow");
  }

  result = 1 + 1;

  if (!overflowed()) {
    printf("\nPass - No Integer Overflow");
  } else {
    printf("\nFail - No Integer Overflow");
  }

  return 0;
}

Running this program with the standard compile options should give us the output we expect.

Pass - Integer Overflow
Pass - No Integer Overflow

However if we allow the compiler to perform any level of optimization, by adding the optimization flag -O the function no longer works correctly.

Fail - Integer Overflow
Pass - No Integer Overflow

While this may be puzzling at first, the reason is fairly intuitive. This code only works if the CPU actually performs the calculation. The compiler realized the value stored in result was never used and could thus eliminate that dead code that would have caused an overflow.

Additionally the compiler might not even use the standard arithmetic instructions. For example, multiplying by factors of two can be performed by using the lea instruction which does not modify the EFLAGS register.

Those aren’t the only optimization pitfalls. Even if we call the overflowed() function from a piece of code that uses the value of an arithmetic operation we aren’t guaranteed an expected result either.

int square(int num) {
    int x = num * num;

    if (overflowed()) {
        return 0;
    } else {
        return x;
    }
}

This code seems perfectly reasonable and should force the compiler to perform the multiplication - but once again, the compiler outsmarts us.

_square:
  push       rbp
  mov        rbp, rsp
  pushfq
  pop        rax
  mov        rcx, rax  // reads from EFLAGS
  imul       edi, edi  // squares the variable x
  xor        eax, eax  // zeros out our return value
  test       cx, 0x801 // checks the value in EFLAGS
  cmove      eax, edi  // saves x to be returned
  pop        rbp
  ret

The compiler decides to inline and break apart our overflowed() function. First it reads from EFLAGS, then performs the multiplication, and finally tests to see if the overflow value was set. By moving the multiplication after checking EFLAGS, we wont actually know if there was an overflow. The compiler doesn’t understand our custom assembly relates to the arithmetic operation and thus thinks it is a safe change to make.

Gotchas 🔗

Any SIMD optimizations that may have been possible will no longer be used.
Arithmetic operations performed on types smaller than an int will cause that type to be promoted to a similar type as large as an int. Thus if you are adding two signed 8-bit numbers together, they will automatically be resized to the size of int, which will not trigger the normal overflow check. The size of an int is variable in C, but it must be at least 16-bits. Typically it is 32-bits in length, even if compiled on a 64-bit system.

Coda 🔗

+ Same function call independent of the arithmetic operation being performed, size of data, or type of integer.
- Not architecture independent, i.e. this code wont directly work on ARM platforms.
- Disables SIMD optimizations and enabling other optimizations can cause unexpected/wrong results.

The bottom line is simple - while using custom assembly to read RFLAGS is a fun exercise, stick with the compiler builtins. The code is available {% include file.html name=“overflow.h” text=“here” %}.

Updates 🔗

November 29, 2016 - Added more warnings about issues with optimizations turned on.
December 8, 2016 - Added disassembly showing optimization issues.

Citations 🔗

Geeks For Geeks: Integer Promotions in C - Accessed November, 2016
Stack Overflow: Causing a divide overflow error (x86) - Accessed November, 2016
Stack Overflow: Why is unsigned integer overflow defined behavior but signed integer overflow isn’t? - Accessed November, 2016

programming