Both true and false: a Zen moment with C

Additional discussion: Hacker News, Reddit

I ran into a really fun bug at work yesterday, where I discovered that my C program was branching down logically inconsistent code paths. After drinking another cup of coffee and firing up GDB I realized that somehow, a boolean variable in my code was simultaneously testing as both true and not true.

While I cannot reproduce the actual source code here, the effect was that code like

bool p;

/* ... */

if ( p )
    puts("p is true");

if ( ! p )
    puts("p is false");

would produce the output:

p is true
p is false

So what’s going on here?

Well it turns out that the authors of the C language specification (and the people who went on to implement compilers for it) were serious about the concept of undefined behavior. In particular, the result of attempting to use an uninitialized variable is undefined.

And in this case that’s exactly what happened: I had failed to properly initialize some memory. Easy, bug fixed. But what I think is interesting are the reasons this code failed in precisely the way that it did. In order to investigate that, we need to get specific.

On 64-bit Linux (Ubuntu 12.04), compiling the following program:

#include <stdio.h>
#include <stdbool.h>

int main(int argc, char *argv[])
{
    volatile bool p;

    if ( p )
        puts("p is true");
    else
        puts("p is not true");

    if ( ! p )
        puts("p is false");
    else
        puts("p is not false");

    return 0;
}

with GCC 4.6.3, using the command line:

$ gcc bool1.c -g0 -O0 -fno-dwarf2-cfi-asm -masm=intel -S -o bool1.asm

produces this (truncated) assembly language:

        .file   "bool1.c"
        .intel_syntax noprefix
        .section        .rodata
.LC0:
        .string "p is true"
.LC1:
        .string "p is not true"
.LC2:
        .string "p is false"
.LC3:
        .string "p is not false"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        push    rbp
.LCFI0:
        mov     rbp, rsp
.LCFI1:
        sub     rsp, 32
.LCFI2:
        mov     DWORD PTR [rbp-20], edi
        mov     QWORD PTR [rbp-32], rsi
        movzx   eax, BYTE PTR [rbp-1]
        test    al, al
        je      .L2
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        jmp     .L3
.L2:
        mov     edi, OFFSET FLAT:.LC1
        call    puts
.L3:
        movzx   eax, BYTE PTR [rbp-1]
        xor     eax, 1
        test    al, al
        je      .L4
        mov     edi, OFFSET FLAT:.LC2
        call    puts
        jmp     .L5
.L4:
        mov     edi, OFFSET FLAT:.LC3
        call    puts
.L5:
        mov     eax, 0
        leave
.LCFI3:
        ret

To perform the test if ( p ) here, first the stack variable is loaded into a 32-bit register with movzx eax, BYTE PTR [rbp-1], and then we use the instruction test al, al which sets the zero flag (ZF) if the lower eight bits of this value are zero. Next we execute the conditional jump je .L2, which jumps to print “p is not true” if ZF was set; otherwise we don’t jump, and “p is true” gets printed instead.

Next let’s examine the second test, if ( ! p ), at label .L3. This starts out the same by loading the boolean variable into register eax, but notice how the negation is handled. Rather than reorder the jumps or use jne instead of je, the compiler explicitly negates the boolean by performing a bitwise exclusive-or: xor eax, 1.

Normally this would be fine — a bool variable is only supposed to contain a value of zero or one, in which case its value can be negated by XOR with 1. When you cast to a bool at runtime, the compiler generates code to ensure only one or zero gets stored. For instance, the cast in this program:

#include <stdbool.h>

volatile char c = 0xff;
volatile bool p;

int main(int argc, char* argv[])
{
    p = (bool)c;
    return 0;
}

is implemented as the following four instructions:

movzx   eax, BYTE PTR c[rip]
test    al, al
setne   al
mov     BYTE PTR p[rip], al

wherein setne sets the register al to exactly 1 if the char contained any nonzero value, before saving the register’s 8-bit value to the boolean variable.

But the compiler affords us no such protection if we accidentally use an uninitialized value as a boolean. It doesn’t have to, it’s not the compiler’s responsibility; the result of using an uninitialized stack variable is undefined. And so if we somehow wind up with a value of e.g. 0x60 stored at the address of a bool variable (as I saw during my troubleshooting yesterday), both the variable and its negation (via exclusive or with 1) will be nonzero, and therefore test as true.

Interestingly, enabling optimization (-O2) in GCC causes the compiler to factor out the XOR and instead reorder the jumps, meaning this program actually behaves more robustly under compiler optimization (for certain definitions of “robust” anyway):

        .file   "bool1.c"
        .intel_syntax noprefix
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "p is true"
.LC1:
        .string "p is not true"
.LC2:
        .string "p is false"
.LC3:
        .string "p is not false"
        .section        .text.startup,"ax",@progbits
        .p2align 4,,15
        .globl  main
        .type   main, @function
main:
.LFB22:
        sub     rsp, 24
.LCFI0:
        movzx   eax, BYTE PTR [rsp+15]
        test    al, al
        je      .L2
        mov     edi, OFFSET FLAT:.LC0
        call    puts
.L3:
        movzx   eax, BYTE PTR [rsp+15]
        test    al, al
        je      .L7
        mov     edi, OFFSET FLAT:.LC3
        call    puts
.L5:
        xor     eax, eax
        add     rsp, 24
.LCFI1:
        ret

And of course when we compare char, int, or other multiple-bit values for truthiness, the compiler makes no such assumption that the value can be logically negated by bitwise XOR; instead it uses jne in place of the je instruction. (Maybe someone with more knowledge of the compiler can say why GCC with -O0 uses an xor at all, when testing the negation of a bool.)