mark shroyer, dot com: this is where I keep my things

Site Navigation


Both true and false: a Zen moment with C

June 27, 2012 – 7:15 am

Update: Discussion on Hacker News

I ran into a really fun bug at work yesterday, where I discovered that my C program was branching down logically inconsistent code paths. After drinking another cup of coffee and firing up GDB I realized that somehow, a boolean variable in my code was simultaneously testing as both true and not true.

While I cannot reproduce the actual source code here, the effect was that code like

bool p;
 
/* ... */
 
if ( p )
    puts("p is true");
 
if ( ! p )
    puts("p is false");

would produce the output:

p is true
p is false

So what’s going on here?

Well it turns out that the authors of the C language specification (and the people who went on to implement compilers for it) were serious about the concept of undefined behavior. In particular, the result of attempting to use an uninitialized variable is undefined.

And in this case that’s exactly what happened: I had failed to properly initialize some memory. Easy, bug fixed. But what I think is interesting are the reasons this code failed in precisely the way that it did. In order to investigate that, we need to get specific.

On 64-bit Linux (Ubuntu 12.04), compiling the following program:

#include <stdio.h>
#include <stdbool.h>
 
int main(int argc, char *argv[])
{
    volatile bool p;
 
    if ( p )
        puts("p is true");
    else
        puts("p is not true");
 
    if ( ! p )
        puts("p is false");
    else
        puts("p is not false");
 
    return 0;
}

with GCC 4.6.3, using the command line:

$ gcc bool1.c -g0 -O0 -fno-dwarf2-cfi-asm -masm=intel -S -o bool1.asm

produces this (truncated) assembly language:

        .file   "bool1.c"
        .intel_syntax noprefix
        .section        .rodata
.LC0:
        .string "p is true"
.LC1:
        .string "p is not true"
.LC2:
        .string "p is false"
.LC3:
        .string "p is not false"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        push    rbp
.LCFI0:
        mov     rbp, rsp
.LCFI1:
        sub     rsp, 32
.LCFI2:
        mov     DWORD PTR [rbp-20], edi
        mov     QWORD PTR [rbp-32], rsi
        movzx   eax, BYTE PTR [rbp-1]
        test    al, al
        je      .L2
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        jmp     .L3
.L2:
        mov     edi, OFFSET FLAT:.LC1
        call    puts
.L3:
        movzx   eax, BYTE PTR [rbp-1]
        xor     eax, 1
        test    al, al
        je      .L4
        mov     edi, OFFSET FLAT:.LC2
        call    puts
        jmp     .L5
.L4:
        mov     edi, OFFSET FLAT:.LC3
        call    puts
.L5:
        mov     eax, 0
        leave
.LCFI3:
        ret

To perform the test if ( p ) here, first the stack variable is loaded into a 32-bit register with movzx eax, BYTE PTR [rbp-1], and then we use the instruction test al, al which sets the zero flag (ZF) if the lower eight bits of this value are zero. Next we execute the conditional jump je .L2, which jumps to print “p is not true” if ZF was set; otherwise we don’t jump, and “p is true” gets printed instead.

Next let’s examine the second test, if ( ! p ), at label .L3. This starts out the same by loading the boolean variable into register eax, but notice how the negation is handled. Rather than reorder the jumps or use jne instead of je, the compiler explicitly negates the boolean by performing a bitwise exclusive-or: xor eax, 1.

Normally this would be fine — a bool variable is only supposed to contain a value of zero or one, in which case its value can be negated by XOR with 1. When you cast to a bool at runtime, the compiler generates code to ensure only one or zero gets stored. For instance, the cast in this program:

#include <stdbool.h>
 
volatile char c = 0xff;
volatile bool p;
 
int main(int argc, char* argv[])
{
    p = (bool)c;
    return 0;
}

is implemented as the following four instructions:

        movzx   eax, BYTE PTR c[rip]
        test    al, al
        setne   al
        mov     BYTE PTR p[rip], al

wherein setne sets the register al to exactly 1 if the char contained any nonzero value, before saving the register’s 8-bit value to the boolean variable.

But the compiler affords us no such protection if we accidentally use an uninitialized value as a boolean. It doesn’t have to, it’s not the compiler’s responsibility; the result of using an uninitialized stack variable is undefined. And so if we somehow wind up with a value of e.g. 0x60 stored at the address of a bool variable (as I saw during my troubleshooting yesterday), both the variable and its negation (via exclusive or with 1) will be nonzero, and therefore test as true.

Interestingly, enabling optimization (-O2) in GCC causes the compiler to factor out the XOR and instead reorder the jumps, meaning this program actually behaves more robustly under compiler optimization (for certain definitions of “robust” anyway):

        .file   "bool1.c"
        .intel_syntax noprefix
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "p is true"
.LC1:
        .string "p is not true"
.LC2:
        .string "p is false"
.LC3:
        .string "p is not false"
        .section        .text.startup,"ax",@progbits
        .p2align 4,,15
        .globl  main
        .type   main, @function
main:
.LFB22:
        sub     rsp, 24
.LCFI0:
        movzx   eax, BYTE PTR [rsp+15]
        test    al, al
        je      .L2
        mov     edi, OFFSET FLAT:.LC0
        call    puts
.L3:
        movzx   eax, BYTE PTR [rsp+15]
        test    al, al
        je      .L7
        mov     edi, OFFSET FLAT:.LC3
        call    puts
.L5:
        xor     eax, eax
        add     rsp, 24
.LCFI1:
        ret

And of course when we compare char, int, or other multiple-bit values for truthiness, the compiler makes no such assumption that the value can be logically negated by bitwise XOR; instead it uses jne in place of the je instruction. (Maybe someone with more knowledge of the compiler can say why GCC with -O0 uses an xor at all, when testing the negation of a bool.)

Posted in Code | 59 replies

59 Responses to Both true and false: a Zen moment with C

  • “Doctor, it hurts when I do this!”
    “Don’t do that.”

  • Ryan says:

    Newer version of gcc should catch unitialized variables via -Wall

    $ cat uninitialized.c
    int main(void)
    {
    int foo;
    return foo;
    }
    $ gcc -Wall -o uninitialized uninitialized.c
    uninitialized.c: In function ‘main’:
    uninitialized.c:4:5: warning: ‘foo’ is used uninitialized in this function [-Wuninitialized]

    Personally I use -W -Wall -Wextra -pedantic or similar

    • Mark says:

      good advice, though I was compiling with -Wall already — in the real application, the boolean in question was an uninitialized struct element, which GCC did not warn me about.

      • Brent Wilson says:

        It’s too pedantic for some people (I don’t use it), but just as an FYI: -Weffc++

      • Angus says:

        Thanks for a really interesting post, Mark.

        I was curious about how obscure/redirected the struct needed to be to defeat the compiler warning so I had a quick play. Answer: minimally obscure.
        http://pastebin.com/raw.php?i=TSS7VA1m

        Neither “gcc -Wall -Wextra” nor “g++ -Weffc++ -Wall Wextra” (thanks for the tip Brent) warn on that one, although it feels to me like they should be able to.

        Valgrind finds it every time, though, as you’d expect it to. *Hugs valgrind patiently*.

  • erdem says:

    There seems to be a problem with the -O2 gist, both are je into the puts.

    • Mark says:

      Sorry for being unclear, the -O2 gist is correct though. Where I saw jne was compiling with -O0 with the same compiler, but testing a char rather than a bool variable.

  • kikito says:

    This is yet another thing that golang improves over C. You can’t define “undefined” variables. Once you define them, they have a default value (i.e. 0 for integers, etc).

    • gogo says:

      So is Go perfectly specified, i.e., are there no places in the spec where any behavior at all is left up to the implementation?

      That seems like a good idea, except that it seems like it would make the specification about 5 times bigger.

      • George K says:

        Not necessarily. A fully and rigorously specified programming language does not entail a huge, complex language. Standard ML is the main example of a fully and rigorously specified language (specified in terms of exact mathematical constructs/operational semantics rather than imprecise prose). The _Definition of Standard ML_ (i.e., the print version of the specification) fully specifies the general purpose language in fewer than 128 pages. Considering that Standard ML has a fairly sophisticated module and type system, this is significant. Researchers have made great strides in applying this technique to model other existing languages (e.g., Featherweight Java) and discovering a good number of bugs in the process.

    • Derek says:

      C assumes that you know what you are doing. The weakness that you say is also the C’s strength. C is a system programming language and the reason why it is so common among operating systems it is because it is very close to machine language. When writing an OS you are often doing many operations that often don’t even make sense to an application programmer. It would be much harder to code that when a programming language is doing additional things behind your back.

      “Golang is better because it does not do this and that.” and so does JavaScript, Python, and ton of other languages, but you won’t write an Operating Systems using them.

      Also it’s not hard to modify compiler to always initialize defined variables. In fact C actually does that to all variables defined as static (because of their purpose). The reason C does some things that intuitively might not seem right for an application programmer is one of the reason why C is still relevant 40 years after it was made, and nothing replaced it so far. If you use C and it constantly does not do what you want then perhaps you are using the wrong tool for your job.

      • Geoffrey says:

        But that’s not even true of C. The Linux kernel had a ton of problems recently when they were trying to do null pointer checks (for security) and the C language defined that these checks don’t make sense and therefore empowered the compiler to optimize them out. That would have made perfect sense for an application program, but doesn’t make sense in the kernel.

        It would be nice to have a language where what you say is in fact what gets executed, and that is close to the machine language but offers structure, but C isn’t it. C’s the closest thing we have to it, though.

        • Dan Neely says:

          Is there a good writeup of what happened and how the kernel team forced GCC to do what they wanted instead?

      • Christian Sciberras says:

        Derek, that’s bullshit.

        You don’t write OSes in Python, JS etc, for architectural reasons as well as the fact that you’re most probably not going to have to either.

        How many CompSci courses ask students to write an OS in, say, C# or Pascal?

        C got famous for this task, true. But that’s the end of it. There’s no ulterior reasons why it is better at it than anyone else. In fact, given it’s time, it might even be a bad choice for this task nowadays.

    • dig says:

      This is not necessarily an improvement.

  • Nathaniel Reindl says:

    Instead of getting only one of two possible eigenstates here, you got both of them. I’d call that a successful quantum physics experiment.

    Have you considered changing careers? (;

  • sohn says:

    I ran into a really fun bug at work yesterday, where I discovered that my C program was branching down logically inconsistent code paths.

    LOL LET’S DECIDE THE PROGRAM FLOW DEPENDING ON UNINITIALISED VARIABLES

    And then, when the program fails…

    LOL C SUCKS

    • Mark says:

      Nope, I’m not claiming C sucks by any means — there are very good reasons we’re using it for this project, aside from the fact that it’s the default language in this space. I just thought it was fun to see what the compiler was doing under the hood here.

    • anonymous says:

      I think there’s something wrong with you.

      • Jan says:

        No, I suspect there is something lacking – if you have read all the way down to here – yet don’t “get” why someone would be interested in this.

    • Christian Sciberras says:

      Well, C sucks, uninitialized variables or not.

  • Fulea Ștefan says:

    First, you demonstrate ignorance talking about C language with the “bool” data type. “bool” isn’t a C keyword, it’s a C++ one. The most remotely similar keyword in C is “_Bool”, a C99 standard addition. The original C didn’t had any special Boolean data type. The decisions were taken whether the expression was or wasn’t 0, that was it. That leads to the second point (which you may consider it as just personal sentiment) – it isn’t really needed a Boolean type in C. That kind of addition is purely a change for the sake of change. An “int” or merely a “char” did the job just fine.

    • Mark says:

      bool is defined to _Bool in stdbool.h, as used above. Yes, this C, not C++, for what that’s worth.

      http://pubs.opengroup.org/onlinepubs/007904875/basedefs/stdbool.h.html

      • Anon says:

        Just because something is written in C, dose not mean it is a part of C. Looking at the link you provided is for a POSIX header, this is not a C problem.

    • nick says:

      It is needed because bool has some semantic value. By your logic, why have char if int can take care of it?

      • mpu says:

        It is needed because bool has some semantic value. By your logic, why have char if int can take care of it?

        int and char are conventionally different widths.

      • Joshua says:

        Int can’t take care of char. They are two different sizes.

        • Dougall says:

          And int and char can’t take care of bool – they have different ranges. By having this distinction at the language level, the compiler knows that the value is either 0 or 1, and it can produce far more sensible code. To negate a GCC int on x86, you need:


          test eax, eax
          jnz false

          true:
          mov eax, 1
          jmp end

          false:
          mov eax, 0

          end:

          To negate a GCC bool, you need:

          xor eax, 1

          • daef says:

            why not simply like:


            test eax, eax
            jz false
            mov eax, 0
            false:
            xor eax, 1

          • Dougall says:

            daef: At the end of your code, eax is always 1. Your intention may have been to mov eax, 1 – in which case it would work, and be slightly faster and branch slightly less – either way, we still need a branch and we can’t compare with the simplicity of the xor. Most of these speedups go away if we’re moving to another register at the same time. (I’m assuming this is why I haven’t come across your implementation in compiled C code.)

            It should be noted that this could also use the bool casting code from original post, followed by the xor. But all of these have slight overhead and add to the complexity of this extremely simple task. (Actually, xor is so fast that most of these probably take more than three times longer because of our looser types.)

  • Matt says:

    The real lesson here is that you should always compile with -Wall (at a minimum) and do not under any circumstances ignore compiler warnings.

    • Mark says:

      Absolutely, I agree! The real application in question is built with -Wall -Werror, in fact, but failed to catch this uninitialized boolean because it was an element of a struct (unlike in my simplified example).

  • Sam says:

    Fascinating seeing how this happens through assembly. Suddenly the “bizarre, makes no sense undefined behavior” thing makes sense– in fact it’s perfectly logical and almost even inevitable. Ahh the beauty of C, it will never grow old or stop amazing me.

  • concerned_bloke says:

    I’m saddened by the low-quality responses on here. Everyone seems either eager to a) try and bash the bloke, or b) argue about languages. All you idiots speaking in platitudes about this or that

    THE REAL QUESTION I WANTED IN THE COMMENTS WAS WHY IS THE COMPILER DOING THAT

    you’d think the internet’s brightest and best would be able to answer PLEASE POST TO STACKOVERFLOW where someone will know (or I will)

    • Joshua says:

      This way a compiler doesn’t have to worry about the context of the operation !p. If you were to store the result of that into another boolean the operation would be faster. There would be no need for branching statements. Suppose not. If you did something like this, q = !p, the resulting pseudo-assembly would be like this:

      load p
      test p, 0×00
      jne .false
      store q, 0
      j .done
      .false:
      store q, 1
      .done:

      In the fashion it is done, the code is:
      load p
      xor p, 1
      store q, p

      *Note, my storage pseudo-assembly commands store the second operand into the first

    • Jahn says:

      Completely agree, every second idiot has something to pedantic to argue about but I applaud Mark for having the curiosity to investigate and then share his findings. Interesting read, pity about the immature responses.

  • Matt Pope says:

    Great post, and some rather amusing undefined behavior, heh. I think this would have made me tear my hair out! Thanks for going into detail with the assembly, it’s something I’ve been interested in but never grasped the basic concepts of it so its mainly unintelligible nonsense to me. Helped me understand the overlaying behavior with the C though. Cheers!

  • grogenaut says:

    -Wall -Werr or stfu

  • bobx says:

    Let’s all agree to learn C before we write code.

  • Yves Daoust says:

    Nice finding and interesting analysis. Thanks.

    A smarter optimizer should have translated this code as a pure no-op (printing neither true nor false), as any operation using an uninitialized value can be optimized away !

    More seriously, I am always amazed to see how numerous well-behaved C programs there are around, despite then number of undefined behaviors present in them and that the compilers do not tell us about.

  • TedvG says:

    Why all the fuzz here? No matter what programming language one uses, one should always define and initialize all variables, structures etc.
    right?

  • Dougall says:

    (Maybe someone with more knowledge of the compiler can say why GCC with -O0 uses an xor at all, when testing the negation of a bool.)

    While I haven’t read any GCC source code or talked to a GCC developer, I have spent the past couple of months working a decompiler for x86 GCC code, which involves a lot of staring at compiler output and reading a handful of compiler optimisation papers.

    The short answer is that the binary negation on a “bool” type value is defined by gcc as “value ^ 1″. Probably because it’s significantly faster, at least when not conditional branching. (! (unary not) on numeric values relies on branching (test, cmp, jnz…) to get a 1 or 0 value, and “xor” is really fast)

    The longer answer:

    Compiler optimisations are procedures which analyse the flow of control and data throughout a program and modify it into a more optimal form. A good example of this is the removal of a variables from the stack into a register. While this sounds simple, every use of each variable must be analysed to ensure that a pointer to it is never required, and all of those should also be analysed to choose the most sensible ones to keep in registers. A very similar process is recognising constant local variables (whose value is never changed and whose address is never acquired) and substituting the value in every use throughout the functions. A more extreme example is finding a branch (if/else) after which the same operations are performed and moving those operations to before the branch.

    Code generation has it’s own “optimising” behaviours, but this does not look at the bigger picture or analyse surrounding code, and is not considered an optimisation in GCC. The best example of this is the % (modulo) operator, which is implemented in x86 as part of the large, inefficient ‘idiv’ instruction. If given a constant operand, however, GCC at -O0 will produce some of the most cryptic x86 I have seen a compiler produce. Note the complete absence of the number three in the assembly in the following example. If you’d like to explain how this works, be my guest.


    main(){
    volatile int i = 13;
    printf("%d", i % 3);
    }


    mov [ebp-0xc], 13 ; [ebp-0xc] is i
    mov eax, [ebp-0xc]
    mov ecx, esp ; in no way related to the modulo
    mov edx, 0xaaaaaaab
    mov [ebp-0x10], eax ; save eax
    imul edx
    mov eax, [ebp-0x10] ; restore eax
    lea edx, [edx+eax]
    mov esi, edx
    shr esi, 0x1f
    sar edx, 1
    lea edx, [edx+esi]
    mov esi, edx
    lea edx, [edx+esi*2]
    sub eax, edx
    mov [ecx+0x4], eax ; at this point, somehow, eax = i % 3
    mov [ecx], 0x1fa8 ; "%d"
    call 0x1f82

    Where the separation between optimisation and code-generation lies is unclear to me. It may be somewhat arbitrary or based solely on where things are easiest to implement (without slowing down compilation of course). Certain multiplication/addition patterns will sometimes get converted to ‘lea’ and ‘shl’ instructions (some x86 Arithmetic Idioms) at higher optimisation levels, but are often left untouched without optimisation. Perhaps this is because they involve a larger scope (not just handling one operator), because they are special cases applicable only to certain values, or because there are so many different methods that must be weighed up for each case.

  • Francesco says:

    And you are very lucky indeed since “a Ninja could have entered your place and kill you” since this would have fit the definition of undefined behaviour as well.

    Anyway, I agree with everyone else in the thread on the fact that you should have used -Wall and -pedantic.

  • Tom says:

    Curiously, gcc 4.6.3 on Ubuntu 32-bit in a VMware VM produces identical assembler but a different result – the stack seems to end up zero-initialised and so the variable is correctly false.

    • Mark says:

      Sorry that I wasn’t as clear as I should have been about this… yep, the simplified example I presented above will not exhibit this behavior, since it performs its tests before the process has done anything to muck up its stack address space (and as you observed, the process’s initial stack address space is zeroed out by the kernel). This works the same on x86-64, too.

      That example was only meant to show the kind of machine code produced, which would cause that behavior to be exhibited if the stack contained arbitrary non-zero data.

      I’ve added a better (but longer) example here, which intentionally causes a stack write before performing this test, in order to demonstrate the actual behavior. This works on x86-64 Linux when compiled without optimization, anyway:

      https://gist.github.com/3001231#file_bool2.c

  • T800 says:

    VC++ will give warning for uninitialized variables :P

    • Tom says:

      Hmmm. Either you didn’t bother to read any of the other comments, or you did and just can’t resist a snide remark anyway. Try compiling this in VC++2010:


      #include
      #include

      typedef unsigned char bool;

      struct foo {
      int x;
      float y;
      bool z;
      };

      int main() {
      struct foo *a = malloc(sizeof(struct foo));
      if(a->z)
      puts("z is true\n");
      else
      puts("z is not true\n");

      if(!a->z)
      puts("z is false\n");
      else
      puts("z is not false\n");
      }

      Do you see a compiler warning? Nope, me neither.

  • Duncan says:

    Schroedinger’s Boolean!

  • Alex says:

    Ahh forum trolls, gotta love… well… actually hate them.

    This is a fantastic article, not from any attempt at conveying the lesson that you shouldn’t use uninitialized variables which the trolls seem to have pounced upon, but rather from sharing some very interesting information about what actually happens when you do.

    It’s the fundamental difference between knowing not to do something (shallow knowledge) vs. understanding why you shouldn’t do something and what will happen if you choose to ignore, or forget to follow, what you know (deep knowledge). I found it a very interesting read.

    Keep up the good work!

  • Ernie says:

    Real programmers program in assembly language! LOL

  • Resuna says:

    They should never have created a “bool” type in the first place.

  • khem says:

    Well boolean type is new in c99 and its size is not defined usually
    Its assigned the smallest size which is byte
    But since its included as fundamental type compiler can assume
    it to be having either 0 or 1

    Best would be if you got a warning here but compiler may not catch
    All cases

  • asf says:

    A similar thing can happen in the JVM. It’s an unlikely scenario, but it may occur when code from two different compilers are running together, if the two compilers disagree on whether >1 values are valid integers. All Java compilers I know of will only generate 0-or-1 booleans, but the JVM allows any byte to be a boolean.

    Although the Java specification says there are only two boolean values, it is possible for the Java expression “x == true || x == false” to evaluate to false when compiled with a mainstream Java compiler.

  • GrayGaffer says:

    Other than defining boolean tests as returning false for 0 else true, the problem is not so much with the language itself as with the conventional implementations. Even without the uninitialized variable issues, there is still a problem: (x & TRUE) needs to be FALSE only if x is 0, but the only way to guarantee that is if TRUE is defined as int -1. But the convention is using something like typedef enum { FALSE, TRUE ) BOOL; (take your pick on spelling or even if enums are used), which by the language definition uses the smallest type that can express all the values in the enum. As someone who started off as a hardware logic engineer then moved on to becoming a programmer, I can tell you it was a painful surprise discovering this, when I was expecting bit-wise Boolean algebra to work with the compiler-defined Boolean values. It plain does not work.

    And, s I further was a datacomms specialist, it was equally painful discovering the problems with enum sizing – you cannot use enums to define fields in a data packet – or equivalently define a struct over a data packet – and expect the fields to line up properly while at the same time expecting to be able to use enums for defining the states of fields like commands and status. Which means you cannot user the innate knowledge in the debuggers to display such fields using the symbols from an enum set. Bear in mind that in datacomms one has to handle line ordered data packets passing between or through nodes of wildly differing architectures programmed in wildly different languages. Including down to hardware bit level languages – some protocols were defined for hardware handling in the bad old days, but hung around to the days when hardware became generic and code took over the parsing job. Think SDLC/HDLC families (from which IP and X.25 descend, amongst others), with bit fields and multi-byte encoded values. Initially coded in Z80 or 8048 assembler.

    It frankly sucks. But kept me gainfully employed for many years.

  • Ashbad says:

    Interesting read, didn’t know that GCC assembled operations on bool types to be like that under x86. And, of course, it’s a good testament to always make sure you initialize everything.

  • Justin says:

    I don’t program in C on a regular basis, but I find that C always has the most interesting and peculiar bugs. Fun to read about.

  • Balakrishnan B says:

    Doesn’t it cost in run-time to default initialize all stack variables? Is there some way to prevent that?

  • I did some programming in C but didn’t know anything about this. So much to learn with C even today.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>