C Language Enum Tips & Tricks

The C language provides a built-in enum type for listing a set of named values. The default implementation, however, leaves much to be desired. In this article I'm going to list some useful techniques I discovered while working with C over the past few years.

The Basics

For all the examples in this article, I'm using the C11 revision of the language, although the majority of the code will work in earlier standards as well.

The most straightforward enum definition looks like this:

enum Fruit {
  Apple,
  Banana,
  Orange,
};

In C, enum and struct tags live in their own namespace, so to declare a variable or a field of this type you have to use enum Fruit as the type. If you want to avoid that, a typedef can be used:

typedef enum {
  Apple,
  Banana,
  Orange,
} Fruit;

This makes the enum anonymous, as there is no identifier after enum, but associates the full definition with the name Fruit. Older compilers did not support anonymous enum types, so you might see a dummy name used instead. This is the case for Win32 headers, for example. Here's what that looks like:

typedef enum DUMMY_FRUIT {
  Apple,
  Banana,
  Orange,
} Fruit;

When the value of the enum constituent is not specified it is chosen to be equal to the previous one plus one. If the first variant does not have an explicit value, it is set to 0. So the previous definition is equivalent to:

typedef enum {
  Apple = 0,
  Banana = 1,
  Orange = 2,
} Fruit;
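The same "previous plus one" rule applies after any explicit value, not just at the start. Here is a minimal sketch of that, using a hypothetical Priority enum that is not part of the fruit examples:

```c
#include <assert.h> // provides the static_assert macro in C11

// Hypothetical enum used only for this illustration.
typedef enum {
  Low = 5,
  Medium,      // 6: previous value plus one
  High = 10,
  Critical,    // 11: counting resumes from the last explicit value
} Priority;

static_assert(Medium == 6, "Medium follows Low");
static_assert(Critical == 11, "Critical follows High");
```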

Sometimes it is useful to reserve 0 as an uninitialized value, so you might see this variant:

typedef enum {
  Apple = 1,
  Banana, // 2
  Orange, // 3
} Fruit;
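To illustrate why reserving 0 helps, here is a small sketch. The Basket struct and the fruit_is_valid helper are names I made up for this example; the point is only that zero-initialized memory no longer looks like a real variant.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
  Apple = 1,
  Banana, // 2
  Orange, // 3
} Fruit;

static_assert(Apple != 0, "0 is reserved for 'uninitialized'");

// Hypothetical container type for the illustration.
typedef struct {
  Fruit fruit;
  int count;
} Basket;

// Hypothetical helper: true only for values that name a real variant.
static bool fruit_is_valid(Fruit fruit) {
  return fruit >= Apple && fruit <= Orange;
}
```

With this in place, a Basket created with `Basket basket = {0};` holds a fruit value of 0, which fruit_is_valid rejects until a real variant is assigned.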

Enum types are also sometimes used for flags where each variant has a single bit set:

typedef enum {
  Read    = 0b001,
  Write   = 0b010,
  Execute = 0b100,
} Permissions;

Binary literals were standardized only in C23 (before that they were a compiler extension), so you will often see bit shifts used for the same purpose:

typedef enum {
  Read    = 1 << 0,
  Write   = 1 << 1,
  Execute = 1 << 2,
} Permissions;

Usage of enum types for flags is a bit controversial because now the variants are no longer exclusive, i.e. the value can have both Read and Write flags set at the same time. Since there is nothing better in C, I think it is an OK use case, but I will not focus on it in this article.
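For completeness, this is roughly how such flags get combined and tested. The has_permission and read_write helpers are hypothetical names for this sketch, not something C provides:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
  Read    = 1 << 0,
  Write   = 1 << 1,
  Execute = 1 << 2,
} Permissions;

static_assert((Read | Write | Execute) == 7, "each flag owns exactly one bit");

// Hypothetical helper: true when every bit of `flag` is set in `value`.
static bool has_permission(Permissions value, Permissions flag) {
  return (value & flag) == flag;
}

// ORing two variants produces a plain integer; the cast documents the
// conversion back to the enum type.
static Permissions read_write(void) {
  return (Permissions)(Read | Write);
}
```

Note that the combined value 3 is not one of the listed variants, which is exactly the non-exclusivity concern mentioned above.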

Name Strings

It is a very common need to be able to print the names of the enum values. C does not provide any built-in way to get the corresponding name from an enum value though. The only thing you can do is print it as a number:

#include <stdio.h>

typedef enum {
  Apple,
  Banana,
  Orange,
} Fruit;

int main(void) {
  Fruit fruit = Apple;
  printf("%d", fruit);
  return 0;
}

This is not really useful, either for the user or for debugging purposes. You can of course write a function that gives you back the name based on the value:

const char *fruit_name(Fruit fruit) {
  switch(fruit) {
    case Apple: return "Apple";
    case Banana: return "Banana";
    case Orange: return "Orange";
    default:
      assert(!"Unknown fruit");
      return "Unknown"; // only reached when assertions are compiled out
  }
}

Having to maintain this mapping function is error-prone: you can add a new fruit type to the definition but forget to update the function. Luckily, modern compilers provide a warning that verifies all enum values are handled inside a switch statement. In GCC and Clang it is enabled via the -Wswitch-enum command line flag. MSVC has two similar warnings: C4062, which a default case silences, and C4061, which fires even when a default case is present. I prefer the latter even though it leads to a larger switch statement.
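One possible layout that plays well with these warnings is to drop the default label and handle unexpected values after the switch. This is a sketch of that idea rather than the only way to do it. With no default present, even the plain -Wswitch warning (which GCC enables with -Wall) reports a missing case:

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
  Apple,
  Banana,
  Orange,
} Fruit;

// No `default` label here, so adding a variant to Fruit without adding a
// case makes GCC and Clang warn under -Wswitch or -Wswitch-enum.
const char *fruit_name(Fruit fruit) {
  switch (fruit) {
    case Apple: return "Apple";
    case Banana: return "Banana";
    case Orange: return "Orange";
  }
  // Only reached for values outside the enum, e.g. after a bad cast.
  assert(!"Unknown fruit");
  return NULL;
}
```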

Another desynchronization case can happen if you are using an IDE to rename one of the values of an enum. The IDE will update the variant usages, but not the string with the name, so you can end up with something like:

case Pear: return "Apple";

CLion IDE has an option to search for the renamed value in strings, but oftentimes the results can be too noisy.

A solution to all of these problems is to use an X Macro. First, we start with defining a macro like the following:

#define FRUIT_ENUM(VARIANT)\
  VARIANT(Apple)           \
  VARIANT(Banana)          \
  VARIANT(Orange)

The idea is to have a function-like macro that accepts another function-like macro as an argument and calls it with each of the values. Here's how we use it:

#define FRUIT_ENUM_VARIANT(NAME) NAME,

typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

The macro FRUIT_ENUM_VARIANT outputs the identifier of the variant followed by a comma, making it a valid enum variant definition. If we expand the macros using the -E command line flag in GCC or Clang, we can see that the output is what we expect:

typedef enum {
  Apple, Banana, Orange,
} Fruit;

C function-like macros can also turn their arguments into string literals with the # operator. We can use this feature to define our fruit_name function:

#define FRUIT_ENUM_STRING(NAME) case NAME: return #NAME;

const char *fruit_name(Fruit fruit) {
  switch(fruit) {
    FRUIT_ENUM(FRUIT_ENUM_STRING)
    default:
      assert(!"Unknown fruit");
      return "Unknown"; // only reached when assertions are compiled out
  }
}

This is pretty neat! When we add or remove entries from the FRUIT_ENUM macro the rest of the code will always match.
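As a variation on the same idea, the X macro can also fill a lookup table instead of a switch. This is a minimal sketch that assumes the variants use automatic values (so they are small and dense); FRUIT_ENUM_TABLE_ENTRY, fruit_names and fruit_name_or_null are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define FRUIT_ENUM(VARIANT) \
  VARIANT(Apple)            \
  VARIANT(Banana)           \
  VARIANT(Orange)

#define FRUIT_ENUM_VARIANT(NAME) NAME,
typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

// Expands to `[Apple] = "Apple", [Banana] = "Banana", [Orange] = "Orange",`
#define FRUIT_ENUM_TABLE_ENTRY(NAME) [NAME] = #NAME,
static const char *const fruit_names[] = {
  FRUIT_ENUM(FRUIT_ENUM_TABLE_ENTRY)
};

static_assert(sizeof(fruit_names) / sizeof(fruit_names[0]) == 3,
              "one name per variant");

// Hypothetical table-based lookup: returns NULL for out-of-range values
// instead of asserting.
const char *fruit_name_or_null(Fruit fruit) {
  size_t index = (size_t)fruit;
  size_t count = sizeof(fruit_names) / sizeof(fruit_names[0]);
  return index < count ? fruit_names[index] : NULL;
}
```

The table is indexed by the enum value, so this form stops being practical once the variants get large or sparse explicit values; the switch version does not have that limitation.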

You can also add a reverse function pretty easily:

#define FRUIT_ENUM_FROM_STRING(NAME)          \
  if (strcmp(string, #NAME) == 0) return NAME;

Fruit fruit_from_string(const char *string) {
  FRUIT_ENUM(FRUIT_ENUM_FROM_STRING)
  assert(!"Unknown fruit");
  return (Fruit)-1; // arbitrary invalid value, only reached when assertions are compiled out
}
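If the string comes from user input, asserting on an unknown name may be too harsh. A possible alternative, sketched here with the hypothetical names FRUIT_ENUM_TRY_FROM_STRING and fruit_try_from_string, reports failure through the return value instead:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FRUIT_ENUM(VARIANT) \
  VARIANT(Apple)            \
  VARIANT(Banana)           \
  VARIANT(Orange)

#define FRUIT_ENUM_VARIANT(NAME) NAME,
typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

#define FRUIT_ENUM_TRY_FROM_STRING(NAME) \
  if (strcmp(string, #NAME) == 0) { *out = NAME; return true; }

// Hypothetical variant of fruit_from_string: writes the match to *out and
// returns false (leaving *out untouched) when no variant has that name.
bool fruit_try_from_string(const char *string, Fruit *out) {
  assert(string != NULL && out != NULL);
  FRUIT_ENUM(FRUIT_ENUM_TRY_FROM_STRING)
  return false;
}
```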

It is also possible to define FRUIT_ENUM so that each variant has an explicit value:

#define FRUIT_ENUM(VARIANT)\
  VARIANT(Apple, 1)        \
  VARIANT(Banana, 2)       \
  VARIANT(Orange, 3)

Unlike with a manual enum definition, you cannot easily mix and match automatic and manual values here without pretty horrible macro hacks.

Making this change will break all of our previous code as VARIANT receives 2 arguments, but our implementations expect only one. For the enum definitions we have to adjust the macro to take the value as well:

#define FRUIT_ENUM_VARIANT(NAME, VALUE) NAME = (VALUE),

typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

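For the rest of the usages we can rely on variadic macros, which lets us write definitions that work either way. Keep in mind that strict C99 and C11 require at least one argument for the ..., so the single-argument form may produce a warning in pedantic mode: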

#define FRUIT_ENUM_STRING(NAME, ...) case NAME: return #NAME;
#define FRUIT_ENUM_FROM_STRING(NAME, ...)     \
  if (strcmp(string, #NAME) == 0) return NAME;
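Put together, the value-carrying version might look like the following sketch. It only assembles the pieces shown so far, plus a return after the assert so the function is well defined when assertions are disabled:

```c
#include <assert.h>
#include <stddef.h>

#define FRUIT_ENUM(VARIANT) \
  VARIANT(Apple, 1)         \
  VARIANT(Banana, 2)        \
  VARIANT(Orange, 3)

#define FRUIT_ENUM_VARIANT(NAME, VALUE) NAME = (VALUE),
typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

// `...` swallows the value argument, which this expansion does not need.
#define FRUIT_ENUM_STRING(NAME, ...) case NAME: return #NAME;
const char *fruit_name(Fruit fruit) {
  switch (fruit) {
    FRUIT_ENUM(FRUIT_ENUM_STRING)
    default:
      assert(!"Unknown fruit");
      return NULL; // only reached when assertions are compiled out
  }
}
```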

Enum Size

Knowing the count of variants in the enum can be quite handy. The way to do this without X macros is:

typedef enum {
  Apple,
  Banana,
  Orange,
  
  Fruit__COUNT
} Fruit;

With this setup Fruit__COUNT will have the value 3 as expected. There is a problem though. We are no longer able to start the count at 1 or use any specific value without the danger of breaking:

typedef enum {
  Apple = 40,
  Banana = 50,
  Orange = 70,
  
  Fruit__COUNT // 71?!
} Fruit;

Even if you only have automatic values, a variant that is not truly a variant makes the switch warnings I described above more annoying to use, as you now have to include a case for the count.

You can of course define the count separately either via a #define or an anonymous enum, but you would then need to maintain it manually leading to all the same problems as with the string name function discussed above:

typedef enum {
  Apple = 40,
  Banana = 50,
  Orange = 70,
} Fruit;

#define FRUIT_COUNT 3

We can solve this with X macros though. Our FRUIT_ENUM definition allows mapping each variant to any sequence of tokens even if it is the same sequence for all the variants. Counting things is just adding + 1 for each of them. Putting these two things together we get:

#define PLUS_ONE(...) + 1

#define FRUIT_COUNT (0 FRUIT_ENUM(PLUS_ONE))

Now when you use FRUIT_COUNT it will expand to (0 + 1 + 1 + 1), giving us the number 3 as expected no matter what the value of each variant is.

The only problem with this approach is that if you use FRUIT_COUNT a lot you are forcing the compiler to evaluate that expression in each usage. This can be especially bad for the compilation speed if the enum is quite large.

You can work around this issue by using an anonymous enum with a single variant that will hold the count:

#define PLUS_ONE(...) + 1

enum { FRUIT_COUNT = (0 FRUIT_ENUM(PLUS_ONE)) };
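Because the count is an integer constant expression, it can size arrays. Here is a sketch that builds a table of all variants, which is handy for iterating over an enum with sparse values; FRUIT_ENUM_LIST_ENTRY and all_fruits are hypothetical names for this example.

```c
#include <assert.h>

#define FRUIT_ENUM(VARIANT) \
  VARIANT(Apple, 40)        \
  VARIANT(Banana, 50)       \
  VARIANT(Orange, 70)

#define FRUIT_ENUM_VARIANT(NAME, VALUE) NAME = (VALUE),
typedef enum {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

#define PLUS_ONE(...) + 1
enum { FRUIT_COUNT = (0 FRUIT_ENUM(PLUS_ONE)) };

// Expands to `Apple, Banana, Orange,` so the table lists every variant.
#define FRUIT_ENUM_LIST_ENTRY(NAME, ...) NAME,
static const Fruit all_fruits[FRUIT_COUNT] = {
  FRUIT_ENUM(FRUIT_ENUM_LIST_ENTRY)
};

static_assert(FRUIT_COUNT == 3, "three variants regardless of their values");
```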

Smaller Storage Types

The C standard has this paragraph describing enum types:

The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.

On most modern systems int is a 32-bit integer, and compilers typically make the enum type itself the same size. Considering that the majority of enum types only have a handful of values, this is quite wasteful.

Sadly, there is no portable standard way to reduce the size of an enum in C11. C++ supports it with enum Foo : char {...}, and C23 adopts the same syntax, but that does not work in earlier revisions of C. So what can we do?

If you are only targeting GCC and Clang, there is a non-standard attribute that forces the enum to take only as many bytes as its values require:

typedef enum __attribute__ ((__packed__)) {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
} Fruit;

static_assert(sizeof(Fruit) == 1, "Fruit must be a byte");

I'm using static_assert here (in C11 it is a macro provided by <assert.h>) to verify at compile time that the size of the type is what I expect it to be. I highly recommend using static_assert whenever you can to verify the assumptions you make in your code. The first argument to static_assert is a boolean condition, the second one is the message string.

The code above is not portable and will fail in MSVC and many other C compilers. The only other option is to use a different typedef for the user type:

typedef unsigned char Fruit;
enum Fruit {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
};

static_assert(sizeof(Fruit) == 1, "Fruit must be a byte");

This will work as expected, but there are a couple of new problems as well. The first and the more important one is the potential overflow. This can happen if we define some of our enum values to be larger than what the underlying type can fit:

#define FRUIT_ENUM(VARIANT)\
  VARIANT(Apple, 30000)    \
  VARIANT(Banana, 2)       \
  VARIANT(Orange, 3)

int main(void) {
  Fruit f = Apple; // Oops, this will overflow
  return f;
}

Depending on the compiler and the warnings you set, you might get a warning, but you might not. It would be great to make sure this does not compile. We can actually do that with a combination of a static assert and a trick similar to the one we used for counting.

The first thing we need is the maximum value that fits into the actual type we use for storage:

#define MAX_VALUE_OF_SIZE(SIZE)            \
    ((SIZE) == 1 ? 0xFFull :               \
     (SIZE) == 2 ? 0xFFFFull :             \
     (SIZE) == 4 ? 0xFFFFFFFFull :         \
     (SIZE) == 8 ? 0xFFFFFFFFFFFFFFFFull : \
     0ull) // Unsupported size: fails the assert for any non-zero value

Next we add a static assert for each of the values of the enum:

#define ASSERT_FRUIT_VARIANT_FITS(NAME, VALUE)                          \
  static_assert(                                                        \
    ((unsigned long long)(VALUE)) <= (MAX_VALUE_OF_SIZE(sizeof(Fruit))),\
    "Fruit::" #NAME " enum variant is too big"                          \
  );

FRUIT_ENUM(ASSERT_FRUIT_VARIANT_FITS)

The cast to unsigned long long here is quite important. Converting every value to the largest unsigned integer type means a single comparison is enough: a negative value wraps around to a very large number and fails the check just like a value that is too big, which is what we want for an unsigned storage type. Otherwise, we would have to check against a negative minimum value as well.
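For reference, here is how the pieces fit together in one file. This is a sketch with values that do fit into a byte, so it compiles; the MAX_VALUE_OF_SIZE variant used here is parenthesized and uses ull suffixes, which is my adjustment for robustness.

```c
#include <assert.h>

#define FRUIT_ENUM(VARIANT) \
  VARIANT(Apple, 1)         \
  VARIANT(Banana, 2)        \
  VARIANT(Orange, 255)

typedef unsigned char Fruit;

#define FRUIT_ENUM_VARIANT(NAME, VALUE) NAME = (VALUE),
enum Fruit {
  FRUIT_ENUM(FRUIT_ENUM_VARIANT)
};

// Largest unsigned value that fits in SIZE bytes (1, 2, 4 or 8).
#define MAX_VALUE_OF_SIZE(SIZE)            \
  ((SIZE) == 1 ? 0xFFull :                 \
   (SIZE) == 2 ? 0xFFFFull :               \
   (SIZE) == 4 ? 0xFFFFFFFFull :           \
   (SIZE) == 8 ? 0xFFFFFFFFFFFFFFFFull : 0ull)

#define ASSERT_FRUIT_VARIANT_FITS(NAME, VALUE)                       \
  static_assert(                                                     \
    (unsigned long long)(VALUE) <= MAX_VALUE_OF_SIZE(sizeof(Fruit)), \
    "Fruit::" #NAME " enum variant is too big");

// One static_assert per variant; every value here is at most 255.
FRUIT_ENUM(ASSERT_FRUIT_VARIANT_FITS)
```

Swapping in the earlier definition with Apple set to 30000 makes the assertion for that variant fire.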

With this code in place the compilation is going to fail as expected with a pretty readable error message:

"Fruit::Apple enum variant is too big"

The only remaining problem concerns debugging. If you try to look at a variable of type Fruit in the debugger, it will treat it as a char and not show the enum variant name:

(gdb) print f
$1 = 2 '\002'

If the value is not needed too often when debugging, just manually casting it to the enum can be good enough:

(gdb) print (enum Fruit)f
$2 = Banana

If you want this to happen automatically you can define a natvis file for MSVC or a pretty-printer for gdb.

Summary

The C language is quite old and offers only the most basic features. At the same time, the tricks I described in this article fill in a lot of the missing features, making enums pretty pleasant to use.