Achieving type safety with enums in C is problematic, since they are essentially just integers. In fact, the standard defines enumeration constants to be of type int.

To achieve a bit of type safety I do tricks with pointers like this:

typedef enum
{
  BLUE,
  RED
} color_t;

void color_assign (color_t* var, color_t val) 
{ 
  *var = val; 
}

Pointers have stricter type rules than values, so this prevents code such as this:

int x; 
color_assign(&x, BLUE); // compiler error

But it doesn't prevent code like this:

color_t color;
color_assign(&color, 123); // garbage value

This is because an enumeration constant is essentially just an int, and any int can be implicitly assigned to an enumeration variable.

Is there a way to write such a function or macro color_assign, that can achieve complete type safety even for enumeration constants?

2

4 Answers

58

It is possible to achieve this with a few tricks. Given

typedef enum
{
  BLUE,
  RED
} color_t;

Then define a dummy union which won't be used by the caller, but contains members with the same names as the enumeration constants:

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

This is possible because enumeration constants and member/variable names reside in different namespaces.

Then make some function-like macros:

#define c_assign(var, val) (var) = (typesafe_color_t){ .val = val }.val
#define color_assign(var, val) _Generic((var), color_t: c_assign(var, val))

These macros are then called like this:

color_t color;
color_assign(color, BLUE); 

Explanation:

  • The C11 _Generic keyword ensures that the enumeration variable is of the correct type. However, this can't be used on the enumeration constant BLUE because it is of type int.
  • Therefore the helper macro c_assign creates a temporary instance of the dummy union, where the designated initializer syntax is used to assign the value BLUE to a union member named BLUE. If no such member exists, the code won't compile.
  • The union member of the corresponding type is then copied into the enum variable.

We don't actually need the helper macro; I just split the expression for readability. It works just as well to write

#define color_assign(var, val) _Generic((var), \
color_t: (var) = (typesafe_color_t){ .val = val }.val )

Examples:

color_t color; 
color_assign(color, BLUE);// ok
color_assign(color, RED); // ok

color_assign(color, 0);   // compiler error 

int x;
color_assign(x, BLUE);    // compiler error

typedef enum { foo } bar_t;
bar_t bar;
color_assign(color, foo); // compiler error
color_assign(bar, BLUE);  // compiler error
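
For reference, here is a minimal sketch that assembles the pieces above into one translation unit, assuming a C11 compiler. The function pick is a hypothetical helper that is not part of the answer; it exists only to exercise the macro.

```c
typedef enum
{
  BLUE,
  RED
} color_t;

/* Dummy union: one member per enumeration constant, spelled identically. */
typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

/* _Generic checks the type of var; the designated initializer checks
   that val names an existing union member. */
#define color_assign(var, val) _Generic((var), \
  color_t: (var) = (typesafe_color_t){ .val = val }.val )

/* Hypothetical helper (an assumption, not from the answer). */
color_t pick (int want_red)
{
  color_t color;

  if (want_red)
    color_assign(color, RED);
  else
    color_assign(color, BLUE);

  /* color_assign(color, 1); would not compile: ".1" is not a member */
  return color;
}
```

Because val is also used as the member name, the macro only accepts the literal name of an enumeration constant; a variable holding a color_t cannot be passed as the second argument.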

EDIT

Obviously the above doesn't prevent the caller from simply typing color = garbage;. If you wish to entirely block the possibility of using such assignment of the enum, you can put it in a struct and use the standard procedure of private encapsulation with "opaque type":

color.h

#include <stdlib.h>

typedef enum
{
  BLUE,
  RED
} color_t;

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

typedef struct col_t col_t; // opaque type

col_t* col_alloc (void);
void   col_free (col_t* col);

void col_assign (col_t* col, color_t color);

#define color_assign(var, val)   \
  _Generic( (var),               \
    col_t*: col_assign((var), (typesafe_color_t){ .val = val }.val) \
  )

color.c

#include "color.h"

struct col_t
{
  color_t color;
};

col_t* col_alloc (void) 
{ 
  return malloc(sizeof(col_t)); // (needs proper error handling)
}

void col_free (col_t* col)
{
  free(col);
}

void col_assign (col_t* col, color_t color)
{
  col->color = color;
}

main.c

#include "color.h"

int main (void)
{
  col_t* color = col_alloc();
  if (color == NULL) // allocation failed
  {
    return 1;
  }

  color_assign(color, BLUE);

  col_free(color);
  return 0;
}
7
  • This is really cute, although it won't catch some mistakes: int zonk(int x) {color_t color; color = x; return color;}
    – gsg
    Commented Mar 27, 2017 at 11:08
  • @gsg You will obviously have to disallow direct assignments. That can be achieved by, for example, embedding the enum in a struct and then making the struct an opaque type.
    – Lundin
    Commented Mar 27, 2017 at 11:15
  • @gsg I added an example with private encapsulation which blocks direct assignments.
    – Lundin
    Commented Mar 27, 2017 at 11:33
  • Am I missing something, or is it impossible to use color_assign with a value from a variable, typed or otherwise? Since the macro also uses the "expression" as the field name. How do you actually do anything with these values in a type-safe way? Commented Mar 27, 2017 at 17:56
  • @anicicn It creates a temporary variable of the union type through a so-called compound literal (C99 feature), then initializes a specific member of this temporary union variable (through designated initializers, another C99 feature). If no member with a matching name exists in the union, then the code won't compile. If the member matches, as in the case with RED, the union member RED will get assigned the value RED. By typing .val in the end, the code accesses that very member and copies it into the destination variable. In practice I believe most of this code will get optimized away.
    – Lundin
    Commented Mar 29, 2017 at 9:03
9

The top answer's pretty good, but it has the downsides that it requires a lot of the C99 and C11 feature set in order to compile, and on top of that, it makes assignment pretty unnatural: You have to use a magic color_assign() function or macro in order to move data around instead of the standard = operator.

(Admittedly, the question explicitly asked about how to write color_assign(), but if you look at the question more broadly, it's really about how to change your code to get type-safety with some form of enumerated constants, and I'd consider not needing color_assign() in the first place to get type-safety to be fair game for the answer.)

Pointers are among the few shapes that C treats as type-safe, so they make a natural candidate for solving this problem. So I'd attack it this way: Rather than using an enum, I'd sacrifice a little memory to be able to have unique, predictable pointer values, and then use some really hokey funky #define statements to construct my "enum" (yes, I know macros pollute the macro namespace, but enum pollutes the compiler's global namespace, so I consider it close to an even trade):

color.h:

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

extern struct color_struct_t color_dummy_array[];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(0)
#define GREEN  UNIQUE_COLOR(1)
#define BLUE   UNIQUE_COLOR(2)

enum { MAX_COLOR_VALUE = 2 };

This does, of course, require that you have just enough memory reserved somewhere to ensure nothing else can ever take on those pointer values:

color.c:

#include "color.h"

/* This never actually gets used, but we need to declare enough space in the
 * BSS so that the pointer values can be unique and not accidentally reused
 * by anything else. */
struct color_struct_t color_dummy_array[MAX_COLOR_VALUE + 1];

But from the consumer's perspective, this is all hidden: color_t is very nearly an opaque object. You can't assign anything to it other than valid color_t values and NULL:

user.c:

#include <stddef.h>
#include "color.h"

void foo(void)
{
    color_t color;

    color = RED;    /* OK */
    color = GREEN;  /* OK */
    color = NULL;   /* OK */
    color = 27;     /* Error/warning */
}

This works well in most cases, but it does have the problem of not working in switch statements; you can't switch on a pointer (which is a shame). But if you're willing to add one more macro to make switching possible, you can arrive at something that's "good enough":

color.h:

...

#define COLOR_NUMBER(c) \
    ((c) - color_dummy_array)

user.c:

...

void bar(color_t c)
{
    switch (COLOR_NUMBER(c)) {
        case COLOR_NUMBER(RED):
            break;
        case COLOR_NUMBER(GREEN):
            break;
        case COLOR_NUMBER(BLUE):
            break;
    }
}
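
One caveat about the switch above: case labels must be integer constant expressions, and a difference of two pointers is not guaranteed to qualify as one, so some compilers may reject case COLOR_NUMBER(RED). A more portable variant could be sketched as follows, assuming named index constants. RED_INDEX, GREEN_INDEX, BLUE_INDEX, COLOR_COUNT and color_name are hypothetical names that are not part of the answer, and everything is placed in one file for brevity.

```c
#include <stddef.h>

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

/* Hypothetical index constants (an assumption, not from the answer). */
enum { RED_INDEX, GREEN_INDEX, BLUE_INDEX, COLOR_COUNT };

/* Reserves one distinct address per color. */
struct color_struct_t color_dummy_array[COLOR_COUNT];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(RED_INDEX)
#define GREEN  UNIQUE_COLOR(GREEN_INDEX)
#define BLUE   UNIQUE_COLOR(BLUE_INDEX)

#define COLOR_NUMBER(c) \
    ((c) - color_dummy_array)

/* Hypothetical example function: the case labels are plain enumeration
 * constants, which are always integer constant expressions. */
const char *color_name(color_t c)
{
    if (c == NULL) {
        return "none";   /* check NULL before doing pointer arithmetic */
    }

    switch (COLOR_NUMBER(c)) {
        case RED_INDEX:   return "RED";
        case GREEN_INDEX: return "GREEN";
        case BLUE_INDEX:  return "BLUE";
        default:          return "?";
    }
}
```

The pointer subtraction still happens at run time in the switch expression, where it is allowed; only the labels had to change.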

Is this a good solution? I wouldn't call it great, since it both wastes some memory and pollutes the macro namespace, and it doesn't let you use enum to automatically assign your color values, but it is another way to solve the problem that results in somewhat more natural usages, and unlike the top answer, it works all the way back to C89.

8
  • Using C11 features is not a legitimate downside. Commented Mar 27, 2017 at 17:58
  • It is a downside if your compiler doesn't support C11 features. I won't name any names (coughMicrosoftcough) but there are a number of "C" compilers out there that can't handle C11. Commented Mar 27, 2017 at 18:11
  • Interesting idea. You should consider hiding away the struct definition entirely though, so nobody gets the idea of using it or accessing the members. This can be done with opaque types as shown with the edit in my answer. Also, if you declare the struct const you don't waste space in .bss but rather in .rodata or something like that. And what about color_t color = 0;? Or worse: any expression that evaluates to 0 at compile time.
    – Lundin
    Commented Mar 28, 2017 at 7:06
  • const is definitely a good idea; putting it in the text/read-only segments is worth a little more effort. That said, in similar scenarios, I've often named the internal property opaque_, which certainly isn't foolproof, but which has been more than good enough in the past to keep dirty hands out of the cookie jar. Commented Mar 28, 2017 at 15:00
  • As for the zero issue, that's an issue, to be sure, but it's an issue that's shared with every other pointer type all over C: Yes, you can write color = 0, but it's not the same as color = BLACK, any more than string = 0 is the same as string = "". A switch statement can even identify NULL and either handle it in its default case, or even have a special case for NULL itself. You end up with an enum that effectively has an additional non-value, but considering how many real-world enums have something like DEFAULT = 0 already, I don't consider that a detriment to the technique. Commented Mar 28, 2017 at 15:04
8

One could enforce type safety with a struct:

struct color { enum { THE_COLOR_BLUE, THE_COLOR_RED } value; };
const struct color BLUE = { THE_COLOR_BLUE };
const struct color RED  = { THE_COLOR_RED  };

Since color is just a wrapped integer, it can be passed by value or by pointer as one would do with an int. With this definition of color, color_assign(&val, 3); fails to compile with:

error: incompatible type for argument 2 of 'color_assign'

     color_assign(&val, 3);
                        ^

Full (working) example:

#include <stdio.h>

struct color { enum { THE_COLOR_BLUE, THE_COLOR_RED } value; };
const struct color BLUE = { THE_COLOR_BLUE };
const struct color RED  = { THE_COLOR_RED  };

void color_assign (struct color* var, struct color val) 
{ 
  var->value = val.value; 
}

const char* color_name(struct color val)
{
  switch (val.value)
  {
    case THE_COLOR_BLUE: return "BLUE";
    case THE_COLOR_RED:  return "RED";
    default:             return "?";
  }
}

int main(void)
{
  struct color val;
  color_assign(&val, BLUE);
  printf("color name: %s\n", color_name(val)); // prints "BLUE"
}

Play with it online (demo).
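
Since C supports assignment between structs of the same type, the wrapper also works with the plain = operator. A minimal sketch, assuming the same struct color as above; color_equal and swap_color are hypothetical helpers that are not part of the answer, and the constants are made static so that the sketch stays within a single file.

```c
struct color { enum { THE_COLOR_BLUE, THE_COLOR_RED } value; };
static const struct color BLUE = { THE_COLOR_BLUE };
static const struct color RED  = { THE_COLOR_RED  };

/* Hypothetical helper: structs cannot be compared with ==,
   so compare the wrapped value instead. */
int color_equal (struct color a, struct color b)
{
  return a.value == b.value;
}

/* Hypothetical example: returns RED for BLUE and BLUE for anything else. */
struct color swap_color (struct color c)
{
  struct color result;

  result = color_equal(c, BLUE) ? RED : BLUE; /* plain struct assignment */
  /* result = 3; would not compile: an int cannot be assigned to a struct */

  return result;
}
```

The price is that comparisons need a helper or explicit access to .value, and that result.value = 3; is still accepted unless the struct is hidden behind an opaque type.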

3
  • I believe this forces you to use different names for the enums and the const structs though.
    – Lundin
    Commented Mar 27, 2017 at 11:16
  • @Lundin It does: each color gets a private (or internal) name (enum) and a public one (const struct). I don't see it as a drawback.
    – YSC
    Commented Mar 27, 2017 at 11:18
  • I really like your solution, but I think it can be made even simpler, because the assignment operator works with structs. This means that you could get rid of the color_assign() function completely, and perform a "right to the point" hardcore struct color mycol; mycol=BLUE; (or am I missing something and some safety would be lost by going that way?)
    – cesss
    Commented Oct 31, 2022 at 19:18
7

Ultimately, what you want is a warning or error when you use an invalid enumeration value.

As you say, the C language cannot do this. However you can easily use a static analysis tool to catch this problem - Clang is the obvious free one, but there are plenty of others. Regardless of whether the language is type-safe, static analysis can detect and report the problem. Typically a static analysis tool puts up warnings, not errors, but you can easily have the static analysis tool report an error instead of a warning, and change your makefile or build project to handle this.

5
  • Obviously static analysers are always an option, for example a MISRA-C:2012 checker would catch enum type issues. The main problem with all static analysers on the market is that they are so full of bugs/"false positives" that they are not very useful. If you can force a compiler diagnostic by any standard C compiler, that's always the preferred solution.
    – Lundin
    Commented Mar 27, 2017 at 13:18
  • @Lundin My experience of static analysis isn't that it's full of bugs, but that idiomatic C will frequently break coding standards - "if(ptr)" as a check for non-NULL, for example. Much of the effort of static analysis does have to go into refining your ruleset. OTOH, once you've done that, then you have a very powerful tool which really will improve your code.
    – Graham
    Commented Mar 27, 2017 at 16:25
  • @Lundin Adding redundant functions and macros to code seems to be increasing complexity, ultimately reducing the code quality. The time spent implementing and reworking previous code IMHO is better spent using the static analysis tools.
    – B. Wolf
    Commented Mar 27, 2017 at 23:13
  • @Graham if(ptr) is a sloppy but widespread practice rather than idiomatic C, which would be if(ptr != NULL). Anyway, this isn't why most static analysers are bad, but rather scenarios such as type x; type_init(&x); And then you get "warning! x is not initialized when passed to the function!". Yes... thanks for letting me know that my variable isn't initialized, before it is initialized. As in, a failure to properly analyse across translation units.
    – Lundin
    Commented Mar 28, 2017 at 6:49
  • @B.Wolf Ideally you will have multiple ways of bug prevention. If you have compile-time assertion and manual code review and static analysis, you improve code quality much more than if you don't have all of those.
    – Lundin
    Commented Mar 28, 2017 at 6:51
