Question

In a C++ project I'm working on, I have a flag kind of value which can have four values. Those four flags can be combined. Flags describe the records in database and can be:

  • new record
  • deleted record
  • modified record
  • existing record

Now, for each record I wish to keep this attribute, so I could use an enum:

enum { xNew, xDeleted, xModified, xExisting }

However, in other places in code, I need to select which records are to be visible to the user, so I'd like to be able to pass that as a single parameter, like:

showRecords(xNew | xDeleted);

So, it seems I have three possible appoaches:

#define X_NEW      0x01
#define X_DELETED  0x02
#define X_MODIFIED 0x04
#define X_EXISTING 0x08

or

typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;

or

namespace RecordType {
    static const uint8 xNew = 1;
    static const uint8 xDeleted = 2;
    static const uint8 xModified = 4;
    static const uint8 xExisting = 8;
}

Space requirements are important (byte vs int) but not crucial. With defines I lose type safety, and with enum I lose some space (integers) and probably have to cast when I want to do a bitwise operation. With const I think I also lose type safety since a random uint8 could get in by mistake.

Is there some other cleaner way?

If not, what would you use and why?

P.S. The rest of the code is rather clean modern C++ without #defines, and I have used namespaces and templates in few spaces, so those aren't out of question either.

Was it helpful?

Solution

Combine the strategies to reduce the disadvantages of a single approach. I work in embedded systems so the following solution is based on the fact that integer and bitwise operators are fast, low memory & low in flash usage.

Place the enum in a namespace to prevent the constants from polluting the global namespace.

namespace RecordType {

An enum declares and defines a compile time checked typed. Always use compile time type checking to make sure arguments and variables are given the correct type. There is no need for the typedef in C++.

enum TRecordType { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8,

Create another member for an invalid state. This can be useful as error code; for example, when you want to return the state but the I/O operation fails. It is also useful for debugging; use it in initialisation lists and destructors to know if the variable's value should be used.

xInvalid = 16 };

Consider that you have two purposes for this type. To track the current state of a record and to create a mask to select records in certain states. Create an inline function to test if the value of the type is valid for your purpose; as a state marker vs a state mask. This will catch bugs as the typedef is just an int and a value such as 0xDEADBEEF may be in your variable through uninitialised or mispointed variables.

inline bool IsValidState( TRecordType v) {
    switch(v) { case xNew: case xDeleted: case xModified: case xExisting: return true; }
    return false;
}

 inline bool IsValidMask( TRecordType v) {
    return v >= xNew  && v < xInvalid ;
}

Add a using directive if you want to use the type often.

using RecordType ::TRecordType ;

The value checking functions are useful in asserts to trap bad values as soon as they are used. The quicker you catch a bug when running, the less damage it can do.

Here are some examples to put it all together.

void showRecords(TRecordType mask) {
    assert(RecordType::IsValidMask(mask));
    // do stuff;
}

void wombleRecord(TRecord rec, TRecordType state) {
    assert(RecordType::IsValidState(state));
    if (RecordType ::xNew) {
    // ...
} in runtime

TRecordType updateRecord(TRecord rec, TRecordType newstate) {
    assert(RecordType::IsValidState(newstate));
    //...
    if (! access_was_successful) return RecordType ::xInvalid;
    return newstate;
}

The only way to ensure correct value safety is to use a dedicated class with operator overloads and that is left as an exercise for another reader.

OTHER TIPS

Forget the defines

They will pollute your code.

bitfields?

struct RecordFlag {
    unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};

Don't ever use that. You are more concerned with speed than with economizing 4 ints. Using bit fields is actually slower than access to any other type.

However, bit members in structs have practical drawbacks. First, the ordering of bits in memory varies from compiler to compiler. In addition, many popular compilers generate inefficient code for reading and writing bit members, and there are potentially severe thread safety issues relating to bit fields (especially on multiprocessor systems) due to the fact that most machines cannot manipulate arbitrary sets of bits in memory, but must instead load and store whole words. e.g the following would not be thread-safe, in spite of the use of a mutex

Source: http://en.wikipedia.org/wiki/Bit_field:

And if you need more reasons to not use bitfields, perhaps Raymond Chen will convince you in his The Old New Thing Post: The cost-benefit analysis of bitfields for a collection of booleans at http://blogs.msdn.com/oldnewthing/archive/2008/11/26/9143050.aspx

const int?

namespace RecordType {
    static const uint8 xNew = 1;
    static const uint8 xDeleted = 2;
    static const uint8 xModified = 4;
    static const uint8 xExisting = 8;
}

Putting them in a namespace is cool. If they are declared in your CPP or header file, their values will be inlined. You'll be able to use switch on those values, but it will slightly increase coupling.

Ah, yes: remove the static keyword. static is deprecated in C++ when used as you do, and if uint8 is a buildin type, you won't need this to declare this in an header included by multiple sources of the same module. In the end, the code should be:

namespace RecordType {
    const uint8 xNew = 1;
    const uint8 xDeleted = 2;
    const uint8 xModified = 4;
    const uint8 xExisting = 8;
}

The problem of this approach is that your code knows the value of your constants, which increases slightly the coupling.

enum

The same as const int, with a somewhat stronger typing.

typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;

They are still polluting the global namespace, though. By the way... Remove the typedef. You're working in C++. Those typedefs of enums and structs are polluting the code more than anything else.

The result is kinda:

enum RecordType { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;

void doSomething(RecordType p_eMyEnum)
{
   if(p_eMyEnum == xNew)
   {
       // etc.
   }
}

As you see, your enum is polluting the global namespace. If you put this enum in an namespace, you'll have something like:

namespace RecordType {
   enum Value { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;
}

void doSomething(RecordType::Value p_eMyEnum)
{
   if(p_eMyEnum == RecordType::xNew)
   {
       // etc.
   }
}

extern const int ?

If you want to decrease coupling (i.e. being able to hide the values of the constants, and so, modify them as desired without needing a full recompilation), you can declare the ints as extern in the header, and as constant in the CPP file, as in the following example:

// Header.hpp
namespace RecordType {
    extern const uint8 xNew ;
    extern const uint8 xDeleted ;
    extern const uint8 xModified ;
    extern const uint8 xExisting ;
}

And:

// Source.hpp
namespace RecordType {
    const uint8 xNew = 1;
    const uint8 xDeleted = 2;
    const uint8 xModified = 4;
    const uint8 xExisting = 8;
}

You won't be able to use switch on those constants, though. So in the end, pick your poison... :-p

Have you ruled out std::bitset? Sets of flags is what it's for. Do

typedef std::bitset<4> RecordType;

then

static const RecordType xNew(1);
static const RecordType xDeleted(2);
static const RecordType xModified(4);
static const RecordType xExisting(8);

Because there are a bunch of operator overloads for bitset, you can now do

RecordType rt = whatever;      // unsigned long or RecordType expression
rt |= xNew;                    // set 
rt &= ~xDeleted;               // clear 
if ((rt & xModified) != 0) ... // test

Or something very similar to that - I'd appreciate any corrections since I haven't tested this. You can also refer to the bits by index, but it's generally best to define only one set of constants, and RecordType constants are probably more useful.

Assuming you have ruled out bitset, I vote for the enum.

I don't buy that casting the enums is a serious disadvantage - OK so it's a bit noisy, and assigning an out-of-range value to an enum is undefined behaviour so it's theoretically possible to shoot yourself in the foot on some unusual C++ implementations. But if you only do it when necessary (which is when going from int to enum iirc), it's perfectly normal code that people have seen before.

I'm dubious about any space cost of the enum, too. uint8 variables and parameters probably won't use any less stack than ints, so only storage in classes matters. There are some cases where packing multiple bytes in a struct will win (in which case you can cast enums in and out of uint8 storage), but normally padding will kill the benefit anyhow.

So the enum has no disadvantages compared with the others, and as an advantage gives you a bit of type-safety (you can't assign some random integer value without explicitly casting) and clean ways of referring to everything.

For preference I'd also put the "= 2" in the enum, by the way. It's not necessary, but a "principle of least astonishment" suggests that all 4 definitions should look the same.

Here are couple of articles on const vs. macros vs. enums:

Symbolic Constants
Enumeration Constants vs. Constant Objects

I think you should avoid macros especially since you wrote most of your new code is in modern C++.

If possible do NOT use macros. They aren't too much admired when it comes to modern C++.

Enums would be more appropriate as they provide "meaning to the identifiers" as well as type safety. You can clearly tell "xDeleted" is of "RecordType" and that represent "type of a record" (wow!) even after years. Consts would require comments for that, also they would require going up and down in code.

With defines I lose type safety

Not necessarily...

// signed defines
#define X_NEW      0x01u
#define X_NEW      (unsigned(0x01))  // if you find this more readable...

and with enum I lose some space (integers)

Not necessarily - but you do have to be explicit at points of storage...

struct X
{
    RecordType recordType : 4;  // use exactly 4 bits...
    RecordType recordType2 : 4;  // use another 4 bits, typically in the same byte
    // of course, the overall record size may still be padded...
};

and probably have to cast when I want to do bitwise operation.

You can create operators to take the pain out of that:

RecordType operator|(RecordType lhs, RecordType rhs)
{
    return RecordType((unsigned)lhs | (unsigned)rhs);
}

With const I think I also lose type safety since a random uint8 could get in by mistake.

The same can happen with any of these mechanisms: range and value checks are normally orthogonal to type safety (though user-defined-types - i.e. your own classes - can enforce "invariants" about their data). With enums, the compiler's free to pick a larger type to host the values, and an uninitialised, corrupted or just miss-set enum variable could still end up interpretting its bit pattern as a number you wouldn't expect - comparing unequal to any of the enumeration identifiers, any combination of them, and 0.

Is there some other cleaner way? / If not, what would you use and why?

Well, in the end the tried-and-trusted C-style bitwise OR of enumerations works pretty well once you have bit fields and custom operators in the picture. You can further improve your robustness with some custom validation functions and assertions as in mat_geek's answer; techniques often equally applicable to handling string, int, double values etc..

You could argue that this is "cleaner":

enum RecordType { New, Deleted, Modified, Existing };

showRecords([](RecordType r) { return r == New || r == Deleted; });

I'm indifferent: the data bits pack tighter but the code grows significantly... depends how many objects you've got, and the lamdbas - beautiful as they are - are still messier and harder to get right than bitwise ORs.

BTW /- the argument about thread safety's pretty weak IMHO - best remembered as a background consideration rather than becoming a dominant decision-driving force; sharing a mutex across the bitfields is a more likely practice even if unaware of their packing (mutexes are relatively bulky data members - I have to be really concerned about performance to consider having multiple mutexes on members of one object, and I'd look carefully enough to notice they were bit fields). Any sub-word-size type could have the same problem (e.g. a uint8_t). Anyway, you could try atomic compare-and-swap style operations if you're desperate for higher concurrency.

Even if you have to use 4 byte to store an enum (I'm not that familiar with C++ -- I know you can specify the underlying type in C#), it's still worth it -- use enums.

In this day and age of servers with GBs of memory, things like 4 bytes vs. 1 byte of memory at the application level in general don't matter. Of course, if in your particular situation, memory usage is that important (and you can't get C++ to use a byte to back the enum), then you can consider the 'static const' route.

At the end of the day, you have to ask yourself, is it worth the maintenance hit of using 'static const' for the 3 bytes of memory savings for your data structure?

Something else to keep in mind -- IIRC, on x86, data structures are 4-byte aligned, so unless you have a number of byte-width elements in your 'record' structure, it might not actually matter. Test and make sure it does before you make a tradeoff in maintainability for performance/space.

If you want the type safety of classes, with the convenience of enumeration syntax and bit checking, consider Safe Labels in C++. I've worked with the author, and he's pretty smart.

Beware, though. In the end, this package uses templates and macros!

Do you actually need to pass around the flag values as a conceptual whole, or are you going to have a lot of per-flag code? Either way, I think having this as class or struct of 1-bit bitfields might actually be clearer:

struct RecordFlag {
    unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};

Then your record class could have a struct RecordFlag member variable, functions can take arguments of type struct RecordFlag, etc. The compiler should pack the bitfields together, saving space.

I probably wouldn't use an enum for this kind of a thing where the values can be combined together, more typically enums are mutually exclusive states.

But whichever method you use, to make it more clear that these are values which are bits which can be combined together, use this syntax for the actual values instead:

#define X_NEW      (1 << 0)
#define X_DELETED  (1 << 1)
#define X_MODIFIED (1 << 2)
#define X_EXISTING (1 << 3)

Using a left-shift there helps to indicate that each value is intended to be a single bit, it is less likely that later on someone would do something wrong like add a new value and assign it something a value of 9.

Based on KISS, high cohesion and low coupling, ask these questions -

  • Who needs to know? my class, my library, other classes, other libraries, 3rd parties
  • What level of abstraction do I need to provide? Does the consumer understand bit operations.
  • Will I have have to interface from VB/C# etc?

There is a great book "Large-Scale C++ Software Design", this promotes base types externally, if you can avoid another header file/interface dependancy you should try to.

If you are using Qt you should have a look for QFlags. The QFlags class provides a type-safe way of storing OR-combinations of enum values.

I would rather go with

typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;

Simply because:

  1. It is cleaner and it makes the code readable and maintainable.
  2. It logically groups the constants.
  3. Programmer's time is more important, unless your job is to save those 3 bytes.

Not that I like to over-engineer everything but sometimes in these cases it may be worth creating a (small) class to encapsulate this information. If you create a class RecordType then it might have functions like:

void setDeleted();

void clearDeleted();

bool isDeleted();

etc... (or whatever convention suits)

It could validate combinations (in the case where not all combinations are legal, eg if 'new' and 'deleted' could not both be set at the same time). If you just used bit masks etc then the code that sets the state needs to validate, a class can encapsulate that logic too.

The class may also give you the ability to attach meaningful logging info to each state, you could add a function to return a string representation of the current state etc (or use the streaming operators '<<').

For all that if you are worried about storage you could still have the class only have a 'char' data member, so only take a small amount of storage (assuming it is non virtual). Of course depending on the hardware etc you may have alignment issues.

You could have the actual bit values not visible to the rest of the 'world' if they are in an anonymous namespace inside the cpp file rather than in the header file.

If you find that the code using the enum/#define/ bitmask etc has a lot of 'support' code to deal with invalid combinations, logging etc then encapsulation in a class may be worth considering. Of course most times simple problems are better off with simple solutions...

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top