There are lots of opinions about flags in databases, so the common answer is "well, it depends on what you want your RDBMS to be doing".
The student information system I work with on a daily basis has a status flag in the base student table. The legal values are A - Active, I - Inactive, P - Pre-registered, and G - Graduated. There's no validation or lookup table for this; it's hard-coded in the application. While that's a problem relationally, the application works perfectly. A student always has exactly one status, and there's no situation not covered by the existing status list. You could add a `regtb_status` lookup table and a foreign key constraint on the student registration table, but it wouldn't add much to this application.
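For comparison, here is a minimal sketch of what the lookup-table version could look like, run through Python's built-in `sqlite3` module so it is self-contained. Only `regtb_status` comes from the description above; the `student` table layout and the column names are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE regtb_status (
        status_code CHAR(1) PRIMARY KEY,
        description TEXT NOT NULL
    );
    INSERT INTO regtb_status VALUES
        ('A', 'Active'), ('I', 'Inactive'),
        ('P', 'Pre-registered'), ('G', 'Graduated');

    -- Hypothetical student table: exactly one non-null status per student.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        status     CHAR(1) NOT NULL REFERENCES regtb_status (status_code)
    );
""")

conn.execute("INSERT INTO student VALUES (1, 'A')")  # legal status

try:
    conn.execute("INSERT INTO student VALUES (2, 'X')")  # not in the lookup table
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The only thing the constraint buys you is that the database, rather than the application, rejects an unknown status code.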
For your Booking example, I would have a current status field in the Booking table itself. I would prefer a character field so I could support the statuses that I know I might need: A - Active, C - Cancelled by Customer, I - Invalid, D - Deleted by Staff, etc. If you back the field with a validation table, you can even give customers access to that table so they can create custom statuses if they want. It depends on the workflow you're envisioning and what your customers want.
Elsewhere in the same system, there are a lot of status flag fields that are hard-coded `CHAR(1)` fields holding Y - Yes or N - No. You probably should use your RDBMS's boolean type for these flags, but unless you're dealing with a ridiculous number of records or need to worry about internationalization, it's not going to be an issue. The tables holding these flags typically also function as junction tables. For example, the table that relates students to contacts includes status flags for whether the contact is living with the student, the type of contact (guardian, emergency contact), what the contact's relationship is to the student (mother, father, aunt, etc.), whether or not that contact should have access to the student in the parent website, the order of priority of the contacts, whether the parent should receive report cards in the mail, etc. This particular table is somewhat cumbersome simply because it has over a dozen flag fields, but the multi-valued options, such as relationship type, are completely configurable in validation/lookup tables within the application, and the column names are, at least in part, self-documenting. From a report-writing standpoint that's invaluable.
We have a few fields that are stored in user-defined tables, which actually store everything in an EAV table in the DB. These cause a problem because, often, the particular EAV record doesn't exist until the school explicitly sets it. The application behaves as though null = No, but that can make writing reports and even searching in the application difficult. You can't look for `field = 'N'`. You have to look for `field = 'N' OR field IS NULL`. In the application's search system, you have to specify `field <> 'Y'` because it doesn't handle nulls well in all cases. This is very confusing for users who can't wrap their heads around three-valued logic. It's also fairly irritating for a DBA because the best way to look at the data, a view, is not easily updated.
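The missing-row problem is easy to reproduce. The sketch below uses Python's built-in `sqlite3` with a made-up EAV table (`student_attr`) and a made-up attribute (`bus_rider`); neither name comes from the real system. Student 3 has no EAV row at all, so the `LEFT JOIN` yields NULL for that student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY);

    -- Hypothetical EAV table: a row exists only once the value has been set.
    CREATE TABLE student_attr (
        student_id INTEGER NOT NULL,
        attr_name  TEXT    NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (student_id, attr_name)
    );

    INSERT INTO student VALUES (1), (2), (3);
    INSERT INTO student_attr VALUES
        (1, 'bus_rider', 'Y'),
        (2, 'bus_rider', 'N');   -- student 3 was never set
""")

BASE = """
    SELECT s.student_id
    FROM student s
    LEFT JOIN student_attr a
           ON a.student_id = s.student_id
          AND a.attr_name = 'bus_rider'
    WHERE {predicate}
    ORDER BY s.student_id
"""

def ids(predicate):
    """Return the student ids matching the given WHERE predicate."""
    return [row[0] for row in conn.execute(BASE.format(predicate=predicate))]

only_n    = ids("a.attr_value = 'N'")                          # misses student 3
n_or_null = ids("a.attr_value = 'N' OR a.attr_value IS NULL")  # finds 2 and 3
not_y     = ids("a.attr_value <> 'Y'")                         # NULL <> 'Y' is unknown
coalesced = ids("COALESCE(a.attr_value, 'N') = 'N'")           # treats missing as 'N'
```

Note that in plain SQL `<> 'Y'` skips the missing row just as `= 'N'` does, so the search behaviour described above is specific to that application. `COALESCE` is one common way to spell "null means No" once, for instance inside a reporting view.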
In my experience, bitmasks are almost always incorrect. They're very cumbersome and expensive to query against, not self-documenting, and generally a tremendous pain in the tail. I would rather see a series of `BIT`/`BOOLEAN` or `CHAR` fields any day than a bitmask. If it has multiple attributes in a single field, it's going to be a tremendous problem.
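To make the contrast concrete, here is a small sketch using Python's built-in `sqlite3`. Both tables, the column names, and the bit assignments (1 = lives with student, 2 = web access, 4 = gets report card) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Bitmask design: the meaning of each bit lives outside the schema.
    CREATE TABLE contact_bitmask (
        contact_id INTEGER PRIMARY KEY,
        flags      INTEGER NOT NULL
    );
    INSERT INTO contact_bitmask VALUES (1, 5), (2, 2), (3, 6);

    -- One column per flag: the schema documents itself.
    CREATE TABLE contact_columns (
        contact_id         INTEGER PRIMARY KEY,
        lives_with_student INTEGER NOT NULL CHECK (lives_with_student IN (0, 1)),
        web_access         INTEGER NOT NULL CHECK (web_access IN (0, 1)),
        gets_report_card   INTEGER NOT NULL CHECK (gets_report_card IN (0, 1))
    );
    INSERT INTO contact_columns VALUES
        (1, 1, 0, 1),
        (2, 0, 1, 0),
        (3, 0, 1, 1);
""")

# Who gets a report card? The reader has to know that bit 4 means "report card".
from_bitmask = [r[0] for r in conn.execute(
    "SELECT contact_id FROM contact_bitmask "
    "WHERE (flags & 4) <> 0 ORDER BY contact_id")]

# The same question against named columns.
from_columns = [r[0] for r in conn.execute(
    "SELECT contact_id FROM contact_columns "
    "WHERE gets_report_card = 1 ORDER BY contact_id")]
```

Both queries return the same contacts, but only the second can be read without a separate key to the bits, and a predicate on a bitwise expression typically cannot use an ordinary index on the column.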
For your SubscribersTwitterHandles question, I guess I'm a little confused. Why didn't they just add a column to the existing table? Is it a one-to-many relationship, or are there multiple Twitter Handle fields? Either your customers haven't given you their handle -- in which case it's explicitly `''` -- or it's the handle they gave you.
I guess my real question from a design standpoint is: are we creating flags or tags? In my mind a flag is something that has a one-to-one relationship with an existing entity in the database. That entity might be the junction between two entities, or the flag might sit on the entity itself, but it always has a non-null value.
Tags, on the other hand, are arbitrary, potentially many-to-one or many-to-many, and in most situations are completely defined by the customer as an ad hoc means to group records.
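One way to picture the difference, as a rough sketch using Python's built-in `sqlite3`: the flag is a `NOT NULL` column on the entity, while tags live in their own table and attach through a junction table. The `tag` and `booking_tag` tables and the tag names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flag: one non-null value per booking, stored on the entity itself.
    CREATE TABLE booking (
        booking_id INTEGER PRIMARY KEY,
        status     CHAR(1) NOT NULL DEFAULT 'A'
    );

    -- Tags: arbitrary, user-defined, many-to-many via a junction table.
    CREATE TABLE tag (
        tag_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE booking_tag (
        booking_id INTEGER NOT NULL REFERENCES booking (booking_id),
        tag_id     INTEGER NOT NULL REFERENCES tag (tag_id),
        PRIMARY KEY (booking_id, tag_id)
    );

    INSERT INTO booking (booking_id) VALUES (1), (2);
    INSERT INTO tag VALUES (1, 'vip'), (2, 'follow-up');
    INSERT INTO booking_tag VALUES (1, 1), (1, 2);   -- booking 2 has no tags
""")

# Every booking has exactly one status...
statuses = conn.execute(
    "SELECT booking_id, status FROM booking ORDER BY booking_id").fetchall()

# ...but anywhere from zero to many tags.
tag_counts = conn.execute("""
    SELECT b.booking_id, COUNT(bt.tag_id)
    FROM booking b
    LEFT JOIN booking_tag bt ON bt.booking_id = b.booking_id
    GROUP BY b.booking_id
    ORDER BY b.booking_id
""").fetchall()
```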