Question

I was reading this article on the BBC. It tells the story of a person named Jennifer Null and the day-to-day problems she faces when using online systems such as plane ticket booking and net banking.

I am not well versed in databases and do not use them very often. When I built a website as a learning exercise, the server-side form validation used regular expressions. From what I remember, it would happily accept the name "Null", although I have not actually tried it.

Could someone explain the technical circumstances in which this situation would occur? Is the form validation just doing a string == NULL or something? Even so, I do not think NULL is the same as "NULL".

Solution

I have seen database interfaces (e.g. framework libraries) that return 'null' as a string for null columns. I believe there was a flag that would turn this on or off for debugging. The flag lets developers easily determine whether an empty field was the result of a null value or an empty string. This is a bad setting, especially in production, and would explain the issues described in the article.
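To make that concrete, here is a minimal Python sketch of the failure mode. It does not model any particular library: render_row, parse_row and the null_as_text flag are all hypothetical names invented for the illustration.

```python
def render_row(row, null_as_text=False):
    """Turn column values into display strings.

    null_as_text is a hypothetical debug flag, not a real library option.
    """
    rendered = {}
    for column, value in row.items():
        if value is None:
            # Debug mode makes a missing value visible as the text "null".
            rendered[column] = "null" if null_as_text else ""
        else:
            rendered[column] = str(value)
    return rendered


def parse_row(rendered):
    """Naive reverse step: any text spelling 'null' becomes a missing value."""
    return {
        column: (None if text.lower() == "null" else text)
        for column, text in rendered.items()
    }


missing_surname = {"first_name": "Jennifer", "last_name": None}
real_surname = {"first_name": "Jennifer", "last_name": "Null"}

debug_view = render_row(missing_surname, null_as_text=True)
round_trip = parse_row(render_row(real_surname, null_as_text=True))
```

Once the text "null" is allowed to stand in for a missing value anywhere in the code base, the reverse conversion can no longer tell the two cases apart, and a real surname is read back as missing.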

The reverse processing of converting 'null' to a null value should generate an application error for a name field. I would expect this to be rather quickly resolved.

OTHER TIPS

There's a good chance that a good chunk of your confusion stems from the journalist's. The article talks about problems using entire application systems, not just databases. Completely reasonable since this is a piece of writing aimed at mass consumption, but technical details are glossed over or misunderstood by the author.

Likely a number of these issues are caused at the application layer, rather than the DB's API. Magic values are an anti-pattern which is ridiculously hard to stamp out of the industry. Very easily some programmer could have written a condition along the lines of "someone typed 'null'? They must mean there's no value, because that's what null means!" A misguided attempt at preventing SQL injection could also be responsible for the mentioned mistreatment of Null, or the Hawaiian last name which contains a single quote, which is also the standard SQL string delimiter.
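As a rough illustration of both mistakes, the following Python sketch uses the standard sqlite3 module. The two "buggy" helpers and the customers table are made up for the example, and "O'Neil" is simply a stand-in for any surname containing an apostrophe.

```python
import sqlite3


def normalize_surname_buggy(text):
    # Anti-pattern: the typed text "null" is treated as a magic value
    # meaning "nothing was entered".
    if text.strip().lower() in ("", "null"):
        return None
    return text


def strip_quotes_buggy(text):
    # Misguided injection "defence": silently remove quotes from the input.
    return text.replace("'", "")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (surname TEXT NOT NULL)")

# Bound parameters are sent as data, never spliced into the SQL text,
# so neither name needs special treatment.
names = ["Null", "O'Neil"]
conn.executemany(
    "INSERT INTO customers (surname) VALUES (?)",
    [(name,) for name in names],
)
stored = [row[0] for row in conn.execute(
    "SELECT surname FROM customers ORDER BY rowid"
)]
```

With bound parameters both surnames come back exactly as typed. The two helper functions show how application code can damage the value before the database ever sees it: the first would turn "Null" into a missing value (which the NOT NULL constraint would then reject), and the second would store "ONeil".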

An application which incorrectly transforms these values into NULL or an empty string can easily create errors if business logic or DB constraints expect something different. This naturally results in exactly the frustrating user experience described in the article.

The article itself includes a link to a Stack Overflow question that demonstrates the problem; it was in a Flex application where the code:

currentChild.appendChild("Fred");

would append an element containing the word Fred to an XML document but the code:

currentChild.appendChild("null");

would append an empty element, not an element containing the text "null".

XML, per se, does not have a problem with the text value "NULL", so including such text would be no problem. In fact, NULL has no special meaning in XML at all.

One thing I don't think I've seen mentioned: We aren't just talking about SQL.

The name might start/end in the database... but getting it there normally involves MULTIPLE channels: database, SQL, PHP, HTML, JavaScript... Java, C#, VB, Perl, Python, Ruby, bash, batch, etc.

Each of these steps in the pipeline might involve converting data from one format into another: from SQL tables to JSON, to XML, to CSV, etc.

At any point in this complicated chain, it takes only one spot of bad programming or one loosely typed language (JavaScript's null handling, for example) to mangle the value.

So don't limit yourself to the problem being in the database... because it can be anywhere in the "stack".
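A small Python sketch of one such hop, using only the standard csv and json modules (the sample rows are invented): CSV has no way to say "missing", so a missing surname and an empty one collapse into the same thing, while JSON keeps the string "Null" and the missing-value token null apart.

```python
import csv
import io
import json

rows = [["Jennifer", "Null"], ["Acme Corp", None], ["Anon", ""]]

# CSV hop: csv.writer writes None as an empty field, so the distinction
# between "missing" and "empty string" is lost on the way back in.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
buffer.seek(0)
from_csv = list(csv.reader(buffer))

# JSON hop: the surname is a quoted string, the missing value is the bare
# token null, and both survive the round trip unchanged.
as_json = json.dumps(rows)
from_json = json.loads(as_json)
```

Any code downstream of the CSV step that tries to "restore" the missing values has to guess, and guessing is where a surname like Null gets hurt.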

Community Wiki answer for the various links originally left as comments on this popular question

Relational databases have to accommodate missing or irrelevant values. For example, the list of customers may include a mobile phone number or a gender. What if a corporation wants to become a customer? What gender should it have?

A special indicator is used to show a value is missing. This special flag is NULL. So sloppy programming may cause the missing value indicator - NULL - to be confused with Jennifer Null's surname, perhaps interpreting her surname as missing data rather than an actual value.
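The database itself keeps the two apart, as this short Python sketch with the standard sqlite3 module suggests (the table and rows are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, surname TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Jennifer", "Null"),   # a real surname that happens to be spelled Null
        ("Acme Corp", None),    # no surname at all: stored as SQL NULL
    ],
)

# IS NULL matches only the missing value.
without_surname = conn.execute(
    "SELECT first_name FROM customers WHERE surname IS NULL"
).fetchall()

# An equality test against the text 'Null' matches only the real surname.
surname_null = conn.execute(
    "SELECT first_name FROM customers WHERE surname = ?", ("Null",)
).fetchall()
```

The confusion therefore has to be introduced by code that converts between the marker and the text, not by the stored data.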

It's a bit like opening a bank account but not depositing any money. The balance is zero. This does not mean you have no balance. It means you have a balance and its value is "0". But sloppy programming could misinterpret the zero as a "does not exist" and determine that you do not have an account at all.
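In Python the same mistake might look like the following sketch, where the accounts dictionary and both function names are hypothetical:

```python
accounts = {"alice": 0, "bob": 250}


def has_account_buggy(name):
    # Bug: a balance of 0 is falsy, so an empty account looks like no account.
    return bool(accounts.get(name))


def has_account(name):
    # Ask whether the entry exists; the balance itself is irrelevant.
    return name in accounts
```

Here has_account_buggy("alice") reports that Alice has no account, even though she has one with a zero balance, while has_account("alice") answers the question that was actually asked.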

There's plenty of poorly written software, and even frameworks, that would behave oddly with somebody named Null.

But let's not forget the human element: people who see the name on screens and printouts may assume it is a mistake and delete the record. That is another likely cause.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange