Question

How can I write a query to find all records in a table that have a null/empty field? I tried the query below, but it doesn't return anything.

SELECT * FROM book WHERE author = 'null';

Solution

null fields don't exist in Cassandra unless you add them yourself.

You might be thinking of the CQL data model, which hides certain implementation details in order to have a more understandable data model. Cassandra is sparse, which means that only data that is used is actually stored. You can visualize this by adding in some test data to Cassandra through CQL.

cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 } ;
cqlsh> use test ;
cqlsh:test> CREATE TABLE foo (name text, age int, pet text, primary key (name)) ;
cqlsh:test> insert into foo (name, age, pet) values ('yves', 81, 'german shepherd') ;
cqlsh:test> insert into foo (name, pet) values ('coco', 'ferret') ;

cqlsh:test> SELECT * FROM foo ;

 name | age  | pet
------+------+-----------------
 coco | null | ferret
 yves |   81 | german shepherd

So even though it appears that there is a null value, no value is actually stored; CQL shows you a null because this is more intuitive.
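The sparse behavior described above can be pictured with a small Python sketch. This is only a toy model, not Cassandra's actual storage engine: the names `SCHEMA`, `stored_rows`, and `select_all` are hypothetical, and each row is modeled as a dict holding only the columns that were written.

```python
# Toy model of sparse storage: each row keeps only the columns that were written.
# Illustration only; this is not how Cassandra is implemented internally.
SCHEMA = ("name", "age", "pet")

stored_rows = {
    "yves": {"age": 81, "pet": "german shepherd"},
    "coco": {"pet": "ferret"},  # no "age" entry is stored at all
}


def select_all(rows, schema=SCHEMA):
    """Mimic SELECT *: emit every declared column, using None where nothing is stored."""
    result = []
    for key in sorted(rows):
        columns = rows[key]
        record = {"name": key}
        for column in schema[1:]:
            record[column] = columns.get(column)  # missing column -> None
        result.append(record)
    return result


for record in select_all(stored_rows):
    print(record)
```

In this model the `None` for coco's age appears only when the row is read back against the schema; nothing is stored for it.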

If you take a look at the table from the Thrift side, you can see that the table contains no such value for coco's age.

$ bin/cassandra-cli
[default@unknown] use test;
[default@test] list foo;
RowKey: coco
=> (name=, value=, timestamp=1389137986090000)
=> (name=pet, value=666572726574, timestamp=1389137986090000)
-------------------
RowKey: yves
=> (name=, value=, timestamp=1389137973402000)
=> (name=age, value=00000051, timestamp=1389137973402000)
=> (name=pet, value=6765726d616e207368657068657264, timestamp=1389137973402000)

Here, you can clearly see that yves has two columns, age and pet, while coco has only one: pet.
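The hex values in the cassandra-cli listing are the raw bytes of each column. As a rough sketch, the two values shown for yves can be decoded in Python, assuming the usual encodings of a CQL int (4 bytes, big-endian, signed) and a CQL text value (UTF-8). The helper names `decode_int` and `decode_text` are hypothetical.

```python
import struct


def decode_int(hex_value):
    """Decode a CQL int printed as hex: 4 bytes, big-endian, signed."""
    return struct.unpack(">i", bytes.fromhex(hex_value))[0]


def decode_text(hex_value):
    """Decode a CQL text value printed as hex: UTF-8 bytes."""
    return bytes.fromhex(hex_value).decode("utf-8")


print(decode_int("00000051"))                         # 81
print(decode_text("6765726d616e207368657068657264"))  # german shepherd
```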

OTHER TIPS

As far as I know you cannot do this with NULL.

As an alternative, you could use a different empty value, for example the empty string: ''

In that case you could select all books with an empty author like this (assuming the author column is appropriately indexed):

SELECT * FROM book WHERE author = '';
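One way to apply the empty-string convention is to normalize values in the application before writing them. The following is a minimal sketch, assuming records are plain dicts; `EMPTY_AUTHOR` and `normalize_book` are hypothetical names and are not part of any Cassandra driver.

```python
EMPTY_AUTHOR = ""  # sentinel stored instead of leaving the column unset


def normalize_book(book):
    """Return a copy of `book` whose author is the sentinel when missing or None."""
    normalized = dict(book)
    if normalized.get("author") is None:
        normalized["author"] = EMPTY_AUTHOR
    return normalized


books = [
    {"title": "Known", "author": "A. Writer"},
    {"title": "Anonymous", "author": None},
    {"title": "Untitled"},
]
normalized = [normalize_book(b) for b in books]

# These are the records a query on author = '' is meant to find.
print([b["title"] for b in normalized if b["author"] == EMPTY_AUTHOR])
```

The trade-off is that every writer has to follow the same convention; records written without it still have no stored author value.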

If your_column_name in your_table has a text data type, then the following should work:

SELECT * FROM your_table WHERE your_column_name >= '' ALLOW FILTERING;

Depending on your use case, you can try a workaround. For example, suppose you have a column, column_a, that holds only positive integer values.

To filter results by null values of this column in that case, you can apply the condition: WHERE column_a < 0

The following works if you are using Solr over Cassandra; I am not sure about a direct Cassandra query.

SELECT * FROM BOOK WHERE solr_query = ' -author : [* TO *] '
Licensed under: CC-BY-SA with attribution