Bad Request: No indexed columns present in by-columns clause with Equal operator : CQL error?

StackOverflow https://stackoverflow.com/questions/19756368

  •  03-07-2022
  •  | 
  •  

I have below table in CQL-

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id)
   );

I inserted couple of records in the above table like this which I will be inserting in our actual use case scenario-

insert into test (employee_id, employee_name, value, last_modified_date) values ('1', 'e27',  'some_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('2', 'e27',  'some_new_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('3', 'e27',  'some_again_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('4', 'e28',  'some_values', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('5', 'e28',  'some_new_values', now());

Now I was doing select query for - give me all the employee_id for employee_name e27.

select employee_id from test where employee_name = 'e27';

And this is the error I am getting -

Bad Request: No indexed columns present in by-columns clause with Equal operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

Is there anything wrong I am doing here?

My use cases are in general -

  1. Give me everything for any of the employee_name?
  2. Give me everything for what has changed in last 5 minutes?
  3. Give me the latest employee_id and value for any of the employee_name?
  4. Give me all the employee_id for any of the employee_name?

I am running Cassandra 1.2.11

有帮助吗?

解决方案

The general rule is simple: "you can only query by columns that are part of the key". As an explanation all other queries would require a complete scan of the tables which might mean a lot of data sifting.

There are things that can modify this rule:

  1. use secondary indexes for columns with low cardinality (more details here)
  2. define multi-column keys (e.g. PRIMARY KEY (col1, col2); which would allow queries like col1 = value1 and col1 = value1 and col2 COND)
  3. use ALLOW FILTERING in queries. This will result in a warning as Cassandra will have to sift through a lot of data and there will be no performance guarantees. For more details see details of ALLOW FILTERING in CQL and this SO thread

其他提示

Cassandra take a little getting used to :) Some of us have been spoiled by some of the extra stuff RDBMS does for you that you do not get for free from noSql.

If you think back on a regular RDBMS table, if you SELECT on a column that has no index, the DB must do a full-table scan to find all the matches you seek. This is a no-no in Cassandra, and it will complain if you try to do this. Imagine if you found 10^32 matches to this query? It is not a reasonable ask.

In your table, you have coded *PRIMARY KEY(employee_id);* this is the row's primary and unique identifying key. You can now SELECT * from TEST where employee_id='123'; this is perfectly reasonable and Cassandra will happily return the result.

However, your SELECT from TEST WHERE employee_name = 'e27'; tells Cassandra to go and read EVERY record until it finds a match on 'e27'. With no index to rely on, it politely asks you to 'forget it'.

If you want to filter on a column, make sure you have an index on that column so that Cassandra can performs the filtering you need.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top