Question

Imagine I have a simple CQL table:

CREATE TABLE test (
    k int PRIMARY KEY,
    v1 text,
    v2 int,
    v3 float
);

There are many cases where one would want to take advantage of Cassandra's schema-less nature and set only some of the values, for example:

INSERT into test (k, v1) VALUES (1, 'something');

When writing an application that writes to such a CQL table in a Cassandra cluster, the need for prepared statements arises immediately, for performance reasons.

Different drivers handle this in different ways. The Java driver, for example, has introduced (with the help of a change to the CQL binary protocol) the ability to use named bound variables, which is very practical: see CASSANDRA-6033.

What I am wondering is what is the correct way, from a binary protocol point of view, to provide values only for a subset of bound variables in a prepared query?

Values are in fact provided to a prepared query by building a values list, as described in:

4.1.4. QUERY 
[...]
Values. In that case, a [short] <n> followed by <n> [bytes]
values are provided. Those value are used for bound variables in
the query.

Please note the definition of [bytes]:

[bytes]        A [int] n, followed by n bytes if n >= 0. If n < 0,
               no byte should follow and the value represented is `null`.

From this description I get the following:

  1. "Values" in QUERY offers no ways to provide a value for a specific column. It is just an ordered list of values. I guess the [short] must correspond to the exact number of bound variables in a prepared query?
  2. All values, no matter what types they are, are represented as [bytes]. If that is true, any interpretation of the [bytes] value is left to the server (conversion to int, short, text,...)?

Assuming I got this all right, I wonder if a 'null' [bytes] value can be used to just 'skip' a bound variable and not assign a value for it.
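To make the framing concrete, here is a minimal Python sketch of the values list as quoted above: a [short] count followed by that many [bytes], where a negative length stands for null. The helper names `encode_bytes` and `encode_values` are hypothetical and not part of any driver, and the sketch assumes [short] is a 2-byte unsigned integer and [int] a 4-byte signed integer, both big-endian.

```python
import struct

def encode_bytes(value):
    """Encode one [bytes]: a 4-byte signed big-endian length, then the payload."""
    if value is None:
        # Negative length: no payload follows and the value represented is null.
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value

def encode_values(values):
    """Encode a values list: a [short] n followed by n [bytes]."""
    out = struct.pack(">H", len(values))
    for v in values:
        out += encode_bytes(v)
    return out

# Bound variables for: INSERT INTO test (k, v1, v2, v3) VALUES (?, ?, ?, ?)
k = struct.pack(">i", 1)          # CQL int as 4 big-endian bytes
v1 = "something".encode("utf-8")  # CQL text as UTF-8 bytes
values_section = encode_values([k, v1, None, None])  # v2 and v3 sent as null
```

The list is purely positional: nothing in it names a column, so the count and order have to match the bound variables of the prepared query.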

I tried this by patching the cpp driver (which is the one I am interested in). Queries execute, but when I perform a SELECT from cqlsh I don't see the 'null' string representation for the empty fields, so I wonder whether this is a hack that just happens not to crash or the intended way to do this.

I am sorry, but I really don't think I can just download the Java driver and see how named bound variables are implemented! :(

---------- EDIT - SOLVED ----------

My assumptions were right, and support for skipping a field in a prepared query has now been added to the cpp driver (see here) by using a null [bytes] value.


Solution 2

What I was trying to achieve has been implemented (see here), based on the principle I described.

Other tips

What I am wondering is what is the correct way, from a binary protocol point of view, to provide values only for a subset of bound variables in a prepared query?

You need to prepare a query that only inserts/updates the subset of columns that you're interested in.
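One way to follow this advice is to generate the statement text from the columns you actually have values for, then prepare one statement per distinct column subset. The sketch below only builds the CQL string; `build_insert` is a hypothetical helper, not a driver API, and it assumes the table and column names are trusted identifiers.

```python
def build_insert(table, columns):
    """Build an INSERT with one positional marker per listed column."""
    if not columns:
        raise ValueError("at least one column is required")
    cols = ", ".join(columns)
    markers = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({cols}) VALUES ({markers})"

# Only k and v1 are known, so only those two columns appear in the query.
query = build_insert("test", ["k", "v1"])
```

An application would typically cache the prepared statement under the column tuple, so each subset is prepared only once.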

"Values" in QUERY offers no ways to provide a value for a specific column. It is just an ordered list of values. I guess the [short] must correspond to the exact number of bound variables in a prepared query?

That's correct. The ordering is determined by the column metadata that Cassandra returns when you prepare a query.

All values, no matter what types they are, are represented as [bytes]. If that is true, any interpretation of the [bytes] value is left to the server (conversion to int, short, text,...)?

That's also correct. The driver uses the returned column metadata to determine how to convert native values (strings, UUIDs, ints, etc.) to a binary (bytes) format. Cassandra does the inverse of this operation server-side.
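As an illustration of that conversion, the following Python sketch serializes values in metadata order for the three types used in the example table. The `SERIALIZERS` table, the `serialize_row` helper and the `(name, type)` metadata shape are assumptions made for this sketch; a real driver exposes its own metadata structures. It assumes the usual CQL encodings: int as 4 big-endian bytes, float as a 4-byte big-endian IEEE 754 value, text as UTF-8.

```python
import struct

# Hypothetical per-type serializers, keyed by CQL type name.
SERIALIZERS = {
    "int": lambda v: struct.pack(">i", v),
    "text": lambda v: v.encode("utf-8"),
    "float": lambda v: struct.pack(">f", v),
}

def serialize_row(metadata, values):
    """Serialize values positionally, using the (name, type) pairs in metadata."""
    if len(metadata) != len(values):
        raise ValueError("one value is required per bound variable")
    return [
        None if v is None else SERIALIZERS[cql_type](v)
        for (_, cql_type), v in zip(metadata, values)
    ]

# Metadata as it might look after preparing: INSERT INTO test (k, v1) VALUES (?, ?)
metadata = [("k", "int"), ("v1", "text")]
payloads = serialize_row(metadata, [1, "something"])
```

Each resulting payload is what goes into one [bytes] slot of the values list, in the same order as the metadata.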

Assuming I got this all right, I wonder if a 'null' [bytes] value can be used to just 'skip' a bound variable and not assign a value for it.

A null column insertion is interpreted as a deletion of that column, so it does not leave an existing value untouched.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow