How to get data from Cassandra every 15 minutes but return only the information that changed?

Question

Well, Cassandra is something like a key/value store, so to make this work you need a sensible row key. You always need the row key when you submit a (column range) query. Neither bundle name nor version is a very good row key, since you would need to know them in advance. Do you have some kind of application categorization or other feature that you could use for partitioning?

For instance, if you had an application type id (commercial, open source, private...) as another field, you could easily create a table where your row key is the application type id and your clustering/column key is a timestamp. Whenever there is a new version, insert it under its application type with the current timestamp. Then do a range query on the timestamp.

    CREATE TABLE Bundles (
        bundle varchar,
        type varchar,
        ts timeuuid,
        version varchar,
        PRIMARY KEY (type, ts)
    );

If you are running this for the first time and want to know all releases so far, you run:

    cqlsh:test> SELECT * FROM Bundles WHERE
        ...        type = 'OSS' and
        ...        ts < maxTimeuuid('2013-08-27 09:00:00');

The result set is empty, since there have been no inserts so far.

Then you (or some other process) insert a new release. Assuming you have a couple of software categories stored in the type field ("Frameworks", "Open Source", or whatever fits your use case), you could insert data like this:

    cqlsh:test> INSERT INTO Bundles (bundle, type, ts, version)
        ...        VALUES ('SomeFramwork', 'OSS', now(), '0.1.0a');

This stores a new column (under the column key value of now()) in the partition for type 'OSS' (type is our sharding key).
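
The timeuuid that now() generates is a version 1 UUID, so it carries the insertion time, and that is what the range queries on ts rely on. As a minimal sketch, a client could recover that time with the JDK alone. The class and method names here are made up for illustration and are not part of Cassandra or Astyanax:

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical helper: reads the time embedded in a version 1 (time-based) UUID.
final class TimeUuids {
    // 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    static Instant toInstant(UUID timeUuid) {
        if (timeUuid.version() != 1) {
            throw new IllegalArgumentException("not a time-based (version 1) UUID");
        }
        // UUID.timestamp() counts 100 ns units since the UUID epoch.
        long hundredNanosSinceUnixEpoch = timeUuid.timestamp() - UUID_EPOCH_OFFSET;
        // Truncate to milliseconds, the precision a CQL timestamp carries.
        return Instant.ofEpochMilli(hundredNanosSinceUnixEpoch / 10_000);
    }
}
```

This does on the client roughly what dateOf(ts) does on the server side.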

Fifteen minutes later, if you want to know all new releases over the last 15 minutes, you run:

    cqlsh:test> SELECT type, dateOf(ts), bundle, version FROM Bundles WHERE
     type = 'OSS' and
     ts > minTimeuuid('2013-08-27 09:00:00')
     and ts < maxTimeuuid('2013-08-27 09:15:00');

     type | dateOf(ts)               | bundle       | version
    ------+--------------------------+--------------+---------
      OSS | 2013-08-27 09:14:27+0200 | SomeFramwork |  0.1.0a

You would need one such query for each type. The TimeUUID type would guarantee that inserts remain collision-free.
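
Since the window bounds and the type change on every poll, it may help to generate the query strings. The following is only a sketch under assumptions: PollWindow and its methods are hypothetical names, the bounds are formatted in UTC with an explicit +0000 offset, and the type is spliced in as a quoted literal rather than bound as a parameter.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Astyanax: builds one window query per type.
final class PollWindow {
    // Timestamp literal with an explicit +0000 offset so the window is in UTC.
    private static final DateTimeFormatter CQL_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssZ").withZone(ZoneOffset.UTC);

    // Same query shape as above, with the type and window bounds filled in.
    static String buildQuery(String type, Instant from, Instant to) {
        String safeType = type.replace("'", "''"); // escape quotes inside the CQL literal
        return "SELECT type, dateOf(ts), bundle, version FROM Bundles"
                + " WHERE type = '" + safeType + "'"
                + " AND ts > minTimeuuid('" + CQL_TS.format(from) + "')"
                + " AND ts < maxTimeuuid('" + CQL_TS.format(to) + "')";
    }

    // One query per type, all covering the window that ends at 'now'.
    static List<String> buildQueries(List<String> types, Instant now, Duration window) {
        Instant from = now.minus(window);
        List<String> queries = new ArrayList<>();
        for (String type : types) {
            queries.add(buildQuery(type, from, now));
        }
        return queries;
    }
}
```

A scheduler would call buildQueries(types, Instant.now(), Duration.ofMinutes(15)) on every run. One caveat with these bounds: a release stamped in exactly the boundary millisecond can show up in two adjacent windows, so either de-duplicate on ts or use minTimeuuid for both bounds.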

If you are worried about rows getting too long (more than 2 billion columns), you could use buckets to limit row length, for example by adding a time bucket such as the day to the row key.
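
As a rough sketch of bucketing, assume a variant of the table with an extra day column and PRIMARY KEY ((type, day), ts), so that each (type, day) pair is its own row. That schema and the DayBuckets helper below are assumptions for illustration, not something the original table has. The client then needs to know which bucket a release belongs to and which buckets a polling window touches:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives per-day bucket keys (UTC) for the assumed 'day' column.
final class DayBuckets {
    // Bucket for a single timestamp, e.g. "2013-08-27".
    static String bucketFor(Instant ts) {
        return ts.atZone(ZoneOffset.UTC).toLocalDate().toString();
    }

    // All buckets a window [from, to] touches; a window across midnight touches two.
    static List<String> bucketsCovering(Instant from, Instant to) {
        List<String> buckets = new ArrayList<>();
        LocalDate day = from.atZone(ZoneOffset.UTC).toLocalDate();
        LocalDate last = to.atZone(ZoneOffset.UTC).toLocalDate();
        while (!day.isAfter(last)) {
            buckets.add(day.toString());
            day = day.plusDays(1);
        }
        return buckets;
    }
}
```

Inserts would write bucketFor(now) into the day column, and each poll would run the range query once per bucket returned by bucketsCovering, adding day = '...' to the WHERE clause. A 15-minute window needs one bucket most of the time and two around midnight.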

To insert in Astyanax using CQL3 queries, you can use

    keyspace.prepareQuery(CF_BUNDLES).withCql(cql).execute();

where cql is your CQL query string and CF_BUNDLES is an instance of ColumnFamily.

To fetch data in Astyanax using the CQL query defined above, you can use

    CqlResult<String, String> result = keyspace
            .prepareQuery(CF_BUNDLES)
            .withCql(cql)
            .execute()
            .getResult();

which enables you to iterate over the resulting rows via result.getRows().