DataStax Java Driver loop repeating rows

https://stackoverflow.com/questions/22393533

14-06-2023

Question

I'm developing a new product using Cassandra as the database. Right now it is installed on a single Ubuntu 13.10 development laptop (Core i7). I have a column family and a query. Executed in cqlsh, this query returns 33267 rows. Executed from my Java program using the DataStax Java driver 2.0, some runs return the correct rows, while others get into an infinite loop, repeating the same rows again and again:

while (!rs.isExhausted()) {
  Row row = rs.one();
  long hora = row.getDate(1).getTime();
  String clave = row.getString(0);
  List<Long> data = row.getList(2, Long.class);
  ordenados.put(hora, new Object[]{clave, data.get(0) / 100000000.0, data.get(1)});
  contador2 +=1;
  if (Math.floor(contador2/1000.0) == contador2/1000.0) {
    System.out.println("sitio "+ contador2+ " "+clave+ " "+hora);
  }
}

When profiling the app, I see lock contention between the New I/O worker threads; 98% of the time is spent in the sun.nio.ch.EPollArrayWrapper.poll method. Has anyone experienced this issue and found a solution? Can someone point me to a link to download cassandra-driver-core-2.0.0.src.jar, so I can debug the error with the sources and report it to DataStax? This is an exciting technology, but it is the first time in my career that a production DB has given me such unreliable behaviour.

By the way, the original query had an ORDER BY that I removed. With the ORDER BY, I got this exception:

Exception in thread "main" com.datastax.driver.core.exceptions.InvalidQueryException: Cannot page queries with both ORDER BY and a IN restriction on the partition key; you must either remove the ORDER BY or the IN and sort client side, or disable paging for this query

Yet yesterday it worked on similar queries, and in cqlsh it works without problems with the ORDER BY added. I mention this only because the two problems may be related.

Regards

Solution

You can get the source from GitHub, in the datastax/java-driver repository. It doesn't look like the source is included in either the Maven or tarball downloads.

I think you are encountering CASSANDRA-6722 when you used IN and ORDER BY in your query. The java-driver automatically does paging with a default fetch size of 5000. You can disable automatic paging with Statement.setFetchSize(Integer.MAX_VALUE). There is more info about automatic paging in this blog post.
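The exception message also offers sorting client side as an alternative to disabling paging. Below is a minimal sketch of that option using only the Java standard library. The `Reading` class and the `sortByTime` method are hypothetical names introduced for illustration, not part of the driver, and the fields simply mirror the `clave` and `hora` values read in the question's loop. It sorts a list instead of inserting into a map keyed by timestamp, on the assumption that rows from different partitions may share the same `hora`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: sorts already-fetched rows on the client,
// so the query itself no longer needs ORDER BY.
final class ClientSideSort {

    // Hypothetical stand-in for the values extracted from one driver Row.
    static final class Reading {
        final String clave;  // key column, as in the question's loop
        final long hora;     // timestamp in milliseconds
        final double value;

        Reading(String clave, long hora, double value) {
            this.clave = clave;
            this.hora = hora;
            this.value = value;
        }
    }

    // Returns a new list ordered by timestamp, then by key.
    // The input list is left unmodified.
    static List<Reading> sortByTime(List<Reading> rows) {
        List<Reading> sorted = new ArrayList<Reading>(rows);
        sorted.sort(Comparator.comparingLong((Reading r) -> r.hora)
                .thenComparing((Reading r) -> r.clave));
        return sorted;
    }

    public static void main(String[] args) {
        // Sample data standing in for rows collected from the ResultSet.
        List<Reading> rows = Arrays.asList(
                new Reading("b", 2000L, 1.5),
                new Reading("a", 1000L, 0.5),
                new Reading("a", 2000L, 2.5));
        for (Reading r : sortByTime(rows)) {
            System.out.println(r.hora + " " + r.clave + " " + r.value);
        }
    }
}
```

In a real program the list would be filled inside the `while (!rs.isExhausted())` loop and sorted once after the loop finishes. This holds all rows in memory, which seems acceptable for a result of about 33267 rows but may not suit much larger results.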

What version of Cassandra is your application connecting to? If you could share more about your table definition and query, it may be possible to reproduce the repeating rows issue.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow