Question

For the past few weeks we have been evaluating different Cassandra clients, and it now looks like we will go forward with the Netflix/Astyanax client.

We are trying to optimize our Cassandra database mainly for read performance. Currently, I am creating the Astyanax connection like this:

/**
 * Creates the Cassandra connection using the Astyanax client.
 */
private CassandraAstyanaxConnection() {

    context = new AstyanaxContext.Builder()
    .forCluster(ModelConstants.CLUSTER)
    .forKeyspace(ModelConstants.KEYSPACE)
    // All settings go on ONE AstyanaxConfigurationImpl: calling
    // withAstyanaxConfiguration() a second time replaces the first
    // configuration, which would drop the discovery type.
    .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
        .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
        .setCqlVersion("3.0.0")
        .setTargetCassandraVersion("1.2")
    )
    .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
        .setPort(9160)
        .setMaxConnsPerHost(40)
        .setSeeds("node1:9160,node2:9160,node3:9160,node4:9160")
    )
    .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
    .buildKeyspace(ThriftFamilyFactory.getInstance());

    context.start();
    keyspace = context.getEntity();

    // Row keys and column names are both strings.
    emp_cf = ColumnFamily.newColumnFamily(
        ModelConstants.COLUMN_FAMILY,
        StringSerializer.get(),
        StringSerializer.get());
}

Problem Statement:-

By default, I believe the Astyanax client uses ROUND_ROBIN as its ConnectionPoolType.

Now I am trying to understand which of the options below is better from a read-performance point of view:

TOKEN_AWARE or ROUND_ROBIN or BAG

What is the difference between these three, and how do we decide which one to use?

Some background about our cluster: we are going to have a single cross-colo cluster with 24 nodes, meaning 12 nodes in the SLC colo and 12 nodes in the PHX colo.

We are going to use NetworkTopologyStrategy with a replication factor of 4, meaning 2 replicas in each colo. We will be using LeveledCompactionStrategy.

Any explanation would be of great help. A lot of people will be using the Astyanax client in production, so any feedback is appreciated.

Thanks for the help.

Update:-

I am still looking for an answer that explains the main difference between these three with an example, so that I can understand it better. I know what they mean in general, but I cannot see from an example how each one works out in practice.


Solution

ROUND_ROBIN

With this ConnectionPoolType, connections are handed out in round-robin order across the full set of hosts, regardless of which host owns the requested row key.
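To make the rotation concrete, here is a minimal, self-contained sketch of round-robin host selection. It is a toy model, not Astyanax source code: the class names `RoundRobinSelector` and `RoundRobinDemo`, the host names, and the row key `emp-42` are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of round-robin host selection; NOT Astyanax source code.
class RoundRobinSelector {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<String> hosts) {
        this.hosts = new ArrayList<>(hosts);
    }

    // The row key is deliberately ignored: every host simply takes its turn.
    String pick(String rowKey) {
        int index = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }
}

class RoundRobinDemo {
    public static void main(String[] args) {
        RoundRobinSelector selector =
            new RoundRobinSelector(Arrays.asList("node1", "node2", "node3", "node4"));
        // Six reads of the same (hypothetical) row key rotate through all hosts.
        for (int i = 0; i < 6; i++) {
            System.out.println(selector.pick("emp-42"));
        }
    }
}
```

Running `RoundRobinDemo` prints node1, node2, node3, node4, node1, node2. The point to notice is that the same row key is sent to every node in turn, so a read can land on a node that holds no replica of that row; that node then has to act as coordinator and forward the request.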

TOKEN_AWARE

This is somewhat similar to ROUND_ROBIN, but it sets up a basic token-aware pool: the client tracks which hosts own which token range and round-robins only across the hosts within the token range that owns the row key.
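The token-aware idea can be sketched the same way. This is a simplified toy model under loud assumptions: the ring is just a list of four made-up hosts, `String.hashCode()` stands in for Cassandra's partitioner, and the replicas are taken to be the owning host plus the next host on the ring. The names `TokenAwareSelector` and `TokenAwareDemo` are hypothetical, and none of this is Astyanax source code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of token-aware host selection; NOT Astyanax source code.
class TokenAwareSelector {
    private final List<String> ring;          // hosts in ring order
    private final int replicationFactor;
    private final AtomicInteger counter = new AtomicInteger();

    TokenAwareSelector(List<String> ring, int replicationFactor) {
        this.ring = new ArrayList<>(ring);
        this.replicationFactor = replicationFactor;
    }

    // Assumption: hashCode() stands in for the real partitioner, and the
    // replicas are the owning host plus the following hosts on the ring.
    List<String> replicasFor(String rowKey) {
        int primary = Math.floorMod(rowKey.hashCode(), ring.size());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            replicas.add(ring.get((primary + i) % ring.size()));
        }
        return replicas;
    }

    // Round-robin, but only across the hosts that hold the row.
    String pick(String rowKey) {
        List<String> replicas = replicasFor(rowKey);
        int index = Math.floorMod(counter.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }
}

class TokenAwareDemo {
    public static void main(String[] args) {
        TokenAwareSelector selector = new TokenAwareSelector(
            Arrays.asList("node1", "node2", "node3", "node4"), 2);
        System.out.println("replicas: " + selector.replicasFor("emp-42"));
        // Every read of this key goes to one of its two replicas only.
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.pick("emp-42"));
        }
    }
}
```

Compared with plain round robin, reads for a given row key only ever reach hosts that hold that row, so the extra hop through a non-replica coordinator is avoided. That is why a token-aware pool is usually the better fit when read latency is the main concern.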

BAG

I don't know much about this type, but I would guess that connections are drawn at random from a single bag of hosts, independent of token range or round-robin order.
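If that guess is right, the behaviour would look roughly like the sketch below. It is a toy model of "pick any host at random" and an assumption about how BAG behaves, not Astyanax source code; `BagSelector`, `BagDemo`, the host names, and the seed are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Toy model of random ("bag") host selection; NOT Astyanax source code.
class BagSelector {
    private final List<String> hosts;
    private final Random random;

    // A fixed seed is used only to make the demo repeatable.
    BagSelector(List<String> hosts, long seed) {
        this.hosts = new ArrayList<>(hosts);
        this.random = new Random(seed);
    }

    // Neither the row key nor any rotation order influences the choice.
    String pick(String rowKey) {
        return hosts.get(random.nextInt(hosts.size()));
    }
}

class BagDemo {
    public static void main(String[] args) {
        BagSelector selector = new BagSelector(
            Arrays.asList("node1", "node2", "node3", "node4"), 7L);
        for (int i = 0; i < 6; i++) {
            System.out.println(selector.pick("emp-42"));
        }
    }
}
```

Like round robin, this can send a read to a host that holds no replica of the row. In Astyanax itself the pool type is selected on the configuration object, for example with `setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE)` on `AstyanaxConfigurationImpl`.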

Licensed under: CC-BY-SA with attribution