Question

We are currently running stress tests with the ab tool. Single inserts are doing fine in Cassandra. However, when it comes to batch inserts, I'm getting a Java out-of-memory error: Java heap space.

I have a VirtualBox machine running Ubuntu Server 13.04 with 2 GB of memory.

I don't know much about Cassandra's internal configuration.

I'm just making a batch insert of size 100 (100 inserts in a single BATCH).

After I see this error, I have no cqlsh access and no nodetool access for almost an hour.

How can I fix this error under heavy load?

NOTE: It doesn't happen on single inserts via HTTP POST requests.

NOTE: In my column family I have a key of type TimeUUIDType, and the column values are ints and varchars.

UPDATE: Test results show that nothing went wrong before 6000 requests. However, at around 7000 requests the PHP code throws the following:

Error connecting to 127.0.0.1: Thrift\Exception\TTransportException: TSocket: timed out reading 4 bytes from 127.0.0.1:9160

Moreover, Cassandra logs the following under heavy load:

WARN [ScheduledTasks:1] 2013-06-28 03:43:07,931 GCInspector.java (line 142) 
Heap is 0.9231763795560355 full.  You may need to reduce memtable and/or cache sizes.
Cassandra will now flush up to the two largest memtables to free up memory.  Adjust
flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to 
do this automatically

Solution

The batch doesn't sound like a large enough dataset to cause the memory problem, so this sounds like a problem with the JVM on the virtual machine. How much memory have you allocated to it?

You can check by starting JConsole (just type jconsole in the terminal / prompt) and viewing the 'Memory' tab, specifically the value under Max:

JVM Memory Stats


You can also get some solid details about what caused the crash thanks to the -XX:+HeapDumpOnOutOfMemoryError parameter included in C*'s startup script; it produces a heap dump file capturing the state of the heap at the moment the OutOfMemoryError was thrown, which you can analyze to see what was consuming the memory.

Typically the heap size is calculated automatically by the calculate_heap_sizes() function in cassandra-env.sh. You can, however, override the number that function generates by setting MAX_HEAP_SIZE to a different value. The same variable is used on lines 174 & 175 of cassandra-env.sh (JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}" and JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}") to set both the minimum and the maximum heap size, so the heap is pinned at a single fixed size.
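To make the automatic sizing concrete, here is a hedged sketch of the heuristic calculate_heap_sizes() applies in Cassandra 1.2-era cassandra-env.sh: the heap becomes the larger of min(half of RAM, 1024 MB) and min(a quarter of RAM, 8192 MB). The 2048 MB figure below is the questioner's 2 GB VirtualBox guest, not a recommended value:

```shell
# Sketch of the calculate_heap_sizes() heuristic from cassandra-env.sh:
# MAX_HEAP_SIZE = max( min(RAM/2, 1024 MB), min(RAM/4, 8192 MB) )
system_memory_in_mb=2048          # the 2 GB VM from the question

half_system_memory_in_mb=$((system_memory_in_mb / 2))
quarter_system_memory_in_mb=$((half_system_memory_in_mb / 2))

if [ "$half_system_memory_in_mb" -gt 1024 ]; then
    half_system_memory_in_mb=1024
fi
if [ "$quarter_system_memory_in_mb" -gt 8192 ]; then
    quarter_system_memory_in_mb=8192
fi

if [ "$half_system_memory_in_mb" -gt "$quarter_system_memory_in_mb" ]; then
    max_heap_size_in_mb=$half_system_memory_in_mb
else
    max_heap_size_in_mb=$quarter_system_memory_in_mb
fi
echo "Auto-calculated MAX_HEAP_SIZE: ${max_heap_size_in_mb}M"   # 1024M on a 2 GB box
```

So on this 2 GB machine Cassandra gets roughly a 1 GB heap, which leaves little headroom for memtables under a batch-insert load. To override, set both MAX_HEAP_SIZE and HEAP_NEWSIZE in cassandra-env.sh (the script warns if you set only one), e.g. MAX_HEAP_SIZE="1G" and HEAP_NEWSIZE="256M".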

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow