Question

I'm testing performance cassandra vs mysql and I don't understand why mysql is faster than cassandra in the same ubuntu box. Is this ussual? Notice: cassandra is running in a single node.

Here my tests and times:

Times:

cc@cc-cc:~/x$ time python insert_mysql.py 

real    0m21.543s
user    0m6.688s
sys 0m2.016s
cc@cc-cc:~/x$ time python insert_cassandra.py 

real    1m15.157s
user    0m14.293s
sys 0m4.108s
cc@cc-cc:~/x$ 

Cassandra script:

#!/usr/bin/python

import cql
import random

db = cql.connect('localhost', 9160,  'mykeyspace', cql_version='3.0.0')
cursor = db.cursor()
i = 1
while(i < 200001):
    producte = random.randint( 0,100 )
    color = random.randint( 0,100 )
    preu = random.randint( 100,1000 )

    cursor.execute('''INSERT INTO pintura (id, producte, color, preu)
                      VALUES (%s, %s, %s, %s)''' % (i, producte, color, preu))
    i += 1

# Commit your changes in the database
db.commit()

# disconnect from server
db.close()

Mysql script:

#!/usr/bin/python

import MySQLdb
import random

# Open database connection
db = MySQLdb.connect("localhost","root","pass","test" )

# prepare a cursor object using cursor() method
cursor = db.cursor()

#SQL query to INSERT a record into the table prova.
i = 1

while(i < 200001):
    producte = random.randint( 0,100 )
    color = random.randint( 0,100 )
    preu = random.randint( 100,1000 )

    cursor.execute('''INSERT INTO pintura (id, producte, color, preu) 
                      VALUES (%s, %s, %s, %s)''' % (i, producte, color, preu))
    i += 1

# Commit your changes in the database
db.commit()

# disconnect from server
db.close()

DDL MySQL:

mysql> create table pintura (
    -> id int primary key,
    -> producte int,
    -> color int,
    -> preu int);
Query OK, 0 rows affected (0.24 sec)

mysql> create index i2 on pintura(producte);
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index i1 on pintura(color);
Query OK, 0 rows affected (0.07 sec)
Records: 0  Duplicates: 0  Warnings: 0

DDL for cassandra:

cqlsh:mykeyspace> create table pintura (
              ... id int primary key,
              ... producte int,
              ... color int,
              ... preu int);
cqlsh:mykeyspace> create index i1 on pintura (producte);
cqlsh:mykeyspace> create index i2 on pintura (color);
Était-ce utile?

La solution

Benchmarking a system is difficult and benchmarking a distributed system is extra difficult. You are most likely running into a combination of the following problems

  • Blocking single threaded requests
  • Global Interpreter Lock Issues
  • Not Saturating the Database

More Detailed Information

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top