NoSQL DB performance testing

https://stackoverflow.com/questions/9029871

14-11-2019

Question

Let's assume you've got a NoSQL database: Redis, Cassandra, or MongoDB. You need to check the overall performance of this database across different platforms, operating systems, and even the programming languages used for testing. It is not tied to a particular application or schema.

What tests would you want to see? Can you help me shape the requirements?
- How does the database work in a cluster?
- In a broken cluster?
- In a cloud env?
- How does it perform queries when 10K connections are open?
- What tools would you use?
  - Is it something like JMeter -> HTTP server -> database?
  - JMeter -> TCP app -> database?
  - Something else?

All the material I have found on database performance testing treats the database as part of some product (a particular schema, a specific env). Have you thought about performance testing a database when the database is the product itself?

Looking forward to your help.
-Vova

Solution

In "NoSQL benchmarks and performance evaluations" I've put together a list of benchmarks that are sound in the sense that they clearly define their purpose and compare similar features (apples-to-apples comparisons); far too many benchmarks out there fail at least one of these fundamental requirements. Going through them, you'll be able to extract the parts that are relevant to your own benchmark, learn which tools were used, and pick up some benchmarking code too.

So far the most generic NoSQL benchmark is YCSB (Yahoo! Cloud Serving Benchmark). Recently the Cubrid blog posted the results of running this benchmark against some of the most popular NoSQL solutions, which might give you an idea of how to interpret results.
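To make the idea of a generic, schema-independent benchmark concrete, here is a minimal sketch in the spirit of YCSB, not YCSB itself. It loads a key space, issues a mix of reads and updates with a skewed key distribution (a 50/50 mix is similar to YCSB's core workload A), and reports throughput and latency percentiles. The `store` argument is an assumption: any object with `get` and item assignment, so a plain `dict` stands in for a real database client here. The key names, value sizes, and skew value are arbitrary illustrative choices.

```python
import math
import random
import time
from bisect import bisect_left
from itertools import accumulate


def zipf_sampler(n_keys, skew, rng):
    """Return a function drawing key indices 0..n_keys-1 with Zipf-like skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, n_keys + 1)]
    cumulative = list(accumulate(weights))
    total = cumulative[-1]

    def draw():
        # Inverse-CDF sampling over the cumulative weights.
        return min(bisect_left(cumulative, rng.random() * total), n_keys - 1)

    return draw


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]


def run_workload(store, n_keys=1000, n_ops=10_000, read_fraction=0.5,
                 skew=0.99, seed=42):
    """Load n_keys records, then run n_ops reads/updates and time each one."""
    rng = random.Random(seed)
    draw_key = zipf_sampler(n_keys, skew, rng)

    # Load phase: populate the store before measuring anything.
    for i in range(n_keys):
        store[f"user{i}"] = "x" * 100

    latencies = {"read": [], "update": []}
    started = time.perf_counter()
    for _ in range(n_ops):
        key = f"user{draw_key()}"
        op = "read" if rng.random() < read_fraction else "update"
        t0 = time.perf_counter()
        if op == "read":
            store.get(key)
        else:
            store[key] = "y" * 100
        latencies[op].append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started

    report = {"ops_per_sec": n_ops / elapsed}
    for op, values in latencies.items():
        if values:
            values.sort()
            report[op] = {
                "count": len(values),
                "p50": percentile(values, 50),
                "p99": percentile(values, 99),
            }
    return report


if __name__ == "__main__":
    # A dict is only a stand-in; wrap a real client in a dict-like adapter.
    print(run_workload({}))
```

To point this at Redis, Cassandra, or MongoDB you would write a small adapter exposing `get` and item assignment on top of the respective driver; the measurement loop stays the same, which is what keeps the comparison apples-to-apples.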

Other tips

"check the overall performance for this database"

Unless you are doing it for fun, or just want a benchmark for the sake of having a benchmark, I would recommend tailoring a performance benchmark to the actual problem/requirements.

For example, do you really need crazy fast writes? Are you OK with losing data? Do you mind spending time on configuring failover? Do you plan to scale up or out? Are you planning for TBs of data? Etc.
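One way to make "tailor the benchmark to the requirements" actionable is to write the answers to those questions down and derive the list of measurements from them. The sketch below is purely illustrative: the `Requirements` fields, the `benchmark_plan` function, and the wording of each plan item are hypothetical names introduced here, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Answers to the questions above (hypothetical field names)."""
    write_heavy: bool
    can_lose_data: bool
    needs_failover: bool
    scale_out: bool
    dataset_gb: int


def benchmark_plan(req, ram_gb_per_node):
    """Translate requirements into a list of things worth measuring."""
    plan = []
    if req.write_heavy:
        plan.append("sustained write throughput and write latency percentiles")
    else:
        plan.append("read latency percentiles under the expected read/write mix")
    if not req.can_lose_data:
        plan.append("durable write settings; check that no acknowledged "
                    "write is lost after killing a node")
    if req.needs_failover:
        plan.append("recovery time and error rate while a node is down")
    if req.scale_out:
        plan.append("throughput before and after adding a node, "
                    "plus time to rebalance")
    if req.dataset_gb > ram_gb_per_node:
        plan.append("runs with a dataset larger than RAM")
    return plan


if __name__ == "__main__":
    req = Requirements(write_heavy=True, can_lose_data=False,
                       needs_failover=True, scale_out=True, dataset_gb=2000)
    for item in benchmark_plan(req, ram_gb_per_node=64):
        print("-", item)
```

The point of the exercise is that two teams with different answers end up with different benchmark plans, even for the same database.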

The examples you gave (Redis, Cassandra and MongoDB) are quite different:

Redis is mostly a cache, and it is really fast, but being just a cache it would not help you much with medium-complexity aggregation. However, in my opinion it is currently the best cache out there. "Redis + a killer DB" is an ideal combination. It also ships with a built-in benchmark tool, redis-benchmark, that you can try.
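As a starting point for the built-in tool, the sketch below assembles a `redis-benchmark` command line from Python, which makes it easy to sweep over the number of parallel connections (for instance towards the 10K mentioned in the question). The helper names `redis_benchmark_cmd` and `run_benchmark` and the default values are assumptions made for illustration; the flags used (`-h`, `-p`, `-c`, `-n`, `-t`, `-q`) are standard `redis-benchmark` options for host, port, parallel clients, total requests, test selection, and quiet output.

```python
import shutil
import subprocess


def redis_benchmark_cmd(host="127.0.0.1", port=6379, clients=50,
                        requests=100_000, tests=("set", "get")):
    """Build the argument list for a redis-benchmark run."""
    return [
        "redis-benchmark",
        "-h", host,
        "-p", str(port),
        "-c", str(clients),       # parallel connections
        "-n", str(requests),      # total number of requests
        "-t", ",".join(tests),    # only run the listed tests
        "-q",                     # quiet: print requests-per-second only
    ]


def run_benchmark(cmd, timeout_s=300):
    """Run the command if the binary is on PATH; return its output or None."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=False, timeout=timeout_s)
    return result.stdout or result.stderr


if __name__ == "__main__":
    # Print the commands for a sweep over connection counts.
    for clients in (50, 1000, 10_000):
        print(" ".join(redis_benchmark_cmd(clients=clients)))
```

Calling `run_benchmark(redis_benchmark_cmd(clients=1000))` against a running server returns the tool's textual summary. Keep in mind that very high client counts may require raising the open-file limit on both the client and the server host.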

Cassandra is a solid product modelled after Google Bigtable (but I am sure you already know that). It scales writes well if you have lots of nodes, but if you reach TBs of data, for example, it can take days to add nodes. It is also not the simplest one to grasp. But if you are OK to pay, there are excellent guys from DataStax who can take all the complexity away. I have a very simple Cassandra Bombardier that may help you to start off.

MongoDB is a great DB for multiple reasons: a very sexy and simple query language, good documentation, a huge community, etc. It is not so great in other aspects: you need to spend time sharding it correctly, and then resharding it again (compare to e.g. Riak, where this is done automatically). It is very fast (for writes) if the data, not just the index, fits in RAM, and it starts to slow down very quickly if it does not. There is ongoing speculation that you may lose data (from one of the Basho engineers: "I had personally spent some time finding out ways to demonstrate that MongoDB will lose writes in the face of failure"), and aggregation queries may take a while even on a not-so-large dataset. I have a Mongo Performance Playground that you may find useful.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow