I think every test-oriented developer faces this dilemma when working with Riak. As Christian has mentioned, there is no concept of rollbacks in Riak. And there is no single "truncate database" command that you can issue.
You have 3 approaches available to you:
Clear all the data on your test cluster. This essentially means issuing shell commands (assuming your test server is running on the same machine as your test suite). If you're using a Memory backend, this means issuing
riak restart
between each test. For other backends, you'd have to stop the node and delete the whole data directory and start it again:riak stop && rm -rf <...>/data/* && riak start
. PROS: Wipes the cluster data clean between each test. CONS: This is slow (when you take into account shutdown and restart times), and issuing shell commands from your test suite is often awkward. (Sidenote: while it may be slow to do between each test, you can certainly feel free to clear the data directory before each run of the whole test suite.)Loop through all the buckets and keys and delete them, on your test cluster, as you've suggested above. PROS: Simple to understand and implement. CONS: Also slow (to run between each test).
Have each test clean up after itself. So, if your test creates a User object, make sure to issue a DELETE command for that object at the end of the test. Optionally, test that a user doesn't exist initially, before creating one. (To make doubly sure that the previous test cleaned up). PROS: Simple to understand and implement. Fast (definitely faster than looping through all the buckets and keys between each test). CONS: Easy for developers to forget to clean up after each insert.
After having debated these approaches, I've settled on using #3 (combined, frequently, with wiping the test server data directory before each test suite run).
Some thoughts on mitigating the CONS of the 'each test cleans up after itself, manually' approach:
Use a testing framework that runs tests in random order. Many frameworks, like Ruby's Minitest, do this out of the box. This often helps catch tests that depend on other tests conveniently forgetting to clean up
Periodically examine your test cluster (via a list buckets) after the tests run, to make sure there's nothing left. In fact, you can do this programmatically at the end of each test suite (something as simple as doing a bucket list and making sure it's empty).
(This is good testing practice in general, but especially relevant with Riak) Write less tests that hit the database. Maintain strict division between Unit Tests (that test object state and behavior without hitting the db) and Integration or Functional Tests (that do hit the db). Make sure there's a lot more of the former than the latter. To put it in other words -- you don't have to test that the database works, with each unit test. Trust it (though obviously, verify, during the integration tests).
For example, if you're using Riak with Ruby on Rails, and you're testing your models, don't call test_user.save!
to verify that a user instance is valid (like I once did, when first getting started). You can simply test for test_user.valid?
, and understand that the call to save will work (or fail) accordingly, during actual use. Consider using Mockist-style testing, which verifies whether or not a save!
function was actually invoked, instead of actually saving to the db and then reading back. And so on.