Is it good to create virtual machines(nodes) to get better performance on cassandra?

Question 1

"I know it should be tested, but I want to know basically is it reasonable or not?"

That will answer most of your assumptions.

The basic advantage of using cassandra is availability. If you are planning to have just two dedicated servers, then there is a question mark on your availability of data. Considering the worst case, you always have just two replicas of data at any point of time.

My take is to go for a nicely split dedicated set up in small chunks. Everything boils down to your use case.

1.If you have a lot of data flowing in and if you consider data as king(in such a case , you need more replicas to handle in case of failures), i would prefer a high end distributed set up.

2.If you are looking for the other way around(data is not your forte and your data is just another part of your set up), you shall just go for the set up what you have mentioned.

3.If you have a cost constraint and if you are a start up with a minimal data that is important to you, set up in two nodes what you have with replication of 2(Simple Strategy ) and replication of 1(Network Topology)

Question 2

Adding new nodes always adds some overhead, they need to communicate within each other and sync their data. Therefore, the more nodes you add, you'd expect the overhead to increase with adding each node. You'd add more nodes only in a situation where the existing number of nodes can't handle the input/output demands. Since in the situation you are describing , you'd be actually writing on the same disk, you'd actually effectively be slowing down your cluster by adding more nodes.

Imagine the situation: you have a server, it receives some data and then writes it on disk. Now imagine the same situation, where the disk is shared between two servers and they both write the same information at the almost same time on the same disk. The two servers also use cpu cycles to communicate between each other that the data has been written so they can sync up. I think this is a sufficient enough information to describe to you why what you are thinking is not a good idea if you can avoid it.

EDIT: Of course, this is the information only in layman's terms, C* has a very nice architecture in which data is actually spread according to an algorithm to a certain range of nodes (not all of them) and when you are querying for a specific key, the algorithm actually can tell you where to find the data. With that said, when you add and remove nodes, the new nodes have to communicate with the cluster that they want to share 'the burden' and as a result, a recalculation of what is known as a 'token-ring' takes place at the end of which data may be shuffled around so it is accessible in a predictable way.

You can take a look at this:

http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes-2

But in general, there is indeed some overhead when nodes communicate with each other, but the number of the nodes would almost never negatively or positively impact your query speed dramatically if you are querying for a single key.