I am assuming you are using a text model such as bag-of-words.
From my experiments, you usually don't get good results when moving from one domain to another (even if the training set and the test set both cover products, just from different categories!).
Think about it logically: an oven that gets hot quickly usually indicates a good product. Is the same true for laptops?
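To make the oven/laptop point concrete, here is a minimal sketch, assuming a plain bag-of-words representation and a very naive word-score classifier. The reviews are made-up toy examples and the function names are hypothetical; this is only an illustration, not the setup used in the experiments mentioned here.

```python
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())

def train_word_scores(labeled_reviews):
    """Give each word +1 per occurrence in a positive review, -1 in a negative one."""
    scores = Counter()
    for text, label in labeled_reviews:
        sign = 1 if label == "positive" else -1
        for word, count in bag_of_words(text).items():
            scores[word] += sign * count
    return scores

def predict(scores, text):
    """Sum the learned word scores; unseen words contribute 0."""
    total = sum(scores[word] * count
                for word, count in bag_of_words(text).items())
    return "positive" if total > 0 else "negative"

# Toy oven reviews, invented for illustration only.
oven_reviews = [
    ("gets hot quickly", "positive"),
    ("heats slowly and unevenly", "negative"),
]
scores = train_word_scores(oven_reviews)

# The same words in a laptop review are a complaint, but the model
# trained on ovens has no way of knowing that.
laptop_prediction = predict(scores, "gets hot quickly")
print(laptop_prediction)
```

The oven-trained scores label the laptop complaint as positive, because "hot" and "quickly" only ever appeared in a positive oven review.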
When I experimented with this a few years ago, I used Amazon comments both as the training set and to test my algorithms.
The comments are short and informative, and were enough to get ~80% accuracy. The 'ground truth' was the star rating, where 1-2 stars were 'negative', 3 stars 'neutral', and 4-5 stars 'positive'.
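A minimal sketch of that star-to-label mapping, assuming integer ratings from 1 to 5 (the function name `stars_to_label` is made up for illustration):

```python
def stars_to_label(stars):
    """Map an Amazon-style star rating (1-5) to a sentiment label."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"expected an integer rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

# Example: label a handful of ratings.
labels = [stars_to_label(s) for s in (1, 2, 3, 4, 5)]
print(labels)
```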
I used a Perl script from esuli.it to crawl Amazon's comments.