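SIFT and SURF are local features: they describe small patches around keypoints rather than the object as a whole. If the training dataset is large enough, the results should be good. The locality helps here, because even across different vehicles only certain structures of a vehicle are usually visible in a scene image.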
However, false positives might creep in if a similar non-vehicle structure is present (e.g. a distorted image of a small rectangular house with a fence). So a global cue such as road detection might increase accuracy.
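To illustrate the road-detection idea, here is a minimal sketch of one way it could be used: drop candidate detections that barely overlap a road mask. Everything in it is an assumption made for illustration. The names `road_overlap` and `filter_by_road`, the `(x, y, w, h)` box format, the binary mask and the 0.5 threshold are hypothetical, and the sketch assumes a road mask has already been produced by some other step.

```python
def road_overlap(box, road_mask):
    """Fraction of the box's pixels that lie on road (mask value 1).

    box is (x, y, w, h) in pixel coordinates and is assumed to lie
    fully inside the mask; road_mask is a list of rows of 0/1 values.
    """
    x, y, w, h = box
    rows = road_mask[y:y + h]
    on_road = sum(sum(row[x:x + w]) for row in rows)
    total = w * h
    return on_road / total if total else 0.0


def filter_by_road(boxes, road_mask, min_overlap=0.5):
    """Keep only candidate boxes that overlap the road enough."""
    return [b for b in boxes if road_overlap(b, road_mask) >= min_overlap]


# Toy 6x8 mask (hypothetical): columns 2..5 are road, the rest is not.
road_mask = [[1 if 2 <= c <= 5 else 0 for c in range(8)] for _ in range(6)]

# Hypothetical candidate detections from the local-feature classifier.
candidates = [
    (2, 1, 3, 2),  # fully on the road
    (6, 0, 2, 2),  # fully off the road, e.g. a house next to it
    (5, 3, 2, 2),  # half on the road
]

kept = filter_by_road(candidates, road_mask)
print(kept)
```

The threshold is a trade-off: a higher value rejects more house-like false positives beside the road, but can also discard vehicles parked on the verge.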
So I think SIFT/SURF features of vehicles with a one-class (single-class) SVM should help, provided structures that cause such false positives do not appear in the images of your application.
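As a rough sketch of the one-class SVM part, the snippet below trains scikit-learn's `OneClassSVM` on "vehicle" feature vectors only and then checks whether new vectors look like vehicles. The vectors here are synthetic stand-ins, not real SIFT/SURF output: in practice each row would be a fixed-length summary of an image patch's descriptors (for example a bag-of-visual-words histogram), and the dimension 16, `nu=0.1` and the RBF kernel are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for per-patch vehicle feature vectors
# (assumption: 200 training patches, 16 dimensions each).
vehicle_features = rng.normal(loc=0.0, scale=1.0, size=(200, 16))

# Train on vehicle examples only; nu bounds the fraction of training
# samples that may be treated as outliers.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
model.fit(vehicle_features)

# A vector close to the training data and one far away from it.
vehicle_like = np.zeros((1, 16))
non_vehicle = np.full((1, 16), 8.0)

# predict() returns +1 for inliers (vehicle-like) and -1 for outliers.
pred_vehicle_like = int(model.predict(vehicle_like)[0])
pred_non_vehicle = int(model.predict(non_vehicle)[0])
print(pred_vehicle_like, pred_non_vehicle)
```

Because the model never sees negative examples, anything that resembles the training vectors is accepted, which is exactly why vehicle-like background structures remain a risk with this setup.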