I am hoping someone can tell me whether I'm on the right track. I'm trying to learn about image retrieval and SVMs, but it's all a little confusing. I'll ask my questions alongside the source code.
First I have a dataset of cats. For every cat picture I get the descriptors using the SIFT algorithm (vlfeat). I stack all the descriptors (from every picture) together into one list and find the clusters (of all descriptors) using k-means (I chose k=3 by trying it out and plotting the result).
Question 1: Is there a "terminal" way to see whether I chose a good k? Plotting a 128-dimensional descriptor set of 50 cat pictures takes a long time.
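For context, this is roughly what I mean by a terminal-only check: scipy's `kmeans` returns the mean distortion as its second value, so printing it for several k gives a text-based "elbow" comparison without any plotting. The descriptor matrix here is just random stand-in data, not my real SIFT output:

```python
import numpy as np
from scipy.cluster.vq import kmeans

# Random stand-in for a stacked (n_descriptors x 128) SIFT matrix.
rng = np.random.default_rng(0)
desc = rng.random((500, 128))

# scipy's kmeans returns (codebook, distortion); a dropping-then-flattening
# distortion curve hints at a reasonable k.
distortions = {}
for k in (2, 3, 5, 10):
    _, distortions[k] = kmeans(desc, k)
    print(k, distortions[k])
```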
Question 2: I'm doing list.append(hstack((loc,des))) with the locations and the descriptors. Is this the right way, or should I take only the descriptors?
import numpy
from scipy.cluster.vq import kmeans

def get_features(datas):
    features = []  # renamed: `list` shadows the built-in type
    for data in datas:
        loc, des = vlfeat_module.vlf_create_desc(data, 'tmp.sift')
        # question 2: currently stacking locations AND descriptors
        features.append(numpy.hstack((loc, des)))
    desc = numpy.vstack(features)
    centers, _ = kmeans(desc, 3)
    return centers
After getting the centers I write the 3 x 128-dimensional centers to a *.sparse file that looks like this:
1 1:333.756498151 2:241.935029943...
1 1:806.715774779 2:1134.68287451...
....
After this procedure with the cat pictures I repeat it with non-cat pictures and get a *.sparse file that looks like this:
0 1:101.905620535 2:250.9213760...
0 1:223.619957204 2:509.303625427...
...
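For reference, sklearn can write this file format directly instead of building the lines by hand; a minimal sketch, where `cat_rows` and `noncat_rows` are hypothetical stand-ins for the two 3 x 128 center matrices:

```python
import numpy as np
from sklearn.datasets import dump_svmlight_file

# Hypothetical stand-ins for the 3 x 128 center matrices of each class.
rng = np.random.default_rng(0)
cat_rows = rng.random((3, 128))
noncat_rows = rng.random((3, 128))

X = np.vstack((cat_rows, noncat_rows))
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = cat, 0 = non-cat

# Writes the same "label idx:value ..." lines as the hand-built files
# (zero_based=False makes the feature indices start at 1).
dump_svmlight_file(X, y, "cats_nonecats.sparse", zero_based=False)
```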
I put both *.sparse files together and started training the SVM (at least I think I started ^^):
from sklearn import svm
from sklearn.datasets import load_svmlight_file
from sklearn.metrics import accuracy_score

X_train, y_train = load_svmlight_file("./svm_files/cats_nonecats.sparse")
clf = svm.NuSVC(gamma=0.07, verbose=True)
clf.fit(X_train, y_train)
pred = clf.predict(X_train)
accuracy_score(y_train, pred)
I get this result:
[LibSVM]*
optimization finished, #iter = 4
C = 2.000000
obj = 5.000000, rho = 0.000000
nSV = 10, nBSV = 0
Total nSV = 10
NuSVC(cache_size=200, coef0=0.0, degree=3, gamma=0.07, kernel=rbf,
max_iter=-1, nu=0.5, probability=False, shrinking=True, tol=0.001,
verbose=True)
1.0
I don't think this is right, so maybe someone could point out my mistakes.
Next question: is this the "training", or do I need to repeat something, e.g., 10 times? Is the classifier now able to recognise cats?
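One thing I noticed: I'm predicting on the same data I fitted on, which can easily give 1.0. A minimal sketch of what I think the evaluation should look like instead, using a held-out split (the features here are random stand-ins, so the actual numbers are meaningless; it only shows the split):

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Random stand-in features: 40 "cats" (label 1) and 40 "non-cats" (label 0).
rng = np.random.default_rng(0)
X = np.vstack((rng.normal(1.0, 1.0, (40, 128)),
               rng.normal(-1.0, 1.0, (40, 128))))
y = np.array([1] * 40 + [0] * 40)

# Hold out 25% of the pictures so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = svm.NuSVC(gamma='scale')  # default kernel scaling for the stand-in data
clf.fit(X_train, y_train)
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(test_acc)
```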
Thank you for some answers.
Greetings,
Linda
EDIT
Well, I'll try to explain what I'm doing right now. I hope it's correct now:
1. split my data into test and training data
2. get all descriptors from the training / test data
3. create centers (with k-means) from the training data
4. build a histogram vector from the descriptors of each training picture
5. create a sparse file from the histogram vectors
6. feed this sparse file to the SVM
Are there any errors?
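To make steps 4 and 5 concrete, here is how I understand the histogram step, sketched with scipy's `vq` and random stand-in descriptors (`centers` would come from step 3, `picture_desc` from step 2):

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

rng = np.random.default_rng(0)

# Stand-ins: centers from the stacked training descriptors (step 3)
# and the descriptors of a single picture (step 2).
all_train_desc = rng.random((600, 128))
centers, _ = kmeans(all_train_desc, 3)
picture_desc = rng.random((50, 128))

# Step 4: assign each descriptor to its nearest center, then count
# how often each "visual word" occurs in this picture.
words, _ = vq(picture_desc, centers)
hist = np.bincount(words, minlength=centers.shape[0])

# Normalise so pictures with many/few descriptors stay comparable.
hist = hist / hist.sum()
print(hist)
```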
EDIT part II:
I've updated the number of pictures, but I have some more questions. What did you mean by "np.bincount + divide by sum"? If I have a histogram like [120, 0, 300, 80], do I have to divide these values by the total number of descriptors for that picture? Like this: [120/500, 0/500, 300/500, 80/500]? And is there a good way to compute the k for k-means? A k between 100 and 500 may be right for cats, but what if I want my classifier to recognise dogs? Will the k be different?
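Just to check my understanding of the normalisation: with word counts [120, 0, 300, 80] (so 500 descriptors in the picture), dividing by the sum would look like this:

```python
import numpy as np

# Counts of how often each of the 4 visual words appears in one picture.
hist = np.array([120, 0, 300, 80])

# Divide by the total number of descriptors in that picture (here 500),
# giving the fractions 0.24, 0.0, 0.6, 0.16.
normalised = hist / hist.sum()
print(normalised)
```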
Thank you