Question

I'd like to begin by saying that I'm really new to computer vision, so I may have missed something obvious; don't hesitate to point out anything of that kind.

I am trying to achieve scene classification, currently between indoor and outdoor images for simplicity.

My idea to achieve this is to use a gist descriptor, which creates a vector with certain parameters of the scene.

In order to obtain reliable classification, I took 100 indoor and 100 outdoor images, computed a gist descriptor for each, built a training matrix from the descriptors, and ran 'svmtrain' on it. Here is the simple code I used to train on the gist vectors:

train_label = zeros(200,1);       % preallocate labels for all 200 samples
train_label(1:100,1) = 0;         % 0 = indoor
train_label(101:200,1) = 1;        % 1 = outdoor

training_mat(1:100,:) = gist_indoor1;
training_mat(101:200,:) = gist_outdoor1;
test_mat = gist_test;

SVMStruct = svmtrain(training_mat ,train_label, 'kernel_function', 'rbf', 'rbf_sigma', 0.6);
Group       = svmclassify(SVMStruct, test_mat);

The problem is that the results are pretty bad.

I read that optimizing the box constraint (C) and gamma parameters of the 'rbf' kernel should improve the classification, but:

  1. I'm not sure how to optimize with multidimensional data vectors (the optimization example on the MathWorks site is 2-D, while my vectors have 512 dimensions). Any suggestions on how to begin?

  2. I might be heading in completely the wrong direction; please say so if that is the case.

Edit: Thanks Darkmoor! I'll try calibrating using this toolbox, and maybe try to improve my feature extraction. Hopefully when I have a working classification, I'll post it here.

Edit 2: Forgot to update. By computing gist descriptors of indoor and urban outdoor images from the SUN database, and training with optimized parameters using the libsvm toolbox, I managed to achieve a classification rate of 95% when testing the model on pictures from my apartment and the street outside.

I did the same with urban outdoor scenes and natural scenes from the database, and achieved similar accuracy when testing on various scenes from my country.

The code I used to create the data matrices is taken from here, with very minor modifications:

% GIST Parameters:
clear param
param.imageSize = [256 256]; % set a normalized image size
param.orientationsPerScale = [8 8 8 8]; % number of orientations per scale (from HF to LF)
param.numberBlocks = 4;
param.fc_prefilt = 4;

%Obtain images from folders
sdirectory = 'C:\Documents and Settings\yotam\My Documents\Scene_Recognition\test_set\indoor&outdoor_test';
jpegfiles = dir([sdirectory '/*.jpg']);

% Pre-allocate gist:
Nfeatures = sum(param.orientationsPerScale)*param.numberBlocks^2;
gist = zeros([length(jpegfiles) Nfeatures]); 

% Load first image and compute gist:
filename = [sdirectory '/' jpegfiles(1).name];
img = imresize(imread(filename),param.imageSize);
[gist(1, :), param] = LMgist(img, '', param); % first call
% Loop:
for i = 2:length(jpegfiles)
   filename = [sdirectory '/' jpegfiles(i).name];
   img = imresize(imread(filename),param.imageSize);
   gist(i, :) = LMgist(img, '', param); % the next calls will be faster
end

Solution

  1. I suggest using libsvm; it is very efficient. There is a relevant post on cross-validation with libsvm, and the same logic applies to the MATLAB library you mention.

  2. Your logic is correct: extract features and try to classify them. That said, do not expect tuning the classifier alone to make a huge difference. Large improvements in your results come mainly from better feature extraction, combined of course with classifier calibration ;).
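
As a rough illustration of the cross-validation idea in point 1, here is a minimal sketch of a grid search over the two RBF parameters (C and gamma) using libsvm's MATLAB interface. The search is over two hyperparameters only, so it works the same way whether the feature vectors are 2-D or 512-D. Several things here are assumptions and not from the original post: libsvm's MATLAB interface is compiled and on the path ahead of the old built-in `svmtrain`, the feature matrix is random placeholder data standing in for the gist descriptors, and the exponent grids are just a common coarse starting range.

```matlab
% Sketch: coarse grid search over C and gamma with 5-fold cross-validation.
% ASSUMPTION: svmtrain below is libsvm's function (label vector first),
% not the old Statistics Toolbox svmtrain used in the question.
rng(0);                                   % reproducible placeholder data
nPerClass = 100; nFeatures = 512;         % sizes taken from the question

% ASSUMPTION: random placeholder features; replace with the real gist matrices.
training_mat = [randn(nPerClass, nFeatures); randn(nPerClass, nFeatures) + 0.5];
train_label  = [zeros(nPerClass, 1); ones(nPerClass, 1)];  % 0 = indoor, 1 = outdoor

log2c = -5:2:15;                          % candidate C values are 2.^log2c
log2g = -15:2:3;                          % candidate gamma values are 2.^log2g
bestAcc = -Inf; bestC = NaN; bestG = NaN;
for lc = log2c
    for lg = log2g
        % -s 0: C-SVC, -t 2: RBF kernel, -v 5: 5-fold CV, -q: quiet
        opts = sprintf('-s 0 -t 2 -c %g -g %g -v 5 -q', 2^lc, 2^lg);
        acc = svmtrain(train_label, training_mat, opts);  % with -v, returns CV accuracy (%)
        if acc > bestAcc
            bestAcc = acc; bestC = 2^lc; bestG = 2^lg;
        end
    end
end
fprintf('best C = %g, gamma = %g, CV accuracy = %.1f%%\n', bestC, bestG, bestAcc);

% Retrain on the full training set with the selected parameters.
model = svmtrain(train_label, training_mat, ...
                 sprintf('-s 0 -t 2 -c %g -g %g -q', bestC, bestG));
```

The resulting `model` can then be passed to libsvm's `svmpredict(test_label, test_mat, model)`; if the true test labels are unknown, a dummy label vector of the right length can be supplied and the reported accuracy ignored. A finer grid around the best pair is a natural second pass.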

Good luck.

Licensed under: CC-BY-SA with attribution