Question

I am trying to classify some emails into two groups, announcements ("call for") and discussions ("discussion"), using k-nearest-neighbour classification. I suppose this could be done using

knn(train, test, cl, k = 1, l = 0, prob = FALSE, use.all = TRUE)

I already have the document-term matrix mails. I have no idea how to construct the train and test matrices and the cl factor from this document-term matrix. I can't find any good examples, and I don't understand the one at http://stat.ethz.ch/R-manual/R-devel/library/class/html/knn.html. Can anyone point me in the right direction?

Update

The whole TermDocumentMatrix is located at dl.dropboxusercontent.com/u/20641416/data


Solution

Well, I cannot solve your problem directly, since I have no sample data. However, I can clarify the example in the documentation, so you can start off with an idea of what's going on.

  • train is the "benchmark" data, for which the classification is already known. It serves as the reference set against which new observations are compared when making predictions.

  • cl contains the correct class labels for the training dataset.

Here the built-in dataset iris3 is used to simulate "known data". The train dataset is taken so that there is an equal number of each species (s - setosa, c - versicolor, v - virginica).

train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3]) 
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))
  • test is the dataset you are trying to classify. Each of its rows is compared against the training data, and a prediction is generated for it.

The same dataset is used to construct the test data. Of course, we know the true classification here, but we pretend that we do not: knn never sees it. We keep the true labels (cl.test) only so that we can evaluate the predictions afterwards.

test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl.test <- cl

Finally, we are ready to proceed. Here's a vector of predictions for the test dataset. With prob = TRUE, we also see how "confident" the algorithm is about each case:

pr.test <- knn(train, test, cl, k = 3, prob = TRUE)
pr.test
 [1] s s s s s s s s s s s s s s s s s s s s s s s s s c c v c c c c c v c c c c c c c c c c
[45] c c c c c c v c c v v v v v c v v v v c v v v v v v v v v v v
attr(,"prob")
 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
 [9] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[17] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[25] 1.0000000 1.0000000 1.0000000 0.6666667 1.0000000 1.0000000 1.0000000 1.0000000
[33] 1.0000000 0.6666667 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[41] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[49] 1.0000000 1.0000000 1.0000000 0.6666667 0.7500000 1.0000000 1.0000000 1.0000000
[57] 1.0000000 1.0000000 0.5000000 1.0000000 1.0000000 1.0000000 1.0000000 0.6666667
[65] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.6666667
[73] 1.0000000 1.0000000 0.6666667
Levels: c s v

We can now estimate how accurate our model is.

sum(pr.test==cl.test)/length(cl.test)

This gives 70 correct out of 75, i.e. about 93% accuracy.
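Returning to your emails: since I don't have your actual data, here is only a sketch of how train, test, and cl could be carved out of a document-term matrix. The mails matrix below, its column names, the row split, and the labels are all invented for illustration:

```r
library(class)

# Invented stand-in for a document-term matrix: rows = emails, columns = term counts
mails <- matrix(c(3, 0, 1, 0,   # emails 1-4: labels already known
                  2, 1, 0, 0,
                  0, 3, 0, 2,
                  1, 2, 0, 3,
                  2, 0, 1, 1,   # emails 5-6: to be classified
                  0, 2, 1, 2),
                nrow = 6, byrow = TRUE,
                dimnames = list(NULL, c("call", "discussion", "papers", "thread")))

train <- mails[1:4, ]    # rows with known labels
cl    <- factor(c("announce", "announce", "discuss", "discuss"))
test  <- mails[5:6, ]    # rows to classify
pred  <- knn(train, test, cl, k = 1)
pred  # with these made-up counts: announce discuss
```

With a real TermDocumentMatrix from the tm package you would first convert and transpose it so that documents are rows (e.g. m <- as.matrix(t(tdm))), and build cl from the labels of the emails you have already sorted by hand.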

Refer to the statistical literature for more details on how knn works. For your problem, consider cross-validation to tune the model (in particular, the choice of k).
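The class package also provides knn.cv(), which runs leave-one-out cross-validation on the training set. A minimal sketch of using it to choose k, reusing the iris3 training data from above (the 1..10 range is my arbitrary choice):

```r
library(class)

# Training data as in the example above
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))

# Leave-one-out cross-validated accuracy for each candidate k
accs <- sapply(1:10, function(k) mean(knn.cv(train, cl, k = k) == cl))
best_k <- which.max(accs)
```

You would then pass best_k as the k argument when calling knn() on your real test data.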

Licensed under: CC-BY-SA with attribution