To get the answer you are looking for, use:
mdf[order(as.numeric(as.character(mdf$Rank))),]
The reason your original code doesn't work is that your Rank variable is a factor, so order() sorts it by the position of each value in the factor's levels rather than by its numeric value. For example, if you had a data frame such that:
DF
# x
# 1 2
# 2 22
# 3 11
# 4 1
and you order the data with
DF[order(DF$x),]
the rows come out in the order of the factor levels, which you can inspect with:
levels(DF$x)
# [1] "1" "2" "11" "22"
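We can reorder the levels by rebuilding the factor with an explicit level order (assigning to levels(DF$x) would relabel the existing values instead of reordering them):
DF$x <- factor(DF$x, levels = c("2", "22", "11", "1"))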
Now,
levels(DF$x)
# [1] "2" "22" "11" "1"
So ordering with the new factor levels gives a different result:
DF[order(DF$x),]
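Here is a self-contained sketch of how the level order drives the sort. The construction of DF is an assumption, since only its printed form is shown above. The names ord_default, ord_custom and DF2 are made up for illustration.

```r
# Assumed construction: a factor built from numbers gets the
# levels "1" "2" "11" "22", matching the output shown above.
DF <- data.frame(x = factor(c(2, 22, 11, 1)))

# order() works on the integer position of each value in levels(DF$x).
ord_default <- order(DF$x)
print(levels(DF$x))
print(ord_default)

# Rebuild the factor with an explicit level order; the values stay the same,
# only their ranking changes.
DF2 <- data.frame(x = factor(DF$x, levels = c("2", "22", "11", "1")))
ord_custom <- order(DF2$x)
print(levels(DF2$x))
print(ord_custom)

# drop = FALSE keeps a one-column data frame from collapsing to a vector.
print(DF[ord_default, , drop = FALSE])
print(DF2[ord_custom, , drop = FALSE])
```

The values in the two data frames are identical; only the level order, and therefore the sort order, differs.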
To answer your question of why as.numeric alone doesn't work: each factor level has an associated integer code, and that code is what as.numeric returns. If you want the number that is the factor label, you must first convert to character and then to numeric, hence as.numeric(as.character(x)).
For example, with the original DF (before the levels were reordered), calling as.numeric(DF$x) gives the integer code for each value, not its label:
# [1] 2 4 3 1
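The difference between codes and labels can be checked directly. This is a minimal sketch on a standalone factor assumed to match DF$x. The variable names are illustrative only.

```r
# Standalone factor with levels "1" "2" "11" "22" (assumed to match DF$x).
x <- factor(c(2, 22, 11, 1))

# as.numeric() on a factor returns the integer codes (positions in levels(x)).
codes <- as.numeric(x)

# Going through character first recovers the numbers written in the labels.
labels_num <- as.numeric(as.character(x))

# Equivalent idiom: convert the levels once, then index by the factor's codes.
labels_num2 <- as.numeric(levels(x))[x]

print(codes)
print(labels_num)
print(labels_num2)
```

The last form converts each distinct level only once, which can matter for long vectors with few levels.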
One way to avoid this in the future, if you are loading your data frame from a .csv file, is to use read.csv(..., stringsAsFactors=FALSE) (this has been the default since R 4.0.0). I also like the fread function in data.table, which uses safer default types.
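This sketch shows the effect of stringsAsFactors, using a small inline CSV in place of a real file. The data is hypothetical. The stray "n/a" entry is an assumption, chosen to show one common way a numeric-looking column gets read as text in the first place.

```r
# Hypothetical CSV content; the "n/a" entry stops Rank from parsing as numbers.
csv_text <- "Name,Rank\na,10\nb,2\nc,n/a\nd,1"

# With stringsAsFactors = TRUE, text columns become factors.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# With stringsAsFactors = FALSE, they stay as plain character vectors.
as_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)

print(class(as_factor$Rank))
print(class(as_char$Rank))
```

With a character column, as.numeric(as_char$Rank) converts the labels directly (entries such as "n/a" become NA with a warning), so the integer-code pitfall does not arise.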