Sort data frame column by factor

Question 1

order takes multiple sort keys and uses each later key to break ties in the earlier ones, which is just what you want:

with(score, score[order(sex, y, x),])
##         x        y sex
## 3   SUSAN 6.636370   F
## 5    EMMA 6.873445   F
## 9  VIOLET 8.539329   F
## 6 LEONARD 6.082038   M
## 2     TOM 7.812380   M
## 8    MATT 8.248374   M
## 4   LARRY 8.424665   M
## 7     TIM 8.754023   M
## 1    MARK 8.956372   M
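
As a minimal, self-contained sketch of the same idea, the snippet below builds a small data frame and sorts it by sex and then by y, first ascending and then descending within each sex. The data values are made up for illustration (they are not the asker's data), and the names sorted_asc and sorted_desc are hypothetical.

```r
# Hypothetical sample data; the values are invented for illustration only.
score <- data.frame(
  x   = c("MARK", "TOM", "SUSAN", "LARRY", "EMMA"),
  y   = c(8.9, 7.8, 6.6, 8.4, 6.8),
  sex = factor(c("M", "M", "F", "M", "F")),
  stringsAsFactors = FALSE
)

# Sort by sex first, then by y ascending within each sex.
sorted_asc <- score[order(score$sex, score$y), ]

# Sort by sex first, then by y descending within each sex.
# Negating a numeric key reverses its sort direction.
sorted_desc <- score[order(score$sex, -score$y), ]

print(sorted_asc)
print(sorted_desc)
```

Negating the key only works for numeric columns; for the factor itself, the level order decides which group comes first.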

Question 2

Here is a summary of all methods mentioned in other answers/comments (to serve future searchers). I've added a data.table way of sorting.

# Base R
do.call(rbind, by(score, score$sex, function(x) x[order(x$y),]))
with(score, score[order(sex, y, x),])
score[order(score$sex,score$x),]

# Using plyr
library(plyr)
arrange(score, sex, y)
ddply(score, c('sex', 'y'))

# Using `data.table`
library("data.table")
score_dt <- setDT(score)

# setting a key sorts the data.table by the key columns
setkey(score_dt,sex,x)
print(score_dt)

Here is another question that deals with the same problem.

Question 3

I think there must be some function like it to apply on data frames and get data frames as return

Yes there is:

library(plyr)

# Grouping variables are listed in sort priority: sex first, then y
ddply(score, c('sex', 'y'))

Question 4

It sounds to me like you're trying to order by score within the males and females and return a combined data frame of sorted males and sorted females.

You are right that by(score, score$sex, function(x) x[order(x$y),]) returns a list of sorted data frames, one for male and one for female. You can use do.call with the rbind function to combine these data frames into a single final data frame:

do.call(rbind, by(score, score$sex, function(x) x[order(x$y),]))
#           x         y sex
# F.5    EMMA  7.526866   F
# F.9  VIOLET  8.182407   F
# F.3   SUSAN  9.677511   F
# M.4   LARRY  6.929395   M
# M.8    MATT  7.970015   M
# M.7     TIM  8.297137   M
# M.6 LEONARD  8.845588   M
# M.2     TOM  9.035948   M
# M.1    MARK 10.082314   M
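
Note that the combined result carries row names such as F.5, built from the group name and the original row number. A rough sketch of an equivalent approach follows, using split and lapply in place of by (a swapped-in but standard base R idiom) and then resetting the row names. The sample data and the names pieces, sorted_pieces and combined are hypothetical.

```r
# Hypothetical sample data; the values are invented for illustration only.
score <- data.frame(
  x   = c("MARK", "TOM", "SUSAN", "LARRY", "EMMA"),
  y   = c(8.9, 7.8, 6.6, 8.4, 6.8),
  sex = factor(c("M", "M", "F", "M", "F")),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per level of sex.
pieces <- split(score, score$sex)

# Sort each piece by y on its own.
sorted_pieces <- lapply(pieces, function(d) d[order(d$y), ])

# Stack the sorted pieces back into a single data frame.
combined <- do.call(rbind, sorted_pieces)

# Replace the "F.3"-style row names with plain sequential ones.
rownames(combined) <- NULL

print(combined)
```

Resetting the row names is optional; it only matters if later code relies on them.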

Question 5

I believe the asker wanted to know how to sort by the order of the factor levels when there are many of them, say 20:

I know how to do that if I have 2 or 3 factors. But what if I had serious levels of factors, say 20, should I write a for loop?

I have an example with 9 ordered levels and various counts.

  stage_name               count
  <ord>                    <int>
1 Closed Lost                957
2 Closed Won                1413
3 Evaluation                1773
4 Meeting Scheduled         4104
5 Nurture                   1222
6 Opportunity Disqualified   805
7 Order Submitted           1673
8 Qualifying                5138
9 Quoted                    4976

In this case you can see that it is displayed using alphabetical order of stage_name, but stage_name is actually an ordered factor that has a very different order.

This code puts the factor levels in a very different order:

# Make categoricals ----
check_stage$stage_name = ordered(check_stage$stage_name, levels=c(
    'Opportunity Disqualified', 
    'Qualifying',
    'Evaluation',
    'Meeting Scheduled',
    'Quoted',
    'Order Submitted',
    'Closed Won',
    'Closed Lost',
    'Nurture'))

Now we can simply use the factor as the sort key. arrange is a dplyr function, but you might need forcats too. I have both libraries loaded:

library(dplyr)

# arrange() sorts a factor by its level order, not alphabetically
check_stage <- check_stage %>% 
  arrange(factor(stage_name))

This now gives the output in the factor order as desired:

check_stage

# A tibble: 9 × 2
  stage_name               count
  <ord>                    <int>
1 Opportunity Disqualified   805
2 Qualifying                5138
3 Evaluation                1773
4 Meeting Scheduled         4104
5 Quoted                    4976
6 Order Submitted           1673
7 Closed Won                1413
8 Closed Lost                957
9 Nurture                   1222
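
For readers who prefer base R, the same result can be sketched with order, which sorts a factor by its level order rather than alphabetically. No loop is needed however many levels there are. The counts and level order below are copied from the tables above; the names stage_levels and sorted_stage are hypothetical.

```r
# Rebuild the example table, starting from plain character stage names.
check_stage <- data.frame(
  stage_name = c("Closed Lost", "Closed Won", "Evaluation",
                 "Meeting Scheduled", "Nurture", "Opportunity Disqualified",
                 "Order Submitted", "Qualifying", "Quoted"),
  count = c(957L, 1413L, 1773L, 4104L, 1222L, 805L, 1673L, 5138L, 4976L),
  stringsAsFactors = FALSE
)

# The desired (non-alphabetical) order of the levels.
stage_levels <- c("Opportunity Disqualified", "Qualifying", "Evaluation",
                  "Meeting Scheduled", "Quoted", "Order Submitted",
                  "Closed Won", "Closed Lost", "Nurture")

# Turn the column into an ordered factor with those levels.
check_stage$stage_name <- factor(check_stage$stage_name,
                                 levels = stage_levels, ordered = TRUE)

# order() on a factor sorts by level position, so one call is enough.
sorted_stage <- check_stage[order(check_stage$stage_name), ]

print(sorted_stage)
```

The only part that grows with the number of levels is the stage_levels vector itself.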