How to select columns conditionally in a data frame in R

Question 1

You're looking for aggregate. Here is a formula that returns the median age and weight by sex:

aggregate(cbind(age, weight) ~ sex, data=jalal, FUN=median)
##   sex  age weight
## 1   F 20.5  189.9
## 2   M 21.0  198.1

To get a data frame containing just the women, here is the syntax for [:

jalal[jalal$sex == 'F',]

Note the quotes around 'F'. A bare F means FALSE. That's why your second subset expression fails.

subset(jalal, subset=(sex =='F'))
##    age sex weight eye.color hair.color
## 1   23   F   93.8      blue      black
## 3   22   F  196.5     hazel       gray
## 6   16   F  152.1      blue       gray

...
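To see why the unquoted F goes wrong, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical; only the column names mirror jalal.

```r
# Hypothetical toy data; only the column names follow the jalal data frame.
toy <- data.frame(
  age    = c(23, 19, 22, 30),
  sex    = c("F", "M", "F", "M"),
  weight = c(93.8, 180.2, 196.5, 210.0)
)

# A bare F is shorthand for FALSE, so sex is compared against a logical.
# FALSE is coerced to the string "FALSE", which matches no row.
no_rows <- toy[toy$sex == F, ]
nrow(no_rows)

# Quoting the value compares against the string "F", as intended.
women <- toy[toy$sex == "F", ]
nrow(women)
```

The first subset comes back empty without any error or warning, which makes this mistake easy to miss.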

A comment asks how to get the mean values for women with blue eyes. A first approach is to filter the data frame to just blue-eyed people:

aggregate(cbind(age, weight) ~ sex, data=jalal[jalal$eye.color == 'blue',], FUN=mean)
##   sex      age   weight
## 1   F 19.66667 151.7667
## 2   M 18.00000 212.8500

But this seems hackish; after all, we're not filtering the data frame on women. So here is a formula that gives the mean age and weight by sex and eye color. From this, you can read off the means for blue-eyed women, green-eyed men, etc.:

aggregate(cbind(age, weight) ~ sex + eye.color, data=jalal, FUN=mean)
##   sex eye.color      age   weight
## 1   M     amber 21.50000 218.5000
## 2   F      blue 19.66667 151.7667
## 3   M      blue 18.00000 212.8500
## 4   M     brown 19.33333 194.9000
## 5   F      gray 19.00000 194.6333
## 6   M      gray 23.00000 198.2000
## 7   F     green 18.50000 221.0500
## 8   M     green 21.50000 183.5500
## 9   F     hazel 21.50000 176.9500

Note that rows 2 and 3 here match the results of the prior expression.
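If you only want the blue-eyed women, you can either filter on both conditions at once or pull the matching row out of the grouped result. This is a sketch on made-up data; toy, blue_women and by_group are hypothetical names, not part of the original question.

```r
# Hypothetical toy data; only the column names follow the jalal data frame.
toy <- data.frame(
  age       = c(23, 16, 20, 18),
  sex       = c("F", "F", "F", "M"),
  weight    = c(90, 150, 210, 200),
  eye.color = c("blue", "blue", "green", "blue")
)

# Option 1: keep rows where both conditions hold, then average the columns.
blue_women <- toy[toy$sex == "F" & toy$eye.color == "blue", ]
direct <- colMeans(blue_women[, c("age", "weight")])

# Option 2: aggregate by both variables, then pick the row of interest.
by_group <- aggregate(cbind(age, weight) ~ sex + eye.color, data = toy, FUN = mean)
picked <- by_group[by_group$sex == "F" & by_group$eye.color == "blue", ]
```

Both options give the same means; the second is handier when you want several groups from one call.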

Question 2

Here's an alternative solution using the data.table package:

library(data.table)
jalal <- as.data.table(jalal)

To subset on females:

jalal[sex == "F"]

To calculate the mean, median, etc.:

jalal[sex == "F", mean(weight)]
## [1] 183.52
jalal[sex == "F", list(mean(weight), median(age))]
##        V1   V2
## 1: 183.52 20.5
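The V1 and V2 column names come from the unnamed list elements. As a rough sketch on made-up data (toy, by_sex, mean_weight and median_age are hypothetical names), naming the elements gives readable columns, and by = computes the same summaries for every group in one call:

```r
library(data.table)

# Hypothetical toy data; only the column names follow the jalal table.
toy <- data.table(
  age    = c(23, 19, 22, 30),
  sex    = c("F", "M", "F", "M"),
  weight = c(100, 180, 200, 220)
)

# Named list elements become the column names of the result.
women <- toy[sex == "F", list(mean_weight = mean(weight), median_age = median(age))]

# by = repeats the summary for each value of sex.
by_sex <- toy[, list(mean_weight = mean(weight), median_age = median(age)), by = sex]
```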

Question 3

Just so you see all the main options, here's a solution with dplyr:

library(dplyr)
jalal %>%
  group_by(sex, eye.color) %>%
  summarise(age = mean(age), weight = median(weight))
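For a single group, such as the blue-eyed women asked about in the comment, filter() can replace group_by(). This is a sketch on made-up data, assuming a dplyr version that provides the %>% pipe; toy and blue_women are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data; only the column names follow the jalal data frame.
toy <- data.frame(
  age       = c(23, 16, 20, 18),
  sex       = c("F", "F", "F", "M"),
  weight    = c(90, 150, 210, 200),
  eye.color = c("blue", "blue", "green", "blue")
)

# filter() keeps rows where every condition holds;
# summarise() then collapses them to one row of means.
blue_women <- toy %>%
  filter(sex == "F", eye.color == "blue") %>%
  summarise(age = mean(age), weight = mean(weight))
```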