Sorting continuous variables by dummy categories in R

https://stackoverflow.com//questions/22047412

21-12-2019

Question

I have an income variable and a categorical variable races, where races=1 is White and races=2 is Black. I am trying to find out how many Black people in my data set have an income over 316000. I know how to do this in Stata:

 sort races income
 by races: count if income>316000

However, I am struggling in R. I tried

x<-table(income,races)
x[(x>316000) if races==2]

but get an error message.

Solution

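In R you rarely (perhaps never) need to sort your data for this kind of count. Index races with a logical condition on income and tabulate the result, which gives the number of observations with income over 316000 for each race: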

table(races[income > 316000])
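To make the result concrete, here is a minimal sketch on made-up vectors. The values are invented for illustration, and income and races are assumed to be plain vectors of equal length, as in the question.

```r
# Toy vectors standing in for the asker's data (values are made up)
income <- c(250000, 400000, 320000, 100000, 500000, 316000)
races  <- c(1, 1, 2, 2, 2, 1)

# Number of observations with income over 316000, per race code
counts <- table(races[income > 316000])
counts

# Pull out the count for races == 2 by its name in the table
n_black <- as.integer(counts["2"])
n_black
# 2
```

Note that a race code with no observations above the threshold does not appear in the table at all, so looking it up by name would fail in that case.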

OTHER TIPS

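If your data are in a data frame x with columns income and races, try this. It returns the matching rows, so pass the result to nrow() to get the count: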

x[x$income > 316000 & x$races == 2,]
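A short sketch of turning that row subset into a count, assuming x is a data frame (the data below are hypothetical). Wrapping the condition in which() is an addition here, not part of the original answer: it keeps rows with a missing income from being returned as all-NA rows.

```r
# Hypothetical data frame; column names follow the question
x <- data.frame(income = c(250000, 400000, 320000, NA, 500000),
                races  = c(1, 1, 2, 2, 2))

# which() keeps only positions where the condition is TRUE,
# so the row with a missing income is not selected
high_black <- x[which(x$income > 316000 & x$races == 2), ]

# Number of matching rows
nrow(high_black)
# 2
```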

Other possibilities, assuming your data frame is named df (note that the column is called race here, not races):

df <- data.frame(income = c(316000, 316000, 316000, 316000, 316001, 316001),
                 race = c(1, 1, 1, 2, 2, 2))
df
#   income race
# 1 316000    1
# 2 316000    1
# 3 316000    1
# 4 316000    2
# 5 316001    2
# 6 316001    2

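# Subset income to race 2, then count the values above the threshold
with(df, sum(income[race == 2] > 316000))
# [1] 2

# or combine both conditions and count the TRUE values
with(df, sum(income > 316000 & race == 2))
# [1] 2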
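For something closer to Stata's by races: count if, which reports a count for every group, one option is tapply(). This is a sketch on the same example data, not part of the original answers.

```r
# Same example data as above
df <- data.frame(income = c(316000, 316000, 316000, 316000, 316001, 316001),
                 race = c(1, 1, 1, 2, 2, 2))

# For each race, sum the logical vector income > 316000
# (TRUE counts as 1, FALSE as 0)
by_race <- tapply(df$income > 316000, df$race, sum)
by_race
# race 1: 0, race 2: 2
```

Unlike tabulating a subset, this keeps groups whose count is zero, because the grouping is done on the full data before counting.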

Licensed under: CC-BY-SA with attribution
