Question
My data is:
phone  colour  length  weight  rating
100    5       3       3       0
200                    1       4
303    3       30              9
302    2       43      0       2
106    43
203    23      3       1       7
I want my data to look like this:
Variable A (sort_by_model_100):
phone  colour  length  weight  rating
100    5       3       3       0
106    43
Variable B (sort_by_model_200):
phone  colour  length  weight  rating
200    4       20      1       4
203    23      3       1       7
Variable C (sort_by_model_300):
phone  colour  length  weight  rating
303    3       30      0       9
302    2       43      0       2
My R code:
data <- read.csv(file.choose(),header=TRUE)
sort_by_model_100 <- split (data, data$phone[100:200])
sort_by_model_200 <- split (data, data$phone[200:300])
sort_by_model_300 <- split (data, data$phone[300:400])
I get this error and my code doesn't work:
Warning message:
In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) :
data length is not a multiple of split variable
Please help.
Solution
You can use subset:
var_a = subset(data, phone >= 100 & phone < 200)
var_b = subset(data, phone >= 200 & phone < 300)
And so on. Maybe you can improve the code to avoid hard-coding the ranges.
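One possible way to avoid hard-coding the ranges, sketched under the assumption that every model family spans a block of exactly 100 phone numbers: integer-divide phone by 100 to get the family, then let split do the grouping. The sample data frame and the names family and by_model are only for illustration.

```r
# Sample data mirroring the question (blank cells become NA)
data <- data.frame(
  phone  = c(100, 200, 303, 302, 106, 203),
  colour = c(5, NA, 3, 2, 43, 23),
  length = c(3, NA, 30, 43, NA, 3),
  weight = c(3, 1, NA, 0, NA, 1),
  rating = c(0, 4, 9, 2, NA, 7)
)

# Assumption: each model family covers a block of 100 phone numbers,
# so 100-199 maps to 100, 200-299 maps to 200, and so on
family <- (data$phone %/% 100) * 100

# One data frame per family, collected in a named list
by_model <- split(data, family)

names(by_model)      # "100" "200" "300"
by_model[["100"]]    # the rows with phone 100 and 106
```

If the families do not fall on multiples of 100, this shortcut no longer applies and explicit break points would be needed instead.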
Other tips
With this data:
data<-data.frame(
phone=c(100,200,303,302,106,203),
colour=c(5,NA,3,2,43,23),
length=c(3,NA,30,43,NA,3),
weight=c(3,1,NA,0,NA,1),
rating=c(0,4,9,2,NA,7)
)
I'd use cut to create a factor indicating the model class:
model <- cut(data$phone, breaks = c(100, 200, 300, 400), include.lowest = TRUE, right = FALSE)
Then you can use split to create a list of sub-data-frames:
split(data, model)
This is likely to be easier to work with than a bunch of different data.frame variables.
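As a rough illustration of why the list tends to be more convenient, the sketch below pulls out one group by name and applies the same summary to every group in one call. The labels passed to cut and the names groups and mean_rating are assumptions made for this example, not part of the original answer.

```r
data <- data.frame(
  phone  = c(100, 200, 303, 302, 106, 203),
  colour = c(5, NA, 3, 2, 43, 23),
  length = c(3, NA, 30, 43, NA, 3),
  weight = c(3, 1, NA, 0, NA, 1),
  rating = c(0, 4, 9, 2, NA, 7)
)

# Same breaks as above, with short labels so the list gets readable names
model <- cut(data$phone, breaks = c(100, 200, 300, 400),
             include.lowest = TRUE, right = FALSE,
             labels = c("100", "200", "300"))
groups <- split(data, model)

# Pull out a single group by name
groups[["200"]]

# Apply one operation to every group at once:
# here, the mean rating per model class, ignoring missing values
mean_rating <- sapply(groups, function(d) mean(d$rating, na.rm = TRUE))
mean_rating
```

With separate variables such as sort_by_model_100, the same summary would have to be repeated once per variable.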