Question

I've been working with a dataset of six items that I need to refer to repeatedly throughout my analysis. I'm fairly new to R, and until now I have just copied and pasted all my code six times and changed the variable names.

I know that in STATA you can write 'foreach' loops over lists and put a placeholder in your code wherever the items on the list should be inserted. I've tried to recreate something like this in R using the 'for' loop, failed, and turned to the internet. After hours of searching, all I really got was "Don't produce zombie-code, use lists" :)

So I began reading tutorials and other material about lists and the lapply() function, but I can't really figure out how to do what I want with it.

Here's an example of what I want to do:

t_var1 <- table(mydata$var1, mydata$constant)   # crosstable
p_var1 <- round(prop.table(t_var1) * 100, 1)    # proptable

I would like to do something like this for variables 1 to 6.

My last try with 'for'-loop looked like this (example for crosstable):

varlist <- c("var1", "var2", "var3", "var4", "var5", "var6")
for (var in 1:6){ 
    eval(parse(text=paste("t_", varlist[[var]],sep=""))) <- table (eval(parse(text=paste("mydata$", varlist[[var]],sep=""))), mydata$constant) }

I also tried that one with just paste() (without eval(parse())), with get(), and with 'varlist' as a list(). Either way, it produces error messages. And considering that the whole internet tells me to use lists and apply functions, I tend to believe I should.

I already understand that lists can contain elements of different classes, so you can more or less fill them with anything you like. I think I also understand how to use [] and [[]] to refer to single items of a list. Unfortunately, all the tutorials and examples I've read so far use lapply() with functions like mean or summary. So what I'm asking for is an example of how to create multiple tables with different input and output names (which of course could be put into the same list) using lists and the correct apply function, to help my understanding along.

Thank you in advance for your help! If any further information is needed, just comment and I will try to provide.

Best Regards, Leo


Solution

You can use the values of your string vector varlist in the for-loop:

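crosstables <- list()  # empty list; list(NULL) would leave a stray NULL first element
for (i in varlist) {
  # look the column up by name instead of relying on attach() and get()
  crosstables[[i]] <- table(mydata[[i]], mydata$constant)
}
crosstables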

I hope I understood the question and that this helps.

Edit: A nicer approach, using lapply():

lapply(mydata[1:6],table,mydata$constant)
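The question also asked for a percentage table per variable. A minimal sketch of how both steps could look with lapply(), assuming the variables are picked by name from varlist; the toy mydata below is made up for illustration and only has two variables instead of six:

```r
# Toy data standing in for the asker's data frame (hypothetical values)
set.seed(42)
mydata <- data.frame(
  var1 = sample(c("a", "b"), 20, replace = TRUE),
  var2 = sample(c("x", "y", "z"), 20, replace = TRUE),
  constant = sample(c("yes", "no"), 20, replace = TRUE)
)
varlist <- c("var1", "var2")

# One crosstable per variable; selecting columns by name keeps
# those names on the resulting list
crosstables <- lapply(mydata[varlist], function(x) table(x, mydata$constant))

# One percentage table per crosstable, rounded to one decimal
proptables <- lapply(crosstables, function(tab) round(prop.table(tab) * 100, 1))

# Elements are reached by name, so no t_var1, t_var2, ... objects are needed
crosstables$var1
proptables[["var2"]]
```

Keeping the tables in two named lists means any later step (printing, exporting, plotting) can again be written once and applied to every element.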

OTHER TIPS

I'm not sure if I understood your question correctly, but does this help?

# Creating example data
my.data <- matrix(rnorm(40,mean=1),ncol=4)
my.data2<- matrix(jitter(rbinom(20, 10, 0.7)),ncol=2)

# producing variable names
number.of.var <- dim(my.data)[2]
var.names <- paste("var", seq_len(number.of.var), sep = "_")
var.names

# save data to a list
new.list <- list()
for (col in 1:number.of.var) {
  new.list[[col]] <- my.data[,col]
}
# name list elements
names(new.list) <- var.names

# add other table
new.list[5] <- list(my.data2)
names(new.list)[5] <- "last.not.least"

# look at the content of the list
str(new.list)

# create new list with percentage data
new.list.percentage <- lapply(new.list, 
                              function(x) {
                                return(x*100)
                              }
)

# You can also access the elements of the list with a for-loop directly:
par(mfrow=c(2,3))
for (element.in.list in new.list) {
  plot(element.in.list)
}

I'm sure this is far from optimal, and someone with more profound knowledge will probably answer soon, but maybe it helps a bit. Sorry if I did not understand your question correctly.

Edit: Added a second table so that you can see that a list can hold elements of different dimensions. You can access them directly (see the plot loop), though for the last element it does not make much sense.
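When a list mixes vectors and matrices like this, one option is to check each element first and only apply a function where it makes sense. A rough sketch under that assumption; the small list below is a made-up stand-in for new.list, not the exact data from above:

```r
# Hypothetical list with two numeric vectors and one matrix
set.seed(1)
new.list <- list(
  var_1 = rnorm(10, mean = 1),
  var_2 = rnorm(10, mean = 1),
  last.not.least = matrix(1:20, ncol = 2)
)

# vapply() works like lapply() but returns a plain vector of a declared type
element.sizes <- vapply(new.list, length, integer(1))
is.mat <- vapply(new.list, is.matrix, logical(1))

# Keep only the plain vectors and compute one mean per element
vectors.only <- new.list[!is.mat]
means <- vapply(vectors.only, mean, numeric(1))

element.sizes
means
```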

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow