R: how do I output the factor level from a for loop rather than the index?

https://stackoverflow.com/questions/8774515

14-04-2021

Question

I have a data frame that I am running a Monte Carlo simulation on, using for loops, to generate a simulated distribution. As I am testing the simulation code, I am just accessing the first observation in the data frame:

Male.MC <-c()
for (j in 1:100){
    for (i in 1:1)  {
        # u2 <- Male.DistF$Male.stddev_u2[i] * rnorm(1, mean = 0, sd = 1)
        u2 <- Male.DistF$RndmEffct[i] * rnorm(1, mean = 0, sd = 1)
        mc_bca <- Male.DistF$lmefits[i] + u2
        temp <- Lambda.Value*mc_bca+1
        ginv_a <- temp^(1/Lambda.Value)
        d2ginv_a <- max(0,(1-Lambda.Value)*temp^(1/Lambda.Value-2))
        mc_amount <- ginv_a + d2ginv_a * Male.DistF$Male.var[i]^2 / 2
        z <- c(RespondentID <- Male.DistF$RespondentID[i],
               Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i],
               Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)
        Male.MC <- as.data.frame(rbind(Male.MC,z))
    }
}
colnames(Male.MC) <- c("RespondentID", "AgeFactor", 
                       "SampleWeight", "VarByAge", 
                       "lmefits", "u2", "mc_amount")

The code works beautifully, except that Male.DistF$RespondentID is a factor and the output contains the factor's integer index rather than its level. Here I get 1, because the RespondentIDs are in ascending order in the Male.DistF data frame. I have the same problem with AgeFactor, where I get the index rather than the factor level.

head(Male.MC)
  RespondentID AgeFactor SampleWeight  VarByAge  lmefits         u2 mc_amount
z            1         3    0.4952835 0.4189871 15.22634  0.2334501 11582.681
2            1         3    0.4952835 0.4189871 15.22634  0.3205741 11984.220
3            1         3    0.4952835 0.4189871 15.22634 -0.5674165  8420.678
4            1         3    0.4952835 0.4189871 15.22634 -0.5426489  8505.421
5            1         3    0.4952835 0.4189871 15.22634  0.4878695 12790.565
6            1         3    0.4952835 0.4189871 15.22634  0.1556925 11234.583

How can I make the Male.MC data frame contain the factor levels for those two variables? I have tried:

z <- c(RespondentID <- as.character(Male.DistF$RespondentID[i]), 
       Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i], 
       Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

and

z <- c((as.character(Male.DistF$RespondentID[i])), 
       Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i], 
       Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

to fix the RespondentID output, but I am doing something wrong with that syntax and it's trying to convert all the output to factors:

There were 50 or more warnings (use warnings() to see the first 50)
str(Male.MC)
'data.frame':   100 obs. of  7 variables:
$ RespondentID: Factor w/ 1 level "100020": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ AgeFactor   : Factor w/ 1 level "3": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ SampleWeight: Factor w/ 1 level "0.495283471": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ VarByAge    : Factor w/ 1 level "0.418987052181831": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ lmefits     : Factor w/ 1 level "15.2263403968895": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ u2          : Factor w/ 1 level "-0.100954008424162": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ mc_amount   : Factor w/ 1 level "10151.4582133747": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr  "z" "" "" "" ...

For testing, here are the first two rows of the input data frame Male.DistF:

     AgeFactor RespondentID SampleWeight IntakeAmt   RndmEffct NutrientID Gender Age BodyWeight  IntakeDay BoxCoxXY  lmefits      lmeres   TotWts   GrpWts NumSubjects TotSubjects  Male.var
1725     9to13       100020    0.4952835 12145.852  0.30288536        267      1  12       51.6 Day1Intake 15.61196 15.22634  0.27138449 2291.827 763.0604         525        2249 0.4189871
203     14to18       100419    0.3632839  9591.953  0.02703093        267      1  14       46.3 Day1Intake 15.01444 15.31373 -0.18039624 2291.827 472.3106         561        2249 0.3365423

Lambda.Value is 0.1. The information on Male.DistF is:

str(Male.DistF)
'data.frame':   2249 obs. of  18 variables:
$ AgeFactor   : Ord.factor w/ 4 levels "1to3"<"4to8"<..: 3 4 3 4 2 2 3 1 1 3 ...
$ RespondentID: Factor w/ 2249 levels "100020","100419",..: 1 2 3 4 5 6 7 8 9 10 ...
$ SampleWeight: num  0.495 0.363 0.495 1.326 2.12 ...
$ IntakeAmt   : num  12146 9592 7839 11113 7150 ...
$ RndmEffct   : num  0.3029 0.027 0.0772 0.4667 -0.1593 ...
$ NutrientID  : int  267 267 267 267 267 267 267 267 267 267 ...
$ Gender      : int  1 1 1 1 1 1 1 1 1 1 ...
$ Age         : int  12 14 11 15 6 5 10 2 2 9 ...
$ BodyWeight  : num  51.6 46.3 46.1 63.2 28.4 18 38.2 14.4 14.6 32.1 ...
$ IntakeDay   : Factor w/ 2 levels "Day1Intake","Day2Intake": 1 1 1 1 1 1 1 1 1 1 ...
$ BoxCoxXY    : num  15.6 15 14.5 15.4 14.3 ...
$ lmefits     : num  15.2 15.3 15 15.8 14.3 ...
$ lmeres      : num  0.271 -0.18 -0.342 -0.424 -0.053 ...
$ TotWts      : num  2292 2292 2292 2292 2292 ...
$ GrpWts      : num  763 472 763 472 779 ...
$ NumSubjects : int  525 561 525 561 613 613 525 550 550 525 ...
$ TotSubjects : int  2249 2249 2249 2249 2249 2249 2249 2249 2249 2249 ...
$ Male.var    : num  0.419 0.337 0.419 0.337 0.267 ...

As you can see from my Male.DistF data, for the 100 replicates for the first observation, in the Male.MC data frame I would like 100020 as the RespondentID (and not 1) and 9to13 as the AgeFactor (and not 3). Where have I gone wrong with my output instructions and how do I fix this? In particular, I'm not following why my attempts to use as.character went so badly astray as to affect the entire output. As an aside I would also welcome suggestions to speed up the loops. All I am doing is constructing 100 sets of values for each observation in my Male.DistF data frame.

Solution

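Replace the line

z <- c(...

with a one-row data.frame. c() builds an atomic vector, so all elements are forced to a single type: a factor mixed with numbers is reduced to its integer codes, and adding as.character() turns every element into a string. A data.frame keeps the type of each column.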
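The coercion can be seen in isolation with a small sketch. The data frame `df` below is a hypothetical stand-in for two columns of Male.DistF, using values from the question; it is not the asker's actual data.

```r
# Hypothetical stand-in for two columns of Male.DistF (values from the question).
df <- data.frame(
  RespondentID = factor(c("100020", "100419")),
  SampleWeight = c(0.4952835, 0.3632839)
)

# c() builds one atomic vector: mixed with a number, the factor
# contributes only its integer code (1 here), not its level.
as_codes <- c(df$RespondentID[1], df$SampleWeight[1])

# With as.character(), c() coerces every element to character instead,
# so the numeric value becomes a string as well.
as_text <- c(as.character(df$RespondentID[1]), df$SampleWeight[1])

# A one-row data.frame keeps each column's own type.
as_row <- data.frame(
  RespondentID = df$RespondentID[1],
  SampleWeight = df$SampleWeight[1]
)

str(as_codes)
str(as_text)
str(as_row)
```

This is also the most likely reason the as.character() attempts produced a data frame of factors: the all-character rows were bound into a character matrix, and in R versions before 4.0.0 as.data.frame() converted character columns to factors by default (stringsAsFactors = TRUE). Later rows then carried values that were not existing levels, which would account for the NA entries and the warnings. Applied to the loop in the question, the replacement row is: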

z <- data.frame(
  RespondentID = Male.DistF$RespondentID[i], 
  AgeFactor    = Male.DistF$AgeFactor[i], 
  SampleWeight = Male.DistF$SampleWeight[i], 
  VarByAge     = Male.DistF$Male.var[i], 
  lmefits      = Male.DistF$lmefits[i], 
  u2           = u2, 
  mc_amount    = mc_amount
)
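
The question also asks about speeding up the loops. The following is a sketch of one possible vectorised approach, not part of the original answer: repeat each row index 100 times, subset the data frame once, and compute the simulated columns for all rows together. The two-row Male.DistF, the full set of AgeFactor level names, `n_rep`, `idx`, and the seed are assumptions made for illustration; pmax() stands in for the scalar max() used in the loop.

```r
set.seed(1)          # assumed seed, only to make the sketch reproducible
Lambda.Value <- 0.1  # value given in the question
n_rep <- 100         # replicates per observation (hypothetical name)

# Hypothetical two-row stand-in for Male.DistF, using values from the question.
# The AgeFactor level names are assumed from the str() output.
Male.DistF <- data.frame(
  AgeFactor    = factor(c("9to13", "14to18"),
                        levels = c("1to3", "4to8", "9to13", "14to18"),
                        ordered = TRUE),
  RespondentID = factor(c("100020", "100419")),
  SampleWeight = c(0.4952835, 0.3632839),
  RndmEffct    = c(0.30288536, 0.02703093),
  lmefits      = c(15.22634, 15.31373),
  Male.var     = c(0.4189871, 0.3365423)
)

# Repeat every row index n_rep times; subsetting keeps the factor columns intact.
idx <- rep(seq_len(nrow(Male.DistF)), each = n_rep)
Male.MC <- Male.DistF[idx, c("RespondentID", "AgeFactor", "SampleWeight",
                             "Male.var", "lmefits")]
names(Male.MC)[names(Male.MC) == "Male.var"] <- "VarByAge"
rownames(Male.MC) <- NULL

# Same arithmetic as the loop body, applied to whole columns at once.
Male.MC$u2 <- Male.DistF$RndmEffct[idx] * rnorm(length(idx), mean = 0, sd = 1)
temp     <- Lambda.Value * (Male.MC$lmefits + Male.MC$u2) + 1
ginv_a   <- temp^(1 / Lambda.Value)
d2ginv_a <- pmax(0, (1 - Lambda.Value) * temp^(1 / Lambda.Value - 2))
Male.MC$mc_amount <- ginv_a + d2ginv_a * Male.MC$VarByAge^2 / 2

str(Male.MC)
```

Because nothing is grown row by row with rbind(), this avoids repeatedly copying the data frame, and RespondentID and AgeFactor remain factors showing their levels.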

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow