Change your models to, e.g.:

    reg5 <- glm(survived ~ pclass_str + sex + age_2 + sibsp + pclass_str*sex,
                data = train, family = "binomial")
    reg6 <- randomForest(survived_str ~ pclass_str + sex + age_2 + sibsp,
                         data = train, ntree = 5000)

There may be another problem with your model specification: reg5 uses survived ~ ... while reg6 uses survived_str ~ ..., but I can't tell from your question whether this is an issue.
Finally, as @Roland points out, you can simplify your formulas. If you're going to do this a lot, read the documentation on formulas in R (?formula). In R formulas, an interaction is specified as a:b. The notation a*b is equivalent to a + b + a:b (i.e., the first-order terms plus their interaction). So pclass_str*sex is equivalent to pclass_str + sex + pclass_str:sex, which makes the separate pclass_str and sex terms in reg5 redundant.
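
If you want to check the equivalence yourself, here is a minimal sketch using only base R. The data frame d and its columns y, a and b are made up for illustration and are not from your data; terms() and model.matrix() are standard functions from the stats package.

```r
# Toy data; the column names y, a, b are hypothetical stand-ins
# for survived, pclass_str and sex.
d <- data.frame(
  y = c(0, 1, 1, 0, 1, 0, 1, 1),
  a = factor(c("x", "x", "y", "y", "x", "x", "y", "y")),
  b = factor(c("m", "f", "m", "f", "m", "f", "m", "f"))
)

# The "term.labels" attribute lists the terms a formula expands to.
short_terms     <- attr(terms(y ~ a * b), "term.labels")
long_terms      <- attr(terms(y ~ a + b + a:b), "term.labels")
redundant_terms <- attr(terms(y ~ a + b + a * b), "term.labels")

# All three formulas expand to the same set of terms.
same_terms <- identical(short_terms, long_terms) &&
  identical(short_terms, redundant_terms)

# The design matrices have the same columns as well.
same_columns <- identical(
  colnames(model.matrix(y ~ a * b, data = d)),
  colnames(model.matrix(y ~ a + b + a:b, data = d))
)

print(short_terms)
print(same_terms)
print(same_columns)
```

The third formula mirrors the shape of reg5: listing a and b alongside a*b does no harm, because R drops the duplicated terms, but it is also unnecessary.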