Plotting inside function: subset(df,id_==...) gives wrong plot, df[df$id_==...,] is right

Question 1

This works fine:

plot_series <- function(id_) {
    dat <- df[df$id_ == id_,]
    p <- ggplot(data=dat, mapping=aes(x,y), color='red') + geom_step()
    return(p)
}

print(plot_series(20))

If you simply step through the original function using debug, you'll quickly see that the subset line did not actually subset the data frame at all: it returned all rows!

Why? Because subset uses non-standard evaluation and you used the same name for both the column name and the function argument. As jlhoward demonstrates above, it would have worked (but probably not been advisable) to have simply used different names for the two.

The reason is that subset evaluates with the data frame first. So all it sees in the logical expression is the always true id_ == id_ within that data frame.

One way to think about it is to play dumb (like a computer) and ask yourself when presented with the condition id_ == id_ how do you know what exactly each symbol refers to. It's ambiguous, and subset makes a consistent choice: use what's in the data frame.

Question 2

Notwithstanding the comments, this works:

plot_series <- function(z, envir=environment()) {
  dat <- subset(df,id_==z)
  p <- ggplot(data=dat, mapping=aes(x,y), color='red') + geom_step()
  p$plot_env <- envir
  plot(p)
  # Displays wrongly whether we do the plot here inside fn, or return the object to parent environment 
  return(p)
}

plot_series(20)

The problem seems to be the subset is interpreting id_ on the RHS of the == as identical to the LHS, to this is equivalent to subletting on T, which of course includes all the rows of df. That's the plot you are seeing.