Question

I am trying to generate functions combining n gaussians, and using values retrieved from a nls run. I use gsub to replace the original coefficients with the nls ones using backreferences. However, it seems that [ on the datafame evaluates before the \\1.

Here is a MWE :

nls <- data.frame(Estimate = seq(1,3))
row.names(nls) <- c("a","b","c")
gsub("(a|b|c)",paste0(" ",nls["\\1","Estimate"]," "),"a + b*x + c*x^2")

As you can see, the replacements are NAs, while the call to the nls dataframe appear to be valid :

gsub("(a|b|c)",paste0(" ","\\1","Estimate"," "),"a + b*x + c*x^2")

Any ideas to delay the evaluation of [ ?

Thanks !

EDIT : for the sake of clarity, here is the full function now working great (it takes number of peaks, formula of one peak, parameters in the formula, variable, constant boolean, and nls results as arguments, and returns the formula for use in ggplot's stat_function() :

Generate_func <- function(peakNb,peakForm,peakParams, peakVar, constBool,nls){
  res <- as.data.frame(summary(nls)$coefficients, optional = T)
  rhs <- strsplit(peakForm, "~")[[1]][[2]]
  regex <- paste0("([*+-/\\^\\(\\)[:space:]]|^)(",paste0(peakParams, collapse = "|"),")([*+-/\\^\\(\\)[:space:]]|$)")
  exp_names <- paste0(sapply(seq(1,peakNb),function(i){
    paste0(sapply(peakParams, function(j){
      paste0(j,i)
    }))
  }))
  if(constBool){exp_names <- c("C", exp_names)}
  func_text <- paste0(sapply(seq(1,peakNb),function(n){gsubfn(regex, x + y + z ~ paste0(x,res[paste0(y,n),"Estimate"],z), rhs )}), collapse = " + ")
  func_text <- paste0(ifelse(constBool,paste0(res["C","Estimate"]," + "),""), func_text)

  func <- function(x){
    eval(parse(text = func_text))
  }
  names(formals(func)) <- c(peakVar)

  print(func_text)

  func
}

And here is an usage example (nls data not included for length sake):

> testfunc <- Generate_func(3, "intensity_cnt ~ a * exp((-(energy_eV-b)^2)/(2*c^2))", c("a","b","c"), "energy_eV", constBool = T, testnls)
[1] "1000 +  32327.6598743022 * exp((-(energy_eV-1.44676439236578)^2)/(2*0.0349194350021539^2)) +  10000 * exp((-(energy_eV-1.49449385009962)^2)/(2*0.0102269096492807^2)) +  54941.8293572164 * exp((-(energy_eV-1.5321664735001)^2)/(2*0.01763494864617^2))"

Thank you for your help !

Was it helpful?

Solution

1) gsub replaces a pattern with a constant but what you are looking to do is to replace it with the result of applying a function to the matched string. gusbfn in the gsubfn package does that. Below, the formula in the second argument is just gsubfn's short form for a function whose argument is the left hand side and the body is the right hand side. Alternately the second argument could be expressed in the usual function notation ( function(x) nls[x,] ) but at the expense of a bit of verbosity:

> library(gsubfn)
> gsubfn("a|b|c", x ~ nls[x, ], "a + b*x + c*x^2")
[1] "1 + 2*x + 3*x^2"

Note that "a|b|c" could be derived from nls using paste(rownames(nls), collapse = "|") in order to avoid redundant specification.

2) Although gsubfn simplifies this significantly, to do it without gsubfn use substitute :

> L <- as.list(setNames(nls[[1]], rownames(nls)))  # L <- list(a = 1L, b = 2L, c = 3L)
> e <- parse(text = "a + b * x + c * x ^ 2")[[1]]  # e is the text as a "call" object
> s <- do.call(substitute, list(e, L))             # perform the substitution
> format(s)                                        # convert to character
[1] "1L + 2L * x + 3L * x^2"

The Ls are due to the fact that nls as defined in the question contains integers. Convert them to numeric before running the above if you don't like that:

nls[[1]] <- as.numeric(nls[[1]])

3) Another possibility is to loop over the strings to be substituted.

> s <- "a + b*x + c*x^2"
> for(nm in rownames(nls)) s <- gsub(nm, nls[nm, ], s)
> s
[1] "1 + 2*x + 3*x^2"

If we knew there was no more than one occurrence of each to be replaced we could use sub in place of gsub here.

UPDATE: Corrected second solution.

UPDATE 2: Added third solution.

OTHER TIPS

Here's another way to do it

gsub(paste0(row.names(nls), "(.*)", collapse=""),  paste0(t(nls),  paste0("\\", 1:nrow(nls)), collapse=""), "a + b*x + c*x^2"  )
[1] "1 + 2*x + 3*x^2"
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top