Question

I am building a shiny app where users will enter text in a textInput box, something like this:

str <- "1.1* max(a,b) * e* exp(1) * (sqrt(abs(x_1)))^(x2y) * min(maxy) * (b2^2)/sqrt(pi) /alpha + betax - gamma"

The text is a mathematical expression involving

  • real numbers: 1.1, 2
  • known mathematical constants: e, pi, gamma
  • known mathematical functions: max(), exp(), sqrt(), abs(), min()
  • variable names: a, b, x_1, x2y, maxy, b2, alpha, betax
  • arithmetic operators: +, -, *, /, ^
  • separators and similar: ( , ) .

In addition to pre-defined constants such as gamma, I will also construct a list of "forbidden" names, such as 'digits' or 'base', because they may appear as arguments to some of the mathematical functions.

My purpose is to extract from the expression all the valid variable names, and store them in a list:

(a, b, x_1, x2y, maxy, b2, alpha, betax)

or perhaps

("a", "b", "x_1", "x2y", "maxy", "b2", "alpha", "betax")

I have been able to handle two simpler cases: one where only single letters are admissible as variable names, and one where variable names contain only letters and are always separated by spaces. I have not been able to handle the general situation described above.

For what it's worth, here's my best shot:

# multi-gsub-to-1
mgsub21 <- function(pattern, replacement, x, ...) {
  for (i in 1:length(pattern)) {
    x <- gsub(pattern[i], replacement, x, ...)
  }
  x
}

ops <- list("\\+","\\-","\\*","\\/","\\^",",","\\(","\\)","\\.")
math <- list("log","logb","log10","log2","exp","sqrt"
  ,"cos","sin","tan","acos","asin","atan","atan2","cosh","sinh","tanh","acosh","asinh","atanh"
  ,"max","min","round","floor","ceiling","trunc","sign","abs","mean","median","mode","base","digits") 
cst <- list("pi","Pi","e","gamma")
pattern <- unlist(list(ops,cst,math))
replacement <- " "
str2 <- mgsub21(pattern,replacement,str)
str2
#  [1] "1 1    a b        xp 1         x_1     x2y       y     b2 2        alpha   b tax    "
str2 <- gsub("[0-9]+", " ", str2)
str2 <- gsub(" {2,}", " ", str2)
str2
#  [1] " a b xp x_ x y y b alpha b tax "

Problems with the above include:

  • betax becomes b tax after I remove e
  • exp(1) becomes xp 1 after I remove e
  • maxy becomes y after I remove max
  • does not deal with variables containing numbers

I get:

a b xp x_ x y alpha tax

Expected answer is:

a b x_1 x2y maxy b2 alpha betax

Suggestions welcome, thanks!


Solution

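Let R's own parser do the work instead of regular expressions. parse() turns the text into an unevaluated expression, and all.vars() returns the names that are used as variables rather than as functions. Start with: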

instring <- "1.1* max(a,b) * e* exp(1) * (sqrt(abs(x_1)))^(x2y) * min(maxy) * (b2^2)/sqrt(pi) /alpha + betax - gamma"
(vars <- all.vars(parse(text=instring)))
##  [1] "a"     "b"     "e"     "x_1"   "x2y"   "maxy"  "b2"    "pi"    "alpha"
## [10] "betax" "gamma"
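
This likely sidesteps the substring problems from the question (maxy turning into y, betax into b tax) because parse() tokenizes the text with R's grammar, so names are matched whole and function names are kept apart from variables. A small illustration, using a shortened expression made up for this example:

```r
# Shortened expression, made up for illustration (not the original input)
ex <- parse(text = "max(a, b) * maxy + exp(betax)")

# all.names() also reports operators and function names such as "max" and "exp"
fn_and_vars <- all.names(ex, unique = TRUE)

# all.vars() keeps only the symbols that are not used as function names
vars_only <- all.vars(ex)

print(fn_and_vars)
print(vars_only)
## [1] "a"     "b"     "maxy"  "betax"
```

Note that parse() only builds the expression and does not evaluate it.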

You didn't want "e", "pi", or "gamma", so:

constants <- c("e","pi","gamma")
setdiff(vars,constants)
## [1] "a"     "b"     "x_1"   "x2y"   "maxy"  "b2"    "alpha" "betax"
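
For the shiny use case, the same two steps could be wrapped in a helper that also drops the "forbidden" names and tolerates malformed input. This is only a sketch: the function name extract_vars, its argument names, and the choice to return NULL on a syntax error are assumptions made for this example, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): wraps all.vars()
# and removes constants plus the "forbidden" names from the result.
extract_vars <- function(txt,
                         constants = c("e", "pi", "gamma"),
                         forbidden = c("digits", "base")) {
  # parse() signals an error on malformed input; return NULL in that case
  expr <- tryCatch(parse(text = txt), error = function(err) NULL)
  if (is.null(expr)) return(NULL)
  setdiff(all.vars(expr), c(constants, forbidden))
}

res <- extract_vars("round(x1 / y_2, digits = 2) + log(z, base = 10) * pi")
print(res)
## [1] "x1"  "y_2" "z"

# Unbalanced parenthesis: parse() fails, so the helper returns NULL
bad <- extract_vars("max(a, b")
```

Argument names such as digits = or base = are tags on the call rather than symbols, so all.vars() does not report them. The forbidden list therefore only matters when a user types one of those names as a bare variable.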
Licensed under: CC-BY-SA with attribution