Question

What is the time complexity of a regular expression like this (to be operated on any string): "(.{5})\1\1"

The implementation I am having is:

reps <- function(s, n) paste(rep(s, n), collapse = "") # repeat s n times

find.string <- function(string, th = 3, len = floor(nchar(string)/th)) {
    for(k in len:1) {
        pat <- paste0("(.{", k, "})", reps("\\1", th-1))
        r <- regexpr(pat, string, perl = TRUE)
        if (attr(r, "capture.length") > 0) break
    }
    if (r > 0) substring(string, r, r + attr(r, "capture.length")-1) else ""
}

Please do help out. Thanks! :)

Was it helpful?

Solution

It depends on the implementation. This is not a regular expression in the strict definition of the word because of the backreferences, but it looks like it is worst case O(15 * length(string))

Explanation: The regex engine will try to match starting from position 0,1,2,3,4..last position in the string. Since there is no constraint (dot character) it will match any first 5 characters and then will try to match them again twice, worst case doing 15 queries and then failing. Then it will move to the 2nd position in the string and try to do this all over again. So, in the worst case it will do this len(string) times.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top