How many operations of flipping all brackets on a substring of a string of brackets are needed to make the string 'correct'?

cs.stackexchange https://cs.stackexchange.com/questions/119847

  •  28-09-2020
  •  | 
  •  

Question

I call a string of brackets 'correct' if every opening bracket has a corresponding closing bracket somewhere after it and every closing bracket has a corresponding opening bracket somewhere before it. For example

(())()((()()))

is a correct string and

)()()(

is an incorrect string.

An operation consists of taking a contiguous subsequence of the input string and 'flipping' every bracket in that subsequence.

Given an arbitrary string of brackets (of even length), the task is to find the smallest number of such operations needed to change it to a 'correct' string of brackets.

Another way to look at this problem is to have a string of +1s and -1s, and to try to find the smallest number of operations (of multiplying every number on an interval by -1) needed to make every prefix of this string nonnegative and every suffix nonpositive.

Was it helpful?

Solution

The DP suggested in the comments by Yuval Filmus indeed works: we can solve the problem in $\mathcal{O}(n^{2})$ by DP.

First note that we may assume that intervals are not allowed to overlap. Take any two overlapping intervals $[a, b)$ and $[c, d)$. WLOG $a \leq c$. Then we can replace the two intervals with the intervals $[a, c)$ and $[\min(b, d), \max(b, d))$ that do not overlap, have shorter total length and cover exactly the same elements.

Replace opening brackets with $1$, closing brackets with $-1$. Now the problem corresponds to applying the minimum number of operations "multiply interval by $-1$" such that every prefix has nonnegative sum, and the sum over the entire array is zero.

We construct a solution left-to-right. At every position we can either start an interval, or end an active interval.

Let $DP[m][s][0]$ denote the minimum number of intervals we need to use such that every prefix sum in the first $m$ elements is nonnegative, the sum of the first $m$ elements is $s$, and we do not have an active interval. Similarly $DP[m][s][1]$ is the same value where we do have an active interval.

If the $m$th value is $1$, for $0 \leq s \leq m$ we set

  • $DP[m][s][0] = DP[m-1][s-1][0]$
  • $DP[m][s][1] = DP[m-1][s+1][1]$

Otherwise we set

  • $DP[m][s][0] = DP[m-1][s+1][0]$
  • $DP[m][s][1] = DP[m-1][s-1][1]$.

Then we end and start intervals, setting

  • $DP[m][s][0] = \min(DP[m][s][0], DP[m][s][1])$
  • $DP[m][s][1] = \min(DP[m][s][1], DP[m][s][0] + 1)$

The base case is $DP[0][0][0] = 0$, $DP[0][0][1] = 1$.

There are $\mathcal{O}(n^{2})$ states, and we do $\mathcal{O}(1)$ work to calculate each of them, therefore the algorithm runs in $\mathcal{O}(n^{2})$. The DP can be implemented to use $\mathcal{O}(n)$ memory by only storing the values of $DP$ for the current $m$.

Here is a C++ implementation of the algorithm.

#include <iostream>
#include <vector>
using namespace std;

int main() {
    string str;
    cin >> str;
    int n = str.size();

    vector<int> dp0 = {0}, dp1 = {1};
    for (int m = 1; m <= n; ++m) {
        vector<int> nxt0(m + 1, n + 1), nxt1(m + 1, n + 1);
        for (int i = 0; i < m; ++i) {
            if (str[m-1] == '(') {
                nxt0[i+1] = dp0[i];
                if (i > 0) nxt1[i-1] = dp1[i];
            } else {
                nxt1[i+1] = dp1[i]);
                if (i > 0) nxt0[i-1] = dp0[i];
            }
        }
        dp0 = nxt0;
        dp1 = nxt1;
        for (int i = 0; i <= m; ++i) dp1[i] = min(dp1[i], dp0[i] + 1); // Start interval
        for (int i = 0; i <= m; ++i) dp0[i] = min(dp0[i], dp1[i]); // Stop interval
    }
    cout << dp0[0] << '\n';
}

OTHER TIPS

Nice question!

Here is the problem in more popular terms.

Balanced Parentheses

A string of parentheses is balanced if we can change it to the empty string by removing substrings "()" repeatedly. For example, empty string, "()" and "(())()((()()))" are balanced but none of "())" and ")()()(" is balanced.

It is clear that a balanced string must start with "(" and ending with ")". It must be of even length as well.

the Problem of Minimum Operations to Balance a String

Let s be a non-empty string of parentheses of length n. The only operation we can perform is flipping every parenthesis in a substring, that is, changing every "(" to ")" and every ")" to "(" for some contiguous characters in the string. The problem is how to find the smallest number of operations that make s balanced.

an Approach by Dynamic Programming

What are the appropriate subproblems? They are dp[i][j] for 0 <= i < j < n, where dp[i][j] is the minimum number of operations that balance the substring between indice i and index j inclusive. The wanted minimum number of operations that balance s is dp[0][n-1].

Since any string of odd length cannot be balanced, we will restrict our attention to strings of even length from now on.

the base cases dp[i][i+1] is 0 if the substring between i and i+1 is balanced, i.e, "()". It is 1 otherwise.

the recurrence relation I will use Java to define the recurrence relation.

void dp(int i, int j) {
    assert 0 <=i && i + 2 < j && j <= L;

    int min = dp[i + 1][j - 1];
    if (s.charAt(i) == ')' && s.charAt(i + 1) == '(') min++;
    if (s.charAt(j) == '(' && s.charAt(j - 1) == ')') min++;

    for (int k = i + 1; k < j - 1; k+=2) {
        if (a[k] == 1 && a[k + 1] == -1) {
            if (min > dp[i][k] + dp[k + 1][j] - 1) {
                min = dp[i][k] + dp[k + 1][j] - 1;
            }
        } else {
            if (min > dp[i][k] + dp[k + 1][j]) {
                min = dp[i][k] + dp[k + 1][j];
            }
        }
    }

    dp[i][j] = min;
}

The recurrence relation comes from the following well-known property of balanced strings. A non-empty string is balanced if and only if it is the concatenation of two balanced strings or "(" followed by a balanced string followed by ")".

The processing before the for loop, which sets min to be dp[i + 1][j - 1] plus 0, 1 or 2, comes from the fact that "(" followed by a balanced string followed by ")" must be a balanced string.

The processing in the for loop, which tries to shrink min to dp[i][k] + dp[k + 1][j] minus 0 or 1 for some k that is between i and j, comes from the fact that the concatenation of two balanced strings must be a balanced string.

The time-complexity of this approach is $O(n^3)$. The space-complexity is $O(n^2)$.

Length of the "most unbalanced" strings

The smallest length of a string that cannot be balanced by less than 2 operations is 4. For example, ")((("

The smallest length of a string that cannot be balanced by less than 3 operations is 10. For example, ")(((((())("

The smallest length of a string that cannot be balanced by less than 4 operations is 22. For example, ")(())))(((((((((((())(".

Question: What is the smallest length of a string that needs at least 5 operations? As I have computed, it must be greater than 28. However, my computation is not enough to determine the actual value. How fast can we compute these smallest lengths?

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top