Minimum window in String 1 containing all characters from String 2 and no character from String 3

StackOverflow https://stackoverflow.com/questions/23228960

Domanda

Ok, this is an interview question. And no it's not a duplicate of this question.

Given 3 strings - str1, str2, str3:

str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

We've to find the minimum window in str1, which contains all characters from str2 in any order, but no character from str3. In this case the answer would be: "strup".

I've come up with this code:

static String minimumWindow(String str1, String str2, String str3) {

        class Window implements Comparable<Window> {
            int start;
            int end;

            public Window(int start, int end) {
                this.start = start;
                this.end = end;
            }

            public int getEnd() {
                return end;
            }

            public int getStart() {
                return start;
            }

            public int compareTo(Window o) {
                int thisDiff = end - start;
                int thatDiff = o.end - o.start;

                return Integer.compare(thisDiff, thatDiff);
            }

            @Override
            public String toString() {
                return "[" + start + " : " + end + "]";
            }
        }

        // Create Sets of characters for "contains()" check

        Set<Character> str2Chars = new HashSet<>();
        for (char ch: str2.toCharArray()) {
            str2Chars.add(ch);
        }

        Set<Character> str3Chars = new HashSet<>();
        for (char ch: str3.toCharArray()) {
            str3Chars.add(ch);
        }

        // This will store all valid window which doesn't contain characters
        // from str3.
        Set<Window> set = new TreeSet<>();

        int begin = -1;

        // This loops gets each pair of index, such that substring from 
        // [start, end) in each window doesn't contain any characters from str3
        for (int i = 0; i < str1.length(); i++) {
            if (str3Chars.contains(str1.charAt(i))) {
                 set.add(new Window(begin, i));
                 begin = i + 1;
            }
        }

        int minLength = Integer.MAX_VALUE;
        String minString = "";

        // Iterate over the windows to find minimum length string containing all
        // characters from str2
        for (Window window: set) {
            if ((window.getEnd() - 1 - window.getStart()) < str2.length()) {
                continue;
            }

            for (int i = window.getStart(); i < window.getEnd(); i++) {
                if (str2Chars.contains(str1.charAt(i))) {
                      // Got first character in this window that is in str2
                      // Start iterating from end to get last character
                      // [start, end) substring will be the minimum length
                      // string in this window
                     for (int j = window.getEnd() - 1; j > i; j--) {
                        if (str2Chars.contains(str1.charAt(j))) {
                            String s = str1.substring(i, j + 1);

                            Set<Character> sChars = new HashSet<>();
                            for (char ch: s.toCharArray()) {
                                sChars.add(ch);
                            }

                            // If this substring contains all characters from str2, 
                            // then only it is valid window.
                            if (sChars.containsAll(str2Chars)) {
                                int len = sChars.size();
                                if (len < minLength) {
                                    minLength = len;
                                    minString = s;
                                }
                            }
                        }
                    }
                }
            }
        }

    // There are cases when some trailing and leading characters are
    // repeated somewhere in the middle. We don't need to include them in the
    // minLength. 
    // In the given example, the actual string would come as - "rstrup", but we
    // remove the first "r" safely.
        StringBuilder strBuilder = new StringBuilder(minString);

        while (strBuilder.length() > 1 && strBuilder.substring(1).contains("" + strBuilder.charAt(0))) {
            strBuilder.deleteCharAt(0);
        }

        while (strBuilder.length() > 1 && strBuilder.substring(0, strBuilder.length() - 1).contains("" + strBuilder.charAt(strBuilder.length() - 1))) {
            strBuilder.deleteCharAt(strBuilder.length() - 1);
        }

        return strBuilder.toString();
    }

But it doesn't work for all the test cases. It does work for the example given in this question. But when I submitted the code, it failed for 2 test cases. No I don't know the test cases for which it failed.

Even after trying various sample inputs, I couldn't find a test case for which it fails. Can someone take a look as to what is wrong with the code? I would really appreciate if someone can give a better algorithm (Just in pseudo-code). I know this is really not the optimized solution though.

È stato utile?

Soluzione

str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

We're looking for the minimum sub-string from str1 that contain all str2 characters (assume ordered) and no characters from str3 ..

i = 1 .. str1.length
cursor = 1 .. str2.length

The solution must be on the form:

str2.first X X .. X X str2.last

So to check for that sub-string we use a cursor over str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    cursor = 1
else
    if str1[i] == str2[cursor]
        cursor++

Goal check is:

if cursor > str2.length
    return solution
else
    if i >= str1.length
        return not-found

And for optimization, you can skip to the next look-ahead which is:

look-ahead = { str2[cursor] or { X | X in str3 }}

In case str2 is not ordered:

i = 1 .. str1.length
lookup = { X | X in str2 }

The solution must be on the form:

str2[x] X X .. X X str2[x]

So to check for that sub-string we use a check-list str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    lookup = { X | X in str2 }
else
    if lookup.contain(str1[i])
        lookup.remove(str1[i])

Goal check is:

if lookup is empty
    return solution
else
    if i >= str1.length
        return not-found

And for optimization, you can skip to the next look-ahead which is:

look-ahead = {{ X | X in lookup } or { X | X in str3 }}

Code

class Solution
{
    private static ArrayList<Character> getCharList (String str)
    {
        return Arrays.asList(str.getCharArray());
    }

    private static void findFirst (String a, String b, String c)
    {
        int cursor = 0;
        int start = -1;
        int end = -1;

        ArrayList<Character> stream = getCharList(a);
        ArrayList<Character> lookup = getCharList(b);
        ArrayList<Character> avoid = getCharList(c);

        for(Character ch : stream)
        {
            if (avoid.contains(ch))
            {
                lookup = getCharList(b);
                start = -1;
                end = -1;
            }
            else
            {
                if (lookup.contains(ch))
                {
                    lookup.remove(ch)

                    if (start == -1) start = cursor;

                    end = cursor;
                }
            }

            if (lookup.isEmpty())
                break;

            cursor++;
        }

        if (lookup.isEmpty())
        {
            System.out.println(" found at ("+start+":"+end+") ");
        }
        else
        {
            System.out.println(" not found ");
        }
    }
}

Altri suggerimenti

Here is working Java code tested on various test cases.

The algorithm basically uses a sliding window to examine different windows within which an answer may lie. Each character in the string str2 is analyzed at most twice. Thus the running time of the algorithm is linear, ie O(N) in the lengths of the three strings. This is infact the most optimal solution for this problem.

String str1 = "spqrstrupvqw";
String str2 = "sprt";
String str3 = "q";
char[] arr = str1.toCharArray();
HashSet<Character> take = new HashSet<Character>();
HashSet<Character> notTake = new HashSet<Character>();
HashMap<Character, Integer> map = new HashMap<Character, Integer>();

void run()throws java.lang.Exception{
    System.out.println(str1 + " " + str2 + " " + str3);

    //Add chars of str2 to a set to check if a char has to be taken in O(1)time.
    for(int i=0; i<str2.length(); i++){
        take.add(str2.charAt(i));
    }

    //Add chars of str3 to a set to check if a char shouldn't be taken in O(1) time.
    for(int i=0; i<str3.length(); i++){
        notTake.add(str3.charAt(i));
    }

    int last = -1;
    int bestStart = -1;
    int bestLength = arr.length+1;

    // The window will be from [last....next]

    for(int next=last+1; next<arr.length; next++){
        if(notTake.contains(arr[next])){ 
            last = initLast(next+1); //reinitialize the window's start. 
            next = last;
        }else if(take.contains(arr[next])){
            // take this character in the window and update count in map.
            if(last == -1){
                last = next;
                map.put(arr[last], 1);
            }else{
                if(!map.containsKey(arr[next])) map.put(arr[next], 1);
                else          map.put(arr[next], map.get(arr[next])+1);
            }
        }

        if(last >= arr.length){ // If window is invalid
            break;
        }

       if(last==-1){
            continue;
        }

        //shorten window by removing chars from start that are already present.
        while(last <= next){
            char begin = arr[last];

            // character is not needed in the window, ie not in set "take"
            if(!map.containsKey(begin)){
                last++;
                continue;
            }

            // if this character already occurs in a later part of the window
            if(map.get(begin) > 1){
                last++;
                map.put(begin, map.get(begin)-1);
            }else{
                break;
            }
        }

        // if all chars of str2 are in window and no char of str3 in window, 
// then update bestAnswer
        if(map.size() == str2.length()){
            int curLength = next - last + 1;
            if(curLength < bestLength){
                bestLength = curLength;
                bestStart  = last;
            }
        }
    }

    if(bestStart==-1){
        System.out.println("there is no such window");
    }else{
        System.out.println("the window is from " + bestStart + " to " + (bestStart + bestLength-1));
        System.out.println("window " + str1.substring(bestStart, bestStart+bestLength));
    }

}

// Returns the first position in arr starting from index 'fromIndex'
// such that the character at that position is in str2.
int initLast(int fromIndex){

    // clear previous mappings as we are starting a new window
    map.clear();
    for(int last=fromIndex; last<arr.length; last++){
        if(take.contains(arr[last])){
            map.put(arr[last], 1);
            return last;
        }
    }
    return arr.length;
}

Moreover, your code fails on many trivial test cases. One of them is when str1 = "abc", str2 = "ab", str3 = "c".

PS. If you are having a hard time understanding this code, first try reading this easier post which is very similar to the problem that has been asked.

What about using a regular expression?

String regex = ".*((?=[^q]*s)(?=[^q]*p)(?=[^q]*r)(?=[^q]*t)[sprt][^q]+([sprt])(?<!ss|pp|rr|tt))";

Matcher m = Pattern.compile(regex).matcher("spqrstrupvqw");

while (m.find()) {
    System.out.println(m.group(1));
}

This prints out:

strup

This can also be wrapped in a method which generates dynamically the regular expression for variable inputs:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchString {
    public static void main(String[] args) {
        System.out.println(getMinimalSubstrings("spqrstrupvqw", "sprt", "q"));
        System.out.println(getMinimalSubstrings("A question should go inside quotations.", "qtu", "op"));
        System.out.println(getMinimalSubstrings("agfbciuybfac", "abc", "xy"));
    }

    private static List<String> getMinimalSubstrings(String input, String mandatoryChars, String exceptChars) {
        List<String> list = new ArrayList<String>();
        String regex = buildRegEx(mandatoryChars, exceptChars);

        Matcher m = Pattern.compile(regex).matcher(input);

        while (m.find()) {
            list.add(m.group(1));
        }

        return list;
    }

    private static String buildRegEx(String mandatoryChars, String exceptChars) {
        char[] mandChars = mandatoryChars.toCharArray();
        StringBuilder regex = new StringBuilder("[^").append(exceptChars).append("]*(");

        for (char c : mandChars) {
            regex.append("(?=[^").append(exceptChars).append("]*").append(c).append(")");
        }

        regex.append("[").append(mandatoryChars).append("][^").append(exceptChars).append("]+([").append(mandatoryChars).append("])(?<!");

        for (int i = 0; i < mandChars.length; i++) {
            if (i > 0) {
                regex.append("|");
            }

            regex.append(mandChars[i]).append(mandChars[i]);
        }

        regex.append("))");

        return regex.toString();
    }
}

This prints out:

[strup]
[quest]
[agfbc, bfac]
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top