Question

In the Programming Challenge in the Common Child , i have taken a different approach than the general Longest Common Substring problem. The Code is

#include <cmath>
#include <cstdio>
#include <vector>
#include<string>
#include <iostream>
#include <algorithm>
using namespace std;

void max(string a,string b,int n)
{
    int count=0,x=-1,prev=0,i,j,k;
    for(i=0;i<n;i++)
    {
        x=-1;
        for(j=i;j<n;j++)
        {
            for(k=x+1;k<n;k++)
            {
                if(a[j]==b[k])
                {
                    count++;
                    x=k;
                    break;
                }
            }
        }
        if(prev<count)
        {
            prev=count;
        }
        count=0;        
    }
    cout<<prev;
}

int main() {
    string a,b;
    int n;
    cin>>a;
    cin>>b;
    n=a.length();
    max(a,b,n);
    return 0;

Taking Smaller TestCases i am being able to get the solution but i am not able to get the solution for longer ones.

What i did is

Input: ABCDEF - Let it be a FBDAMN - Let it be b, as it is inserted in the code. Output: 2. ie. for BD being the substring.

So what i am doing here is checking the matchable substring for each substring of a, and print the largest. ie. substring for ABCDEF, BCDEF,CDEF,DEF,EF,F which signifies the outermost loop here.(loop i)

Now the Second loop matches for each letter in string ie. it iterates for (1...n),(2...n),(3...n),...,(n), meaning it starts from the letter i specifies.

Now the third loop iterates through the whole of string B to see if a[j] matches with any letter in B, if yes then count increments.

Since We can only remove letters from the substring and not jumble it, i.e. each letter should have the same relative order in both the strings, I am searching B after the previous letter was found, using x.

Dry run would be like for the example(arrays from 0 to n-1)

a= abcdef b= fbdamn

when i=0 - the whole string is getting matched abcdef

x=-1 so k iterates from 0 to n-1 So, a=f? a=b? a=d? a=a? count = count+1; so x is set as 3(position of a). elements after a in A can only occur after a in B so x=3. therefore k iterates from 4 to 5 b=m? b=n? c=m?c=n? .... ... ...

now we save the value of previous count if it is greater than counts before. so maxcount=count and then reset count to 0, and reset position x=-1.

i=1 i.e string = BCDEF k from 0 to n So, B=F? b=b? count=count+1 // becomes 1 x is set as 1(position of B) k from 2 to n c=d? c=a?c=m?c=n? k from 2 to n d=d? count = count+1 // becomes 2 x is set as 2 k from 3 to n e=a?e=m?e=n? k from 3 to n f=a?f=m?f=n? then if max count(prev in code) greater than current count, then maxcount = count, reset count = 0, x=-1; then it goes like this for CDEF,DEF,EF,F the max substring here becomes thus 2

Works for the shorter testcases just not for the longer ones.

The test cases it works for are:

OUDFRMYMAW AWHYFCCMQX

With Output 2

Doesnt Work For

WEWOUCUIDGCGTRMEZEPXZFEJWISRSBBSYXAYDFEJJDLEBVHHKS FDAGCXGKCTKWNECHMRXZWMLRYUCOCZHJRRJBOAJOQJZZVUYXIC

With correct output being 15 and my output being 10. Can Anyone explain to me where am i making a mistake in the

Was it helpful?

Solution 2

The problem is that your algorithm uses a naive greedy approach: it makes a match as soon as it sees it, and never reconsiders its decision after that, until it gets to the next substring. It can be demonstrated with a counterexample that this greedy strategy does not work for the LCS: consider strings

A = "abcd"
B = "acdb"

Your algorithm returns 2 because it funds "ab", while the longest subsequence is "acd", with the length of 3.

These two strings are carefully constructed to trick your algorithm into producing an incorrect result. Your algorithm would match the 'a's at the initial position, advance to the second character 'b' of string A, and match it at the last position of string B. At this point, your algorithm is doomed: it wouldn't find 'c' and 'd', because all characters of B are exhausted. What it should have done is to as if it can do better by backtracking on the decision to match 'b'. The answer to this one is a resounding "yes": if we skip the second character of A, we would match both 'c' and 'd', for the correct result of 3.

A proper implementation of LCS (with DP or with memoization) does this backtracking without growing the time exponentially. It is one of the simplest DP algorithms to study and to implement, so you should have no trouble applying it to solve the problem at hand.

OTHER TIPS

One problem that I foresee is as follows:

If a = ABCDEFG and b = ACDEFGB,

Now it will first match A, then it will match B, but then k will become 6 already. Hence, further letters CDEFG will not be matched.

Hence the possible string ACDEFG shall be skipped.

Your code shall return the maximum common child as CDEFG.

EDIT: This is not a longest common substring problem but a longest common subsequence problem. http://www.geeksforgeeks.org/dynamic-programming-set-4-longest-common-subsequence/

import java.util.Scanner;
  public class JavaApplication8 {
      public static int find(String s1,String s2){
        int n = s1.length();
        int m = s2.length();
        int ans = 0;
        int[] a = new int[m];
        int b[] = new int[m];
        for(int i = 0;i<n;i++){
            for(int j = 0;j<m;j++){
                if(s1.charAt(i)==s2.charAt(j)){
                   if(i==0 || j==0 )a[j] = 1;
                   else{
                       a[j] = b[j-1] + 1;
                   }
                   ans = Math.max(ans, a[j]);
                }

            }
            int[] c = a;
            a = b;
            b = c;
        }
        return ans;
    }
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        String s1 = sc.next();
        String s2 = sc.next();
        System.out.println(find(s1,s2));
    }

}

Time Complexity O(N). Space Complexity O(N).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top