Condition of Equality: If the long string contains a sub-set that is equivalent to the set of words as the short string with each equivalent string have similarity > 75%.
I will be working in Java:
String str1 = "john rambo the vietnam veteran";
String str2 = "jovn rabbo the vittnam us vetteran";
The first step is to transform the text into word-bag:
ArrayList<String> a = new ArrayList<String> (Arrays.asList(str1.split(" ")));
ArrayList<String> b = new ArrayList<String> (Arrays.asList(str2.split(" ")));
Next is to start from the list that has the least number of words, and cross-check them in the other bag:
boolean are_equal = true;
boolean word_found = false;
if (a.size() < b.size())
{
for (String a_word : a)
{
word_found = false;
for (int i=0; i<b.size(); i++)
{
String b_word = b.get(i);
if (is_similar_enough (a_word, b_word))
{
word_found = true;
b.remove(i);
break;
}
}
if (!word_found)
{
are_equal = false;
break;
}
}
}
else
{
// ..
}
Condition of Word Similarity: Let's say that two words are similar if they both have the same number of characters, and the number of equivalent (equivalent means symmetrically similar) characters make of 75% or more.
but before that we have to define is_similar_enough function:
public boolean is_similar_enough (String a, String b)
{
int equivlantChars = 0;
if (a.length() != b.length())
return false;
for (int i=0; i<a.length(); i++)
if (a.toCharArray[i] == b.toCharArray[i])
equivlantChars ++;
return ((((double)equivlantChars) / ((double)a.length())) >= 0.75);
}