You have already learned two ways to go about your problem. The complicated one is to split each of the strings into words, sort them and then weed out duplicates, which is easy in a sorted array. The easier one is to split the first string into words and search for each word in the second, then do the same the other way round: split the second and check for its words in the first.
Both approaches require that you split the strings. That's also where you seem to have problems in your code. (You've got the basic idea to look at word boundaries, but you don't seem to know how to store the words.)
The basic question is: How are you going to represent the words, i.e. the substrings of a C string? There are various ways. You could use pointers into the string together with a string length or you could copy them into another buffer.
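To illustrate the first option, here is a minimal sketch of the pointer-plus-length representation. The names word_ref and next_word are made up for this example; nothing like them exists in the standard library.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical type: a word is a view into the original string, not a copy. */
struct word_ref {
    const char *start;  /* first character of the word inside the source string */
    size_t len;         /* number of characters; the word is NOT zero-terminated */
};

/*
 * Find the next word at or after *pos. Returns 1 and fills *out if a word
 * was found, 0 at the end of the string. *pos is advanced past the word.
 */
int next_word(const char **pos, struct word_ref *out)
{
    const char *p = *pos;

    /* skip everything that is not a letter */
    while (*p != '\0' && !isalpha((unsigned char)*p)) p++;
    if (*p == '\0') {
        *pos = p;
        return 0;
    }

    out->start = p;
    while (isalpha((unsigned char)*p)) p++;     /* consume the word */
    out->len = (size_t)(p - out->start);
    *pos = p;
    return 1;
}
```

Such a word has no zero terminator of its own, so you can't pass it to functions like strstr directly. You could print it with printf("%.*s", (int)w.len, w.start) or compare it with strncmp. Copying into a separate buffer costs a bit more, but gives you an ordinary C string.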
Here is a solution that splits the string a into words and then checks whether each word can be found in b:
    #include <ctype.h>
    #include <string.h>

    /*
     * Return 1 if all words in a can be found in b,
     * return 0 otherwise.
     */
    int split_and_check(const char *a, const char *b)
    {
        char word[80];      /* temporary buffer for current word */
        size_t len = 0;     /* current length of word; 0 means "not in a word" */
        size_t i = 0;

        while (1) {
            /* cast to unsigned char: isalpha is undefined for negative values */
            if (isalpha((unsigned char)a[i])) {
                /* leave room for the terminator; longer words are truncated */
                if (len < sizeof word - 1) word[len++] = a[i];
            } else {
                if (len > 0) {
                    word[len] = '\0';   /* manually null-terminate word */
                    if (strstr(b, word) == NULL) {
                        /* fail on string mismatch */
                        return 0;
                    }
                    len = 0;            /* reset word-length counter */
                }
            }
            if (a[i] == '\0') break;    /* check end here to catch last word */
            i++;
        }
        return 1;
    }
The current word is stored in the local char buffer word and has the length len. Note how the zero end marker '\0' is added to word manually before searching b for word: The library function strstr looks for a string in another one. Both strings must be zero-terminated.
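One thing to keep in mind is that strstr finds substrings, so a word like "cat" is also found inside "concatenate". If that matters for your problem, you could check the characters on both sides of each hit. The following is only a sketch of that idea; contains_word is a hypothetical helper, not a library function.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if word occurs in text as a whole word,
 * i.e. with no letter directly before or after it, and 0 otherwise.
 */
int contains_word(const char *text, const char *word)
{
    size_t len = strlen(word);
    const char *hit = text;

    if (len == 0) return 0;

    while ((hit = strstr(hit, word)) != NULL) {
        /* a hit at the very start of text has no character before it */
        int left_ok = (hit == text) || !isalpha((unsigned char)hit[-1]);
        /* hit[len] is at worst the '\0' terminator, which is not a letter */
        int right_ok = !isalpha((unsigned char)hit[len]);

        if (left_ok && right_ok) return 1;
        hit++;  /* partial match only; keep searching after it */
    }
    return 0;
}
```

With this helper, "cat" is found in "a cat." but not in "concatenate".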
This is only one half of the solution. You must check the strings the other way round:
    int same_words(const char *a, const char *b)
    {
        if (split_and_check(a, b) == 0) return 0;
        if (split_and_check(b, a) == 0) return 0;
        return 1;
    }
This is not yet the exact solution to your problem, because the string matching is case-sensitive. I've skipped this part because it was easier that way: strstr is case-sensitive, and standard C has no variant that ignores case. (Some systems provide strcasestr, but it is a non-standard extension.)
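If you need the case-insensitive matching, one option is to write a small search function yourself. Here is a minimal sketch that compares characters through tolower; find_nocase is a made-up name, and the simple nested loop is just the most straightforward way to do it, not the fastest.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: like strstr, but ignores case. Returns a pointer
 * to the first match in haystack, or NULL if needle does not occur.
 */
const char *find_nocase(const char *haystack, const char *needle)
{
    size_t i;

    if (*needle == '\0') return haystack;   /* same convention as strstr */

    for (; *haystack != '\0'; haystack++) {
        /* compare needle against the text starting at haystack */
        for (i = 0; needle[i] != '\0'; i++) {
            /* a '\0' in haystack never equals a needle character,
               so this loop stops before reading past the end */
            if (tolower((unsigned char)haystack[i]) !=
                tolower((unsigned char)needle[i])) break;
        }
        if (needle[i] == '\0') return haystack;     /* whole needle matched */
    }
    return NULL;
}
```

You could then call find_nocase(b, word) in place of strstr(b, word) when checking each word.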