Function that finds whether two strings are made from the same words

Question 1

You have already learned two ways to go about your problem. The complicated one is to split each of the strings into words, sort them and then weed out duplicates, which is easy in a sorted array. The easier one is to split the first string into words and search for each word in the second. Then do the same the other way round: split the second string and check for its words in the first.

Both approaches require that you split the strings. That's also where you seem to have problems in your code. (You've got the basic idea to look at word boundaries, but you don't seem to know how to store the words.)

The basic question is: How are you going to represent the words, i.e. the substrings of a C string? There are various ways. You could use pointers into the string together with a string length or you could copy them into another buffer.
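The pointer-plus-length representation could look like the sketch below. The names struct word_view and next_word are made up for this illustration, and a "word" is assumed to be a run of letters as classified by isalpha:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A word represented as a view into the original string: nothing is copied. */
struct word_view {
    const char *start;  /* first character of the word */
    size_t len;         /* number of characters; NOT zero-terminated */
};

/*
 * Find the next run of letters at or after *pos.
 * Returns 1 and fills *out if a word was found, 0 at the end of the string.
 * On return, *pos points just past the word (or at the terminator).
 */
int next_word(const char **pos, struct word_view *out)
{
    const char *p;

    assert(pos != NULL && *pos != NULL && out != NULL);
    p = *pos;

    /* skip everything that is not a letter */
    while (*p != '\0' && !isalpha((unsigned char)*p)) p++;
    if (*p == '\0') {
        *pos = p;
        return 0;
    }

    /* p is at the first letter; walk to the end of the word */
    out->start = p;
    while (isalpha((unsigned char)*p)) p++;
    out->len = (size_t)(p - out->start);
    *pos = p;
    return 1;
}
```

Because such a view is not zero-terminated, it cannot be passed directly to functions like strstr or strcmp; you either compare with the length in hand or copy the word into a buffer first, as the solution below does.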

Here is a solution that splits the string a into words and then checks whether each word can be found in b:

#include <ctype.h>      /* isalpha */
#include <string.h>     /* strstr */

/*
 *      Return 1 if all words in a can be found in b, 
 *      return 0 otherwise.
 */
int split_and_check(const char *a, const char *b)
{
    char word[80];     /* temporary buffer for current word */
    int prev = 0;      /* previously read char to detect word boundaries */
    int len = 0;       /* current length of word; must start at zero */
    int i;

    i = 0;
    while (1) {
        /* isalpha requires a value representable as unsigned char */
        int c = (unsigned char)a[i];

        if (isalpha(c)) {
            if (!isalpha(prev)) {
                len = 0;                /* a new word starts here */
            }
            /* keep one byte for the terminator; longer words are truncated */
            if (len < (int)sizeof word - 1) word[len++] = a[i];
        } else {
            if (len > 0) {
                word[len] = '\0';       /* manually null-terminate word */

                if (strstr(b, word) == NULL) {
                    /* fail on string mismatch */
                    return 0;
                }
                len = 0;                /* reset word-length counter */
            }
        }
        if (a[i] == '\0') break;        /* check end here to catch last word */
        prev = c;
        i++;
    }

    return 1;
}

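The current word is stored in the local char buffer word and has the length len. Note how the zero end marker '\0' is added to word manually before searching b for word: The library function strstr looks for a string in another one, and both strings must be zero-terminated. Be aware that strstr finds substrings, so a word such as cat is also found inside concatenate; a stricter check would compare whole words only.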

This is only one half of the solution. You must check the strings the other way round:

int same_words(const char *a, const char *b)
{    
    if (split_and_check(a, b) == 0) return 0;
    if (split_and_check(b, a) == 0) return 0;

    return 1;
}

This is not yet the exact solution to your problem, because the string matching is case-sensitive. I've skipped this part because it was easier that way: strstr is case-sensitive, and standard C has no variant that ignores case (strcasestr exists only as a non-standard extension on some systems).
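One possible way to close that gap is to write the search by hand. The sketch below is an assumption about how it might be done, not part of the solution above: the helper name contains_word_nocase is hypothetical, words are taken to be runs of letters, and word is expected to contain letters only. It folds both sides to lower case with tolower and accepts a match only when the word in text ends where word ends:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if `word` occurs in `text` as a whole word, ignoring case;
 * return 0 otherwise. A word in `text` is a maximal run of letters.
 */
int contains_word_nocase(const char *text, const char *word)
{
    size_t i = 0;

    assert(text != NULL && word != NULL);

    while (text[i] != '\0') {
        size_t k = 0;

        /* skip to the start of the next word in text */
        if (!isalpha((unsigned char)text[i])) {
            i++;
            continue;
        }

        /* compare letter by letter, folding both sides to lower case */
        while (word[k] != '\0' &&
               tolower((unsigned char)text[i + k]) ==
               tolower((unsigned char)word[k])) {
            k++;
        }

        /* match only if word is used up and the word in text ends here too */
        if (k > 0 && word[k] == '\0' && !isalpha((unsigned char)text[i + k])) {
            return 1;
        }

        /* no match: skip the rest of this word in text */
        while (isalpha((unsigned char)text[i])) i++;
    }
    return 0;
}
```

Calling contains_word_nocase(b, word) in place of strstr(b, word) in split_and_check would make the comparison case-insensitive and would also stop cat from matching inside concatenate.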

Question 2

You are returning nothing from your function sameWords whose return type is int.

Question 3

I don't expect this to be picked as the answer, but I would also take a look at regular expressions for this kind of thing.

Does C or C++ have a standard regex library?

It would take only minutes to solve: split the string into words with a regex, convert them to lowercase, and then iterate over them to look for common words.
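Standard C itself has no regex library, but POSIX systems provide one in <regex.h>. As a rough sketch of the split-and-lowercase step, assuming a POSIX platform (the function name lowercase_words and its output format are invented for this example):

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>   /* POSIX, not part of standard C */
#include <stddef.h>

/*
 * Copy the words of `s` into `out`, lower-cased and separated by single
 * spaces. Returns the number of words copied, or -1 if the pattern could
 * not be compiled. Stops early if `out` is too small.
 */
int lowercase_words(const char *s, char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0;
    int count = 0;

    assert(s != NULL && out != NULL && outsize > 0);
    out[0] = '\0';

    /* one or more letters; POSIX matching takes the longest run */
    if (regcomp(&re, "[[:alpha:]]+", REG_EXTENDED) != 0) return -1;

    while (regexec(&re, s, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        size_t sep = (used > 0) ? 1 : 0;
        size_t k;

        if (used + sep + len + 1 > outsize) break;   /* out of space */
        if (sep) out[used++] = ' ';
        for (k = 0; k < len; k++) {
            out[used++] = (char)tolower((unsigned char)s[m.rm_so + k]);
        }
        out[used] = '\0';
        count++;
        s += m.rm_eo;            /* continue searching after this match */
    }

    regfree(&re);
    return count;
}
```

With both strings normalized this way, the remaining work is the comparison of the two word lists, for example by searching each word of one list in the other.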

Question 4

What I would do to solve this problem is create a data structure like a tree into which you can insert words. The insert function would do nothing if the word is already there, otherwise, it would convert it to lowercase and insert it in the tree. Then you could simply convert both strings to these types of trees and compare the trees.
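A minimal sketch of that idea, assuming an unbalanced binary search tree and treating a word as a run of letters. All names here (struct node, tree_insert, same_words_tree and so on) are illustrative. Since two trees holding the same words can have different shapes depending on insertion order, the comparison checks that both hold the same number of distinct words and that every word of one is present in the other:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Node of an unbalanced binary search tree holding lower-case words. */
struct node {
    char *word;
    struct node *left, *right;
};

/* Insert the first `len` chars of `w`, lower-cased. Returns 1 if the word
 * was new, 0 if it was already present (or if allocation failed). */
int tree_insert(struct node **root, const char *w, size_t len)
{
    char *key;
    size_t i;

    assert(root != NULL && w != NULL);
    key = malloc(len + 1);
    if (key == NULL) return 0;
    for (i = 0; i < len; i++) key[i] = (char)tolower((unsigned char)w[i]);
    key[len] = '\0';

    while (*root != NULL) {
        int cmp = strcmp(key, (*root)->word);
        if (cmp == 0) { free(key); return 0; }   /* duplicate: do nothing */
        root = (cmp < 0) ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL) { free(key); return 0; }
    (*root)->word = key;
    (*root)->left = (*root)->right = NULL;
    return 1;
}

/* Build a tree from all words in `s`; *count receives the number of
 * distinct words. */
struct node *tree_from_string(const char *s, size_t *count)
{
    struct node *root = NULL;
    size_t i = 0, n = 0;

    while (s[i] != '\0') {
        size_t start;
        if (!isalpha((unsigned char)s[i])) { i++; continue; }
        start = i;
        while (isalpha((unsigned char)s[i])) i++;
        n += (size_t)tree_insert(&root, s + start, i - start);
    }
    *count = n;
    return root;
}

int tree_contains(const struct node *t, const char *key)
{
    while (t != NULL) {
        int cmp = strcmp(key, t->word);
        if (cmp == 0) return 1;
        t = (cmp < 0) ? t->left : t->right;
    }
    return 0;
}

/* Return 1 if every word stored in `a` is also stored in `b`. */
int tree_subset(const struct node *a, const struct node *b)
{
    if (a == NULL) return 1;
    return tree_contains(b, a->word)
        && tree_subset(a->left, b)
        && tree_subset(a->right, b);
}

void tree_free(struct node *t)
{
    if (t == NULL) return;
    tree_free(t->left);
    tree_free(t->right);
    free(t->word);
    free(t);
}

/* Same words: equal number of distinct words, and one set inside the other. */
int same_words_tree(const char *a, const char *b)
{
    size_t na, nb;
    struct node *ta = tree_from_string(a, &na);
    struct node *tb = tree_from_string(b, &nb);
    int same = (na == nb) && tree_subset(ta, tb);

    tree_free(ta);
    tree_free(tb);
    return same;
}
```

An unbalanced tree degrades to a linked list when the words arrive in sorted order; for the short strings of an exercise like this one that hardly matters.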

Another way to do this is in bash. While this is probably not allowed for your assignment, if you understand how and why it works, you should be able to code something up that mimics it:

# s1 and s2 are simply strings with spaces separating words
s1="dog dog dog cat"
s2="cat dog"

# Convert to arrays
a1=( $(printf "%s\n" ${s1}  | sort | uniq ) )
a2=( $(printf "%s\n" ${s2}  | sort | uniq ) )

# Compare the result
if [ "${a1[*]}" == "${a2[*]}" ] ; then
  echo "Same"
fi