Question

I have a sentence, for example

John Doe moved to New York last year.

Now I split the sentence into single words and get:

array('John', 'Doe', 'moved', 'to', 'New', 'York', 'last', 'year')

That's quite easy. But then I want to combine the single words to get all the composed terms. It doesn't matter whether the composed terms make sense; I want to get all of them. The result of that operation should look like this:

John, Doe, John Doe, moved, Doe moved, John Doe moved, to, moved to, Doe moved to ...

The words should be combined into terms of at most k words. In the example above, k is 3, so a term can contain 3 words at most.

The problem: How could I code the composition in PHP? It would be great if I had a function which gets a sentence as the input and gives an array with all terms as the output.

I hope you can help me. Thanks in advance!


Solution

If you already have the code for splitting the sentence into an array of words, this function lets you choose the maximum number of words per phrase and returns an array of phrases, each phrase being an array of its words.

function getPhrases($array, $maxTerms = 3) {
    $termArray = array(); //Start empty so an empty input still returns an array
    for($i=0; $i < $maxTerms; $i++) { //Phrase length is $i+1, so this covers lengths 1 to $maxTerms
         for($j = 0; $j < (sizeof($array) - $i); $j++) { //Every start index that leaves room for $i+1 words
             //Add this slice as one phrase; wrap it in implode(" ", ...) to get a string instead
             $termArray[] = array_slice($array, $j, ($i+1));
         }
    }
    return $termArray;
}

//Usage example

$newarray = explode(" ", "This is a pretty long example sentence");
print_r(getPhrases($newarray));
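
Note that explode(" ", ...) keeps punctuation attached to the words, so the sentence from the question would yield 'year.' rather than 'year'. Below is a minimal sketch of one way to split on anything that is not a letter or digit. The helper name splitWords is hypothetical (it is not part of the answer above), and treating every non-alphanumeric character as a separator is an assumption that may not suit text containing apostrophes or hyphens.

```php
<?php
// Hypothetical helper for illustration; not from the original answer.
function splitWords($sentence) {
    // Split on runs of characters that are neither letters nor digits.
    // PREG_SPLIT_NO_EMPTY drops the empty piece left by trailing punctuation.
    return preg_split('/[^\p{L}\p{N}]+/u', $sentence, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitWords("John Doe moved to New York last year."));
```

The resulting array can be passed straight to getPhrases() in place of the explode() result.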

OTHER TIPS

Every composition will be defined by a starting point and a length - just loop through.

PHP has no single built-in function for this, but array_slice and implode do most of the work.

$sentence = "John Doe moved to New York last year"; //example input
$limit = 3; //example value: maximum number of words per term
$words = explode(" ", $sentence);
$compositions = array();
for ($start = 0; $start < count($words); $start++) //starting point
{
   //try all possible lengths
   //limit = max length
   //and of course it can't run past the end of the array
   for ($len = 1; $len <= $limit && $len <= count($words)-$start; $len++)
   {
      //array_slice gets a chunk of the array, and implode joins it w/ spaces
      $compositions[] = implode(" ", array_slice($words, $start, $len));
   }
}
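
Looping over the starting point groups the terms by their first word, whereas the sequence shown in the question (John, Doe, John Doe, moved, Doe moved, John Doe moved, ...) groups them by their last word and lists shorter terms first. If that exact order matters, one possible sketch is to iterate over the end position instead. The function name composeTerms and its parameters are assumptions made for illustration.

```php
<?php
// Hypothetical function; name and signature are illustrative assumptions.
function composeTerms($sentence, $limit = 3) {
    // Split on whitespace, dropping empty pieces caused by repeated spaces.
    $words = preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);
    $terms = array();
    // $end is the index of the last word in the term.
    for ($end = 0; $end < count($words); $end++) {
        // Grow the term backwards from $end, one word at a time,
        // without running past the start of the sentence.
        for ($len = 1; $len <= $limit && $len <= $end + 1; $len++) {
            $terms[] = implode(' ', array_slice($words, $end - $len + 1, $len));
        }
    }
    return $terms;
}

print_r(composeTerms("John Doe moved to New York last year", 3));
```

With 8 words and a limit of 3, this returns 8 + 7 + 6 = 21 terms.
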
Licensed under: CC-BY-SA with attribution