How to use a preg_match_all regex on HTML when the HTML is all on one line?

Question 1

By default, the + quantifier is greedy, meaning (loosely) that it will match as much as it can while still letting the regex as a whole match.

For example, when .+</div> is applied to abc</div>efg</div>, the .+ part matches abc</div>efg: each character of the first </div> can itself be matched by the dot ., and the greedy quantifier eats up as much as possible before the final </div>.
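
To see the greedy behaviour in isolation, here is a minimal sketch using the same sample string. The capture group around .+ is my addition, only there to show what the quantifier itself consumed:

```php
<?php
// Greedy: .+ first runs to the end of the subject, then backtracks
// just far enough for the trailing </div> to match (the LAST one).
$subject = 'abc</div>efg</div>';
preg_match('#(.+)</div>#', $subject, $m);

echo $m[0], "\n"; // abc</div>efg</div>   (whole match)
echo $m[1], "\n"; // abc</div>efg         (what .+ consumed)
```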

What you want to do is either make it lazy, so that it matches the least amount possible, with +?:

</a> : (.+?)</div>

Or, if you know your text can't contain <, use [^<] (i.e. anything except a <) instead of the dot .: that way [^<]+ can't eat up </div>:

</a> : ([^<]+)</div>
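
Both variants can be compared side by side. This is only a sketch: the $html string is a made-up one-line sample, not the asker's real markup.

```php
<?php
// Hypothetical one-line HTML sample (assumption, for illustration only).
$html = '<div><a href="#">1</a> : test1</div><div><a href="#">2</a> : test2</div>';

// Lazy quantifier: expands one character at a time and stops at the
// first </div> that lets the pattern succeed.
preg_match_all('#</a> : (.+?)</div>#', $html, $lazy);

// Negated class: [^<] can never consume a "<", so it cannot run past </div>.
preg_match_all('#</a> : ([^<]+)</div>#', $html, $negated);

print_r($lazy[1]);    // test1, test2
print_r($negated[1]); // test1, test2
```

The two differ once the text itself contains a tag: with something like `: te<b>st</b></div>` the lazy version still captures everything up to `</div>`, while the `[^<]+` version finds no match at all, which is why it is only safe when the text can't contain `<`.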

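Your regex was working previously because the dot . doesn't match newlines by default, so as long as the HTML was spread over several lines, .+ could never run past the end of a line to reach a later </div>. As a side note, there is no need to escape everything in your regex...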
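
The newline effect is easy to reproduce. In the sketch below, $pattern is a greedy stand-in for the asker's regex (an assumption, since the original isn't quoted here) and both HTML strings are invented samples:

```php
<?php
// Greedy pattern standing in for the original regex (assumption).
$pattern = '#</a> : (.+)</div>#';

// Same markup twice: once with a line break between items, once without.
$multiLine = "<div><a href=\"#\">1</a> : test1</div>\n<div><a href=\"#\">2</a> : test2</div>";
$oneLine   = str_replace("\n", '', $multiLine);

preg_match_all($pattern, $multiLine, $a);
preg_match_all($pattern, $oneLine, $b);

print_r($a[1]); // two captures: test1, test2 (the dot stops at each newline)
print_r($b[1]); // one capture running from test1 all the way to test2
```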

Question 2

Try this way:

<?php

$string = '<ul> <li> <div> <a href="#"><strong>1</strong></a> : test1 </div> </li> <li> <div> <a href="#"><strong>2</strong></a> : test2 </div> </li> <li> <div> <a href="#"><strong>3</strong></a> : test3 </div> </li> </ul>';
$pattern = '#</a>\s*:\s*(.+?)</div>#';
preg_match_all($pattern, $string, $out);

print_r($out);
?>

Result:

Array
(
    [0] => Array
        (
            [0] => </a> : test1 </div>
            [1] => </a> : test2 </div>
            [2] => </a> : test3 </div>
        )

    [1] => Array
        (
            [0] => test1 
            [1] => test2 
            [2] => test3 
        )

)

The whitespace may vary (spaces or tabs), so it's better to use \s, which matches any whitespace character, including \n and \r.
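
As a quick check that \s* copes with different kinds of whitespace, here is a small sketch. The $variants strings are invented fragments, not taken from the question:

```php
<?php
$pattern = '#</a>\s*:\s*(.+?)</div>#';

// Hypothetical fragments that differ only in the whitespace around the colon.
$variants = [
    "</a> : test1</div>",        // single spaces
    "</a>\t:\ttest2</div>",      // tabs
    "</a>\n:\r\n  test3</div>",  // line breaks plus indentation
];

$captured = [];
foreach ($variants as $v) {
    if (preg_match($pattern, $v, $m)) {
        $captured[] = $m[1];
    }
}

print_r($captured); // test1, test2, test3
```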

Question 3

</a>\s?+:\s?+(.*?)\s?+</div>
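
A note on this pattern: \s?+ is a possessive "zero or one whitespace character", so it tolerates at most a single space in each position, and the final \s?+ keeps a single trailing space out of the capture. The sketch below illustrates that. $starPattern is my own more tolerant variant (an assumption, not part of the answer), and both subjects are made-up samples:

```php
<?php
$answerPattern = '#</a>\s?+:\s?+(.*?)\s?+</div>#';
// Assumption: a \s* variant for comparison, not from the original answer.
$starPattern   = '#</a>\s*:\s*(.*?)\s*</div>#';

$single = '<a href="#">1</a> : test1 </div>';   // one space each side
$double = '<a href="#">1</a>  :  test1 </div>'; // two spaces each side

preg_match($answerPattern, $single, $m1);
$hit = preg_match($answerPattern, $double); // second space blocks the ":"
preg_match($starPattern, $double, $m2);

var_dump($m1[1]); // string(5) "test1" (no trailing space)
var_dump($hit);   // int(0)
var_dump($m2[1]); // string(5) "test1"
```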
