Edit: OK, this was not so easy after all :)
By the way, this library is really an excellent tool. Kudos to the guys who wrote it.
Here is one possible solution:
class menu_parse {
    public static $missing = array();  // list of missing files
    private static $files = array();   // list of source files to process

    // initiate menu parsing
    public static function start ($file)
    {
        // start with root file
        self::$files[$file] = 1;

        // parse all source files (the list grows as new links are found)
        for ($res=array(); current(self::$files); next(self::$files))
        {
            // get next file name
            $file = key(self::$files);

            // parse the file
            if (!file_exists ($file))
            {
                self::$missing[$file] = 1;
                continue;
            }
            $html = file_get_html ($file);
            if (!$html) continue; // file_get_html() returns false on failure

            // get menu root (if any)
            $root = $html->find("ul[id=menu]",0);
            if ($root) self::menu ($root, $res);
        }

        // reorder missing files array
        self::$missing = array_keys (self::$missing);

        // that's all folks
        return $res;
    }

    // parse a menu at a given level
    private static function menu ($menu, &$res)
    {
        $name = null; // name of the last <li> seen at this level
        foreach ($menu->children as $elem)
        {
            switch ($elem->tag)
            {
            case "li" : // name and possibly source file of a menu
                // grab menu name
                $name = $elem->plaintext;

                // create the menu object explicitly:
                // PHP 8 throws an Error when assigning a property on null
                if (!isset($res[$name])) $res[$name] = new stdClass;

                // see if we can find a link to the menu file
                $link = $elem->children(0);
                if ($link && $link->tag == 'a')
                {
                    // found the link
                    $file = $link->href;
                    $res[$name]->file = $file;

                    // add the source file to the processing list
                    self::$files[$file] = 1;
                }
                break;

            case "ul" : // go down one level to grab items of the current menu
                if ($name === null) break; // sub-list with no preceding item: skip it
                self::menu ($elem, $res[$name]->childs);
            }
        }
    }
}
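The loop in start() relies on PHP's internal array pointer: the file list is both the work queue and the "already seen" set, since file names are the keys. Here is a minimal standalone sketch of that idea. The $links map is a made-up stand-in for the links found in each file, so it runs without the library:

```php
<?php
// Hypothetical link graph standing in for the links found in each file.
$links = array(
    "root.html"  => array("file1.html", "file2.html"),
    "file1.html" => array("file3.html", "root.html"), // links back to root
    "file3.html" => array("file1.html"),
);

$files   = array("root.html" => 1); // work list, keyed by file name
$visited = array();                 // order in which files get processed

// current() stays truthy because every value is 1;
// next() moves on to entries appended during the loop.
for (; current($files); next($files)) {
    $file = key($files);
    $visited[] = $file;

    // Files without an entry stand for missing or link-free pages.
    $targets = isset($links[$file]) ? $links[$file] : array();
    foreach ($targets as $target) {
        $files[$target] = 1; // an existing key is not appended a second time
    }
}

print_r($visited);
```

Each file should show up exactly once in $visited, even though root.html and file1.html are linked from several places.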
Usage:
// The result will be an array of menus indexed by item names.
//
// Each menu will be an object with 2 members
// - file -> source file of the menu
// - childs -> array of menu subitems
//
$res = menu_parse::start ("root.html");
// menu_parse::$missing will contain the names of all missing files
echo "Result : <pre>";
print_r ($res);
echo "</pre><br>missing files:<pre>";
print_r (menu_parse::$missing);
echo "</pre>";
Output of your test case:
Array
(
    [Start] => stdClass Object
        (
            [childs] => Array
                (
                    [Sub1] => stdClass Object
                        (
                            [file] => file1.html
                            [childs] => Array
                                (
                                    [SubSub1] => stdClass Object
                                        (
                                            [file] => file3.html
                                            [childs] => Array
                                                (
                                                    [SubSubSub1] => stdClass Object
                                                        (
                                                            [file] => file7.html
                                                        )
                                                    [SubSubSub2] => stdClass Object
                                                        (
                                                            [file] => file8.html
                                                        )
                                                    [SubSubSub3] => stdClass Object
                                                        (
                                                            [file] => file9.html
                                                        )
                                                )
                                        )
                                    [SubSub2] => stdClass Object
                                        (
                                            [file] => file3.html
                                        )
                                    [SubSub3] => stdClass Object
                                        (
                                            [file] => file5.html
                                        )
                                    [SubSub4] => stdClass Object
                                        (
                                            [file] => file6.html
                                        )
                                )
                        )
                    [Sub2] => stdClass Object
                        (
                            [file] => file2.html
                        )
                )
            [file] => root.html
        )
)
missing files: Array
(
    [0] => file2.html
    [1] => file5.html
    [2] => file6.html
    [3] => file7.html
    [4] => file8.html
    [5] => file9.html
)
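Since the result is just nested arrays and objects, a short recursive function is enough to walk it. A sketch, using a small hand-built sample with the same shape as the result above so it runs without the library (menu_lines is a made-up helper name, not part of the class):

```php
<?php
// Hypothetical helper: turn a parsed menu into indented text lines.
function menu_lines(array $menu, int $depth = 0): array
{
    $lines = array();
    foreach ($menu as $name => $item) {
        // Items without a link have no "file" member.
        $file = isset($item->file) ? $item->file : "(no file)";
        $lines[] = str_repeat("  ", $depth) . $name . " -> " . $file;

        // Recurse into subitems, if any.
        if (!empty($item->childs)) {
            $lines = array_merge($lines, menu_lines($item->childs, $depth + 1));
        }
    }
    return $lines;
}

// Hand-built sample: Start (root.html) with one subitem Sub1 (file1.html).
$sub1 = new stdClass;
$sub1->file = "file1.html";
$start = new stdClass;
$start->childs = array("Sub1" => $sub1);
$start->file = "root.html";
$res = array("Start" => $start);

echo implode("\n", menu_lines($res)), "\n";
```

With the real result you would simply pass the return value of menu_parse::start() instead of the sample.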
Notes:
- The code assumes all item names are unique inside a given menu.
You could modify the code to have the (sub)menus as an array with numeric indexes and names as properties (so that two items with the same name would not overwrite each other), but that would complicate the structure of the result.
Should such name duplication occur, the best solution would be to rename one of the items, IMHO.
- The code also assumes there is only one root menu.
It could be modified to handle more than one, but that does not make much sense IMHO (it would mean a root menu ID duplication, which would likely cause trouble for the JavaScript trying to process it in the first place).
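For reference, the numerically indexed variant mentioned in the first note could look roughly like this. It is only a sketch: menu_add_item and the "name" property are made-up names, and the class above would still need to be adapted to use them.

```php
<?php
// Hypothetical helper: append an item to a menu stored as a numerically
// indexed list, so that two items with the same name can coexist.
function menu_add_item(array &$menu, string $name, ?string $file = null): stdClass
{
    $item = new stdClass;
    $item->name   = $name;    // the name becomes a property instead of a key
    $item->file   = $file;    // null when the item has no link
    $item->childs = array();  // subitems, stored the same way
    $menu[] = $item;
    return $item;
}

$menu  = array();
$first = menu_add_item($menu, "Sub1", "file1.html");
menu_add_item($menu, "Sub1", "file2.html");            // same name, kept as a second entry
menu_add_item($first->childs, "SubSub1", "file3.html"); // subitem of the first "Sub1"

echo count($menu), " items\n";
```

The price is the one mentioned above: finding an item by name now means scanning the list instead of doing a direct $res[$name] lookup.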