Question

I want to develop a website with a lightly automated process for the header, menu, navigation bar, footer, etc., driven by Markdown files.

For example, navigationbar.md will contain only link text and link addresses. I want to get those details individually in PHP (the link and the text as variables or parameters, not as parsed HTML).

* [Dog][0]
* [German Shepherd][1]
* [Belgian Shepherd][2]
    * [Malinois][3]
    * [Groenendael][4]
    * [Tervuren][5]
* [Cat][6]
    * [Siberian][7]
    * [Siamese][8]

[0]:(http://google.com)
[1]:(http://google.com)
[2]:(http://google.com)
[3]:(http://google.com)
[4]:(http://google.com)
[5]:(http://google.com)
[6]:(http://google.com)
[7]:(http://google.com)
[8]:(http://google.com)

I don't need any HTML here; I want a nested array containing the link text and the link address.

Rendered as HTML, this Markdown structure produces a nested bulleted list:

[image: rendered nested list]

But I need the list as a nested array so that I can perform further tasks on it.

Let me know whether this is possible.

Expected output:

      array (size=9)
      0 => 
        array (size=2)
          0 => string 'Dog' (length=3)
          1 => string 'http://google.com' (length=17)
      1 => 
        array (size=2)
          0 => string 'German Shepherd' (length=15)
          1 => string 'http://yahoo.com' (length=16)
      2 => 
        array (size=2)
          0 => string 'Belgian Shepherd' (length=16)
          1 => string 'http://duckduckgo.com' (length=21)
          2 => 
            array (size=2)
              0 => string 'Malinois' (length=8)
              1 => string 'http://amazon.com' (length=17)
              2 => 
                array (size=2)
                  0 => string 'Groenendael' (length=11)
                  1 => string 'http://metallica.com' (length=20)
              3 => 
                array (size=2)
                  0 => string 'Tervuren' (length=8)
                  1 => string 'http://microsoft.com' (length=20)
      3 => 
        array (size=2)
          0 => string 'Cat' (length=3)
          1 => string 'http://ibm.com' (length=14)
          2 => 
            array (size=2)
              0 => string 'Siberian' (length=8)
              1 => string 'http://apple.com' (length=16)
          3 => 
            array (size=2)
              0 => string 'Siamese' (length=7)
              1 => string 'http://stackoverflow.com' (length=24)

Solution

This should work. The explanation is in the comments in the code:

/**
    This function takes two strings: $text and $links_text.
    For each item in $text that matches the regular expression, the
    corresponding link is extracted from $links_text.
    It returns an array mapping each text to its link: a single
    [text, link] array if only one item is found, and a nested array of
    [text, link] pairs if more than one is found. It returns null if
    nothing in $text matches.
    Eg: 
    INPUT:
        var_dump(text_link_map("* [Dog][0]", "[0]:(http://google.com)[1]:(http://yahoo.com)"));
    OUTPUT: 
        array
          0 => string 'Dog' (length=3)
          1 => string 'http://google.com' (length=17)
*/
function text_link_map($text, $links_text){
    $regex= "/\*\s+\[([a-zA-Z0-9\-\_ ]+)\]\[([0-9]+)\]/";
    if(preg_match_all($regex, $text, $matches)){
        $link_arr = Array();
        /*
            For each of those indices, find the appropriate link.
            Store null when no definition exists, so that the texts
            and the links stay aligned.
        */
        foreach($matches[2] as $link_index){
            $links = Array();
            $link_regex = "/\[".$link_index."\]\:\((.*?)\)/";
            $link_arr[] = preg_match($link_regex, $links_text, $links) ? $links[1] : null;
        }
        if(count($matches[1]) == 1){
            return Array($matches[1][0], $link_arr[0]);
        }else{
            $text_link = array_map(null, $matches[1], $link_arr);
            return $text_link;
        }
    }else{
        return null;
    }
}

/**
    Function that calls recursive_index and returns its output.
    It is needed to pass the initial values to recursive_index
    (the index is passed by reference, so it must be a variable).
*/
function indent_text($text_lines, $links){
    $i = 0;
    return recursive_index($i, 0, $text_lines, $links);
}


/**
    This function creates a nested array out of the $text.
    Each indent is assumed to be a single Tab. It is dictated by the
    $indent_symbol variable.
    This function recursively calls itself when it needs to go from 
    one level to another.
*/
function recursive_index(&$index, $curr_level, $text, $links){
    $indent_symbol = "\t";
    $result = Array();
    while($index < count($text)){
        $line = $text[$index];
        $level = strspn($line, $indent_symbol);
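        if($level == $curr_level){
            // Same level: add this item and move on to the next line.
            $result[] = text_link_map($line, $links);
            $index += 1;
        }elseif($level > $curr_level){
            // Deeper level: collect the children and attach them to the
            // previous item. The recursive call leaves $index on the first
            // line it did not consume, so it must not be advanced here.
            $result[count($result) - 1][] = recursive_index($index, $curr_level + 1, $text, $links);
        }else{
            // Shallower level: hand control back to the caller.
            break;
        }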
    }
    return $result;
}   

$file_name = "navigationbar.md";
//Normalize Windows line endings so the file may use either \r\n or \n.
$f_contents = str_replace("\r\n", "\n", file_get_contents($file_name));
//Separate out the text and links part.
//(Assuming the text and the links are always separated by one blank line.)
list($text, $links) = explode("\n\n", $f_contents, 2);
//Get the nested array.
$formatted_arr = indent_text(explode("\n", $text), $links);
var_dump($formatted_arr);

This is the output of the code, which matches your requirements:

/*
    OUTPUT
*/
array(4) {
  [0]=>
  array(2) {
    [0]=>
    string(3) "Dog"
    [1]=>
    string(17) "http://google.com"
  }
  [1]=>
  array(2) {
    [0]=>
    string(15) "German Shepherd"
    [1]=>
    string(16) "http://yahoo.com"
  }
  [2]=>
  array(3) {
    [0]=>
    string(16) "Belgian Shepherd"
    [1]=>
    string(21) "http://duckduckgo.com"
    [2]=>
    array(3) {
      [0]=>
      array(2) {
        [0]=>
        string(8) "Malinois"
        [1]=>
        string(17) "http://amazon.com"
      }
      [1]=>
      array(2) {
        [0]=>
        string(11) "Groenendael"
        [1]=>
        string(20) "http://metallica.com"
      }
      [2]=>
      array(2) {
        [0]=>
        string(8) "Tervuren"
        [1]=>
        string(20) "http://microsoft.com"
      }
    }
  }
  [3]=>
  array(3) {
    [0]=>
    string(3) "Cat"
    [1]=>
    string(14) "http://ibm.com"
    [2]=>
    array(2) {
      [0]=>
      array(2) {
        [0]=>
        string(8) "Siberian"
        [1]=>
        string(16) "http://apple.com"
      }
      [1]=>
      array(2) {
        [0]=>
        string(7) "Siamese"
        [1]=>
        string(24) "http://stackoverflow.com"
      }
    }
  }
}
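
Once the nested array exists, it can be walked recursively for whatever task is needed. The following is a minimal sketch, assuming each item has the shape [text, url] with an optional third element holding its children, as in the output above. The function name flatten_menu and the sample data are hypothetical and only serve as illustration.

```php
<?php
// Hypothetical helper: flatten the nested [text, url, children] arrays
// into rows of [depth, text, url], in document order.
function flatten_menu(array $items, int $depth = 0): array {
    $rows = [];
    foreach ($items as $item) {
        // $item[0] is the link text, $item[1] is the link address.
        $rows[] = [$depth, $item[0], $item[1]];
        // $item[2], when present, holds the child items.
        if (isset($item[2]) && is_array($item[2])) {
            $rows = array_merge($rows, flatten_menu($item[2], $depth + 1));
        }
    }
    return $rows;
}

// Sample data in the same shape as the var_dump output (shortened).
$menu = [
    ["Dog", "http://google.com"],
    ["Cat", "http://ibm.com", [
        ["Siberian", "http://apple.com"],
    ]],
];

foreach (flatten_menu($menu) as [$depth, $text, $url]) {
    // Indent each entry by its depth to show the hierarchy.
    echo str_repeat("  ", $depth), $text, " -> ", $url, "\n";
}
```

The same recursion pattern could build HTML, breadcrumbs, or a sitemap instead of flat rows.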

For reference, the contents of navigationbar.md are:

* [Dog][0]
* [German Shepherd][1]
* [Belgian Shepherd][2]
    * [Malinois][3]
    * [Groenendael][4]
    * [Tervuren][5]
* [Cat][6]
    * [Siberian][7]
    * [Siamese][8]

[0]:(http://google.com)
[1]:(http://yahoo.com)
[2]:(http://duckduckgo.com)
[3]:(http://amazon.com)
[4]:(http://metallica.com)
[5]:(http://microsoft.com)
[6]:(http://ibm.com)
[7]:(http://apple.com)
[8]:(http://stackoverflow.com)
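
As a possible variation, the link definitions above could be parsed once into a lookup table instead of being searched with a separate regular expression for every item. This is only a sketch, assuming the same "[n]:(url)" definition format used in this file; the function name parse_link_definitions is hypothetical.

```php
<?php
// Hypothetical helper: build an index => URL map from definitions
// written as "[3]:(http://example.com)".
function parse_link_definitions(string $links_text): array {
    $map = [];
    // PREG_SET_ORDER groups the captures per match: [full, index, url].
    if (preg_match_all('/\[(\d+)\]:\((.*?)\)/', $links_text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $map[(int) $m[1]] = $m[2];
        }
    }
    return $map;
}

$links = parse_link_definitions("[0]:(http://google.com)\n[1]:(http://yahoo.com)");
// A missing index can then be handled explicitly, e.g. with the ?? operator.
echo $links[1] ?? "(no link)", "\n";
```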

The code makes certain assumptions:

  • The text part (i.e. the "* [Dog][0]" lines) and the link part (i.e. the "[0]:(http://google.com)" lines) are assumed to always be separated by 2 newlines, that is, one blank line.
  • Each child is indented one Tab ("\t") deeper than its parent. The file must contain real tab characters; leading spaces are not counted as a level.

You can test this by changing the tab indentation of the items in navigationbar.md.
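
If the file is indented with spaces rather than tabs, the level calculation needs to change, because strspn($line, "\t") only counts tab characters. Below is a minimal sketch of a replacement, assuming four spaces per level; the function name indent_level and the $spaces_per_level parameter are hypothetical and not part of the code above.

```php
<?php
// Hypothetical helper: compute the nesting level of a line, where one
// level is either one tab or $spaces_per_level consecutive spaces.
function indent_level(string $line, int $spaces_per_level = 4): int {
    $level = 0;
    $spaces = 0;
    $len = strlen($line);
    for ($i = 0; $i < $len; $i++) {
        if ($line[$i] === "\t") {
            $level++;
            $spaces = 0; // a tab discards any partial run of spaces
        } elseif ($line[$i] === " ") {
            if (++$spaces === $spaces_per_level) {
                $level++;
                $spaces = 0;
            }
        } else {
            break; // first non-whitespace character ends the indent
        }
    }
    return $level;
}

var_dump(indent_level("* [Dog][0]"));
var_dump(indent_level("    * [Malinois][3]"));
```

To try it, the call to strspn in recursive_index could be swapped for indent_level($line).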

Hope it helps.

Licensed under: CC-BY-SA with attribution