Question

I am using strip_tags to strip the tags from an XML file. It works fine when the array is small, but it always crashes when the page is big. Here is my script, which works for up to 100 values but crashes for larger inputs:

        preg_match_all("/<image:caption>.*?<\/image:caption>|<image:loc>.*?<\/image:loc>|<loc>.*?<\/loc>/", $str, $results);
        $arr = array_chunk(array_map('strip_tags', $results[0]), 1000);

        for ($i = 0; $i < 1000; $i++) {
            for ($j = 0; $j < 1000; $j++) {
                $output = $arr[$i][$j] . '</br>';
                echo $output;
            }
        }

It strips the values from the sample below correctly, but it crashes on a bigger file.

        <urlset>
          <url><loc>/1366x768/citroen-ds-cabrio-auto-car-wallshark-com-228615.html</loc><image:image><image:loc>s/1366x768/citroen-ds/228615/citroen-ds-cabrio-auto-car-wallshark-com-228615.jpg</image:loc><image:caption>Citroen Ds Cabrio Auto Car Wallshark Com  Walpapers</image:caption></image:image></url>
          <url><loc>/1366x768/citroen-ds-cars-citro-n-cabrio-213157.html</loc><image:image><image:loc>s/1366x768/citroen-ds/213157/citroen-ds-cars-citro-n-cabrio-213157.jpg</image:loc><image:caption>Citroen Ds Cars Citro N Cabrio  Walpapers</image:caption></image:image></url>
          <url><loc>/1366x768/citroen-ds-citro-n-pictures-95569.html</loc><image:image><image:loc>s/1366x768/citroen-ds/95569/citroen-ds-citro-n-pictures-95569.jpg</image:loc><image:caption>Citroen Ds Citro N Pictures  Walpapers</image:caption></image:image></url>
        </urlset>

Solution

You can try this:

<pre><?php

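// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->load('Remotefile.xml');

$urls = $dom->getElementsByTagName('url');
$result = array();

foreach ($urls as $url) {
    $image = $url->getElementsByTagName('image')->item(0);
    // Positional access works for the sample, where <image:loc> and
    // <image:caption> follow each other with no whitespace in between.
    $imageChildren = $image->childNodes;

    $result[] = array( 'loc' => $url->getElementsByTagName('loc')->item(0)->textContent,
                       'imgloc' => $imageChildren->item(0)->textContent,
                       'imgcap' => $imageChildren->item(1)->textContent);
}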

// $dbh is assumed to be an existing PDO connection.
$stmt = $dbh->prepare("INSERT INTO urls (loc, imageloc, imagecap) VALUES (:loc, :imgloc, :imgcap)");

foreach ($result as $res) {
    // bindValue() copies the value at call time, which is all this loop needs.
    $stmt->bindValue(':loc',    $res['loc']);
    $stmt->bindValue(':imgloc', $res['imgloc']);
    $stmt->bindValue(':imgcap', $res['imgcap']);
    $stmt->execute();
}
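
The DOM code above depends on the position of the child nodes inside the image element, so it can break if the file is pretty-printed. A namespace-aware variant might look like the sketch below. It assumes the real sitemap declares the standard sitemap and Google image namespaces on `<urlset>` (the excerpt in the question omits them); `firstText` and `extractUrls` are hypothetical helper names, not part of any library.

```php
<?php
// Assumption: these are the namespace URIs declared on the real <urlset>.
const NS_SITEMAP = 'http://www.sitemaps.org/schemas/sitemap/0.9';
const NS_IMAGE   = 'http://www.google.com/schemas/sitemap-image/1.1';

// Hypothetical helper: trimmed text of the first matching descendant, or null.
function firstText(DOMElement $parent, string $ns, string $name): ?string
{
    $node = $parent->getElementsByTagNameNS($ns, $name)->item(0);
    return $node === null ? null : trim($node->textContent);
}

// Hypothetical helper: one associative array per <url> element.
function extractUrls(string $xml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $rows = [];
    foreach ($dom->getElementsByTagNameNS(NS_SITEMAP, 'url') as $url) {
        $rows[] = [
            'loc'    => firstText($url, NS_SITEMAP, 'loc'),
            'imgloc' => firstText($url, NS_IMAGE, 'loc'),
            'imgcap' => firstText($url, NS_IMAGE, 'caption'),
        ];
    }
    return $rows;
}

$xml = <<<'XML'
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>/1366x768/citroen-ds-citro-n-pictures-95569.html</loc>
    <image:image>
      <image:loc>s/1366x768/citroen-ds/95569/citroen-ds-citro-n-pictures-95569.jpg</image:loc>
      <image:caption>Citroen Ds Citro N Pictures  Walpapers</image:caption>
    </image:image>
  </url>
</urlset>
XML;

$rows = extractUrls($xml);
print_r($rows);
```

Looking elements up by namespace and local name keeps `<loc>` and `<image:loc>` apart and ignores whitespace text nodes, at the cost of having to know the namespace URIs.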

A regex way:

$pattern = <<<'LOD'
~
  <url>                                                \s*+
  <loc>           (?<loc>    [^<]++ ) </loc>           \s*+
  <image:image>                                        \s*+
  <image:loc>     (?<imgloc> [^<]++ ) </image:loc>     \s*+
  <image:caption> (?<imgcap> [^<]++ ) </image:caption> \s*+
  </image:image>                                       \s*+
  </url>
~x
LOD;

preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

/* Optional: drop the numeric keys so only the named groups remain (purely cosmetic). */
foreach ($matches as &$match) {
    foreach ($match as $k => $m) {
        if (is_numeric($k)) unset($match[$k]);
    }
}
unset($match); // break the reference left behind by the foreach above
print_r($matches);
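
Both approaches above hold the whole document in memory at once. If the file is large enough for that to be the problem, a streaming reader may help. The following is a minimal sketch using PHP's `XMLReader`, again assuming the standard sitemap and image namespace URIs; `streamSitemapUrls` is a hypothetical name, and the inline XML only stands in for a real file.

```php
<?php
// Hypothetical generator: yields one row per <url> element without
// building a DOM tree for the whole document.
function streamSitemapUrls(XMLReader $reader): Generator
{
    // Assumed namespace URIs; adjust to what the real <urlset> declares.
    $nsSitemap = 'http://www.sitemaps.org/schemas/sitemap/0.9';
    $nsImage   = 'http://www.google.com/schemas/sitemap-image/1.1';

    $scratch = new DOMDocument();

    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->localName !== 'url') {
            continue;
        }

        // Expand only the current <url> subtree into a small DOM fragment.
        $url = $scratch->importNode($reader->expand(), true);

        $text = static function (string $ns, string $name) use ($url): ?string {
            $node = $url->getElementsByTagNameNS($ns, $name)->item(0);
            return $node === null ? null : trim($node->textContent);
        };

        yield [
            'loc'    => $text($nsSitemap, 'loc'),
            'imgloc' => $text($nsImage, 'loc'),
            'imgcap' => $text($nsImage, 'caption'),
        ];
    }
}

$xml = <<<'XML'
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url><loc>/a.html</loc><image:image><image:loc>s/a.jpg</image:loc><image:caption>First caption</image:caption></image:image></url>
  <url><loc>/b.html</loc><image:image><image:loc>s/b.jpg</image:loc><image:caption>Second caption</image:caption></image:image></url>
</urlset>
XML;

$reader = new XMLReader();
$reader->XML($xml); // for a file on disk, call $reader->open($path) instead

$rows = [];
foreach (streamSitemapUrls($reader) as $row) {
    // In real use, each row could be inserted right away instead of collected.
    $rows[] = $row;
}
$reader->close();

print_r($rows);
```

Because only one `<url>` subtree is expanded at a time, memory use stays roughly constant regardless of how many entries the file contains.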
Licensed under: CC-BY-SA with attribution