Question

Is it possible to split the contents of a file into parts delimited by a specific pattern?

This is what I want to achieve:

  • Read the file using file_get_contents
  • Keep only the contents between matching comment markers.

I am not sure how complicated this is. Basically, I am parsing a large HTML file and want to display only specific widgets in the browser, where each widget is delimited by comment boundaries like this:

Sample:

<html>
<head>
   <title>test</title>
</head>
<body>
 this content should not be parsed.. ignored
 <!-- widget -->
 this is the widget. i want to parse this content only from the file
 <!-- widget -->
</body>
</html>

Would it be possible, using PHP and regex or anything else, to parse only the contents between the boundaries?

I apologize if this is unclear; I tried to explain what I want to achieve as well as I can. I hope someone can help.

Solution

It's certainly possible, but it doesn't really need to be done with regex. I'd probably just do something like this:

$file = file_get_contents('http://example.com/');
$widgets = explode('<!-- widget -->', $file);

Now the odd-indexed elements of $widgets ([1], [3], [5], etc.) contain what was between those boundaries, provided the markers come in pairs.
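As a minimal sketch of the idea above, assuming the markers always come in pairs, the odd-indexed pieces could be collected like this. The helper name `extract_widgets` and the sample string are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text found between paired marker comments.
function extract_widgets(string $html, string $marker = '<!-- widget -->'): array
{
    $parts = explode($marker, $html);
    $widgets = [];
    // Pieces at odd indexes sit between an opening and a closing marker.
    // The last piece is always the text after the final marker, so stop before it;
    // this also skips a widget whose closing marker is missing.
    for ($i = 1; $i < count($parts) - 1; $i += 2) {
        $widgets[] = trim($parts[$i]);
    }
    return $widgets;
}

// Invented sample input with two widgets.
$html = "ignored <!-- widget --> first <!-- widget --> ignored <!-- widget --> second <!-- widget --> tail";
print_r(extract_widgets($html));
```

The trim() call is optional; drop it if the whitespace around the widget content matters.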

OTHER TIPS

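You can achieve what you want with a regular expression (or, if you are only ever splitting on <!-- widget -->, you can probably just split on that literal string). Check the documentation. The other answer using explode() will probably also work. Note that split() was deprecated in PHP 5.3 and removed in PHP 7.0; preg_split() is the regex-based replacement.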

$text = file_get_contents('/path/to/your/file');
// split() was removed in PHP 7.0; preg_split() is the regex-based replacement.
$array = preg_split('/<!-- widget -->/', $text);

The first entry will be everything before the first occurrence of <!-- widget --> and the last element will be everything after the last <!-- widget -->. Every odd-indexed element ([1], [3], and so on) will be what you're looking for.
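One reason to reach for a regular expression rather than a literal delimiter is tolerance for small variations in the marker. The following is only a sketch, assuming the markers may differ in whitespace or letter case; the pattern and the sample string are illustrations, not something taken from the question.

```php
<?php
// Invented sample input whose markers vary in spacing and case.
$text = "head <!--widget--> first <!-- Widget --> middle <!--  widget  --> second <!-- widget --> tail";

// \s* allows any amount of whitespace inside the comment; the i flag ignores case.
$parts = preg_split('/<!--\s*widget\s*-->/i', $text);

// Keep the odd-indexed pieces, which lie between an opening and a closing marker.
$widgets = [];
foreach ($parts as $index => $part) {
    if ($index % 2 === 1 && $index < count($parts) - 1) {
        $widgets[] = trim($part);
    }
}
print_r($widgets);
```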

PHP split function documentation

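// The lazy quantifier (+?) stops at the nearest closing marker, so each widget is matched separately.
$pattern = "/<!-- widget -->([\s\S]+?)<!-- widget -->/";
// $string holds the file contents; $match_array[1] receives the text between the markers.
$count = preg_match_all($pattern, $string, $match_array);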

var_dump($match_array);
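
To illustrate why the quantifier matters when a file contains more than one widget, here is a small sketch using an invented sample string. A greedy + consumes everything from the first marker to the last one, while the lazy +? stops at the nearest closing marker. The variable names are illustrative only.

```php
<?php
// Invented sample input with two widgets.
$string = "skip <!-- widget -->one<!-- widget --> skip <!-- widget -->two<!-- widget --> skip";

// Greedy: a single match running from the first marker to the last one.
preg_match_all('/<!-- widget -->([\s\S]+)<!-- widget -->/', $string, $greedy);

// Lazy: one match per pair of markers.
preg_match_all('/<!-- widget -->([\s\S]+?)<!-- widget -->/', $string, $lazy);

// Index 1 of the result holds the contents of the first capture group.
var_dump(count($greedy[1])); // int(1)
var_dump($lazy[1]);          // the strings "one" and "two"
```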
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow