preg_match_all not finding all the patterns in a large text file

https://stackoverflow.com/questions/21368679

03-10-2022

Question

I'm trying to build a FaH stats scraper. Every hour the newly updated stats list is pulled to my server via cron and wget into this file: http://chrislabs.info/statsFile.txt. The script reads it into $page using file_get_contents.

Then, for a list of unique team numbers (the fourth column), I'm trying to regex all the rows containing that team number using the code below:

foreach($teamArr as $team){
    $pattern = "/(.*[ascii])\t([0-9]*)\t.*[0-9]\t$team/";
    preg_match_all($pattern, $page, $matches);
    echo "<pre>";
    print_r($matches);
    echo "</pre>";
}

However, this isn't finding all the matches in $page and I'm at a loss now as to what to fix. I've changed the pcre.* INI settings to go up to 1GB.

You can look at the output here http://chrislabs.info/FoldingStats_MYSQL.php

Solution

Try to use this:

$pattern = '~^(?:\S++\t){3}' . $team . '$~m';
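
A minimal sketch of how that pattern might be applied, assuming every row has exactly four tab-separated columns with the team number last. The sample rows and the preg_quote() call are illustrative additions, not part of the original answer.

```php
<?php
// Hypothetical sample rows: name, new credit, total, team (tab-separated).
$page = "alice\t100\t2000\t42\n"
      . "bob\t50\t900\t7\n"
      . "carol\t75\t1500\t42\n"
      . "dave\t10\t300\t142\n";

$team = '42';

// preg_quote() is a precaution in case $team ever contains regex metacharacters.
$pattern = '~^(?:\S++\t){3}' . preg_quote($team, '~') . '$~m';

preg_match_all($pattern, $page, $matches);

// $matches[0] holds the full text of every matching row.
// The row for team 142 is not included, because ^ and $ anchor the whole line.
print_r($matches[0]);
```

Note that \S++ does not match spaces, so a name containing a space would make its row fail; [^\t\n]++ in place of \S++ may be a safer choice in that case. If the file uses \r\n line endings, the $ anchor will not match before the \r, so the line endings would need to be normalised first.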

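Your pattern didn't find all the matches because [ascii] is a character class: it matches a single character that is a, s, c or i, so only rows where the character right before the tab is one of those four letters can match. If you want to match any letter, use [a-z] (with the i modifier, or [a-zA-Z], to include upper case).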
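
To make the character-class behaviour concrete, here is a small sketch with made-up names. It mirrors the (.*[ascii]) part of the question's pattern, anchored to the whole name for the sake of the demonstration.

```php
<?php
// [ascii] is a character class: it matches exactly one character
// that is 'a', 's', 'c' or 'i'. It is not a named "ASCII" class.
$names = ['maria', 'chris', 'bob', 'FoldingTeam']; // hypothetical names

$endsInClass = [];
foreach ($names as $name) {
    // Only names whose last character is a, s, c or i can match.
    if (preg_match('/^.*[ascii]$/', $name) === 1) {
        $endsInClass[] = $name;
    }
}

print_r($endsInClass); // maria and chris only
```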

Another way is to read the file with fgetcsv (using a tab as the delimiter) and discard all records that do not belong to the team you are looking for.
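
A rough sketch of the fgetcsv approach, assuming four tab-separated columns. A temporary in-memory stream stands in for the real stats file here, and the sample rows are invented.

```php
<?php
// Hypothetical stand-in for statsFile.txt, written to a temporary stream.
$fh = fopen('php://temp', 'r+');
fwrite($fh, "alice\t100\t2000\t42\nbob\t50\t900\t7\ncarol\t75\t1500\t42\n");
rewind($fh);

$team = '42';
$rows = [];

// Read one tab-separated record at a time; 0 means no line-length limit.
while (($fields = fgetcsv($fh, 0, "\t", '"', '\\')) !== false) {
    // Skip blank or short lines, then keep rows whose fourth column is the team.
    if (isset($fields[3]) && $fields[3] === $team) {
        $rows[] = $fields;
    }
}
fclose($fh);

print_r($rows);
```

With the real file, fopen() would point at the downloaded stats file instead. Because fgetcsv treats the double quote as an enclosure character, names that contain a double quote may not be parsed as expected.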

You can also use explode() twice, first on \n to get the lines and then on \t to get the columns, and check $item[3] for your team.
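
The double explode() idea could look roughly like this. The sample data is invented, and the rtrim() call is an added precaution in case the file uses \r\n line endings.

```php
<?php
// Hypothetical sample of the stats data already loaded into $page.
$page = "alice\t100\t2000\t42\nbob\t50\t900\t7\ncarol\t75\t1500\t42\n";
$team = '42';
$rows = [];

// First explode(): split the text into lines.
foreach (explode("\n", $page) as $line) {
    // Second explode(): split each line into its tab-separated columns.
    $item = explode("\t", rtrim($line, "\r"));

    // The trailing empty line has no index 3, so isset() skips it.
    if (isset($item[3]) && $item[3] === $team) {
        $rows[] = $item;
    }
}

print_r($rows);
```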

Licensed under: CC-BY-SA with attribution
