First of all: Be Careful!
This is a potentially hairy thing with many possible exceptions. The solution I provide does:
- ... not use regexes, which should make the code more readable, maintainable, yada yada yada :)
- ... not check if a value contains pipes ('|'), which will trip up this thing. On the other hand, a value may safely contain colons.
- ... not deal with multi-byte characters.
- ... not care about performance.
- ... assume the key "file" is always present.
- ... not insert missing keys, which should be dealt with elsewhere in that case.
Take these notes into consideration before blindly copy/pasting! ;)
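To make the pipe/colon note concrete, here is a minimal sketch of the splitting behaviour. The sample line is made up for illustration (the real file format may differ), and `$pairs` is just a name I picked for the demo:

```php
<?php
// Hypothetical sample line: the "url" value contains colons, and the
// "note" value contains a pipe.
$line = "file: a.txt | url: http://example.com/a | note: left|right";

$pairs = array();
foreach (explode('|', $line) as $field) {
    // A limit of 2 splits on the first ':' only, so later colons stay
    // inside the value.
    $parts = explode(':', $field, 2);
    $key = trim($parts[0]);
    // A field without any colon has no value part at all.
    $value = isset($parts[1]) ? trim($parts[1]) : null;
    $pairs[] = array($key, $value);
}

var_dump($pairs);
```

The "url" value survives intact, but the pipe inside the "note" value cuts it into two fields: "note" ends up with only "left", and "right" shows up as a bogus key without a value.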
In addition, my solution keeps the filename in each element, which is redundant. But removing it would have made the solution messier without much gain.
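If the redundancy bothers you, it could be stripped in a separate pass afterwards. A minimal sketch, assuming a result shaped as filename => metadata array; the sample data and the "size" key are made up for illustration:

```php
<?php
// Hypothetical result: the outer key is the filename, and each inner
// array repeats it under "file".
$lines = array(
    "a.txt" => array("file" => "a.txt", "size" => "10"),
    "b.txt" => array("file" => "b.txt", "size" => "20"),
);

foreach (array_keys($lines) as $fname) {
    // The filename is already the outer key, so drop the inner copy.
    unset($lines[$fname]["file"]);
}

var_dump($lines);
```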
Here's a solution:
<?php
/**
 * Parse a line of the file. Returns an associative array, using the part
 * before the colon as key, the following part as value.
 *
 * @param string $line A line of text.
 * @return array The key/value pairs found in the line.
 */
function parse_line($line) {
    // split on each '|' character.
    $fields = explode('|', $line);
    $data = array();
    foreach ($fields as $field) {
        // unpack key/value from each 'key: value' text. This will only split on
        // the first ":", so the value may contain colons. array_pad() supplies
        // an empty value when a field has no ":" at all, which would otherwise
        // raise an "undefined offset" notice.
        list($key, $value) = array_pad(explode(':', $field, 2), 2, '');
        // remove surrounding white-space (including the trailing newline).
        $key = trim($key);
        $value = trim($value);
        $data[$key] = $value;
    }
    return $data;
}
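/**
 * Parses a file in the specified format.
 *
 * Returns an associative array, where the key is a filename, and the value is
 * an associative array of metadata.
 *
 * @param string $fname The filename
 * @return array The parsed lines, or an empty array if the file can't be opened.
 */
function parse_file($fname) {
    $handle = fopen($fname, "r");
    $lines = array();
    if ($handle) {
        while (($line = fgets($handle)) !== false) {
            // skip blank lines (e.g. a trailing newline at the end of the file).
            if (trim($line) === '') {
                continue;
            }
            $data = parse_line($line);
            $lines[$data["file"]] = $data;
        }
        fclose($handle);
    } else {
        // error opening the file. fopen() has already emitted a warning;
        // the caller gets an empty array.
    }
    return $lines;
}

var_dump(parse_file("testdata.txt"));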
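As noted in the caveats, missing keys are not inserted. One way to deal with that elsewhere is to merge each parsed line with a set of defaults. This is only a sketch: the key names and default values below are made up, so adjust them to whatever your format actually requires.

```php
<?php
// Hypothetical defaults; these keys are illustrative only.
$defaults = array("file" => "", "title" => "(untitled)", "author" => "(unknown)");

// Hypothetical parsed line that lacks the "author" key.
$data = array("file" => "a.txt", "title" => "Report");

// The array union operator keeps every value already in $data and only
// adds the keys from $defaults that are missing.
$complete = $data + $defaults;

var_dump($complete);
```

I used `+` rather than `array_merge()` here so the parsed values sit on the left and win; `array_merge($defaults, $data)` gives the same values for string keys, just in a different key order.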