perl regex using too much memory?

Question 1

This has nothing to do with regexes.

If you are operating in a memory-constrained environment, you should process data records one at a time rather than fetching all of them at once. Let's assume you pull your data like:

my @data = `some command`;
for my $line (@data) {
    ... # process the line
}

This is wasteful because you need storage for all of the raw data at once, in addition to the output of your processing (in your case: the hash).

Instead, process the input line by line. We can use the open function instead of backticks for this:

open my $cmd, '-|', 'some', 'command' or die "Can't run some command: $!";
while (my $line = <$cmd>) {
    ... # process the line
}

There is no need for an array here, which saves us 13MB of memory that we can now put to other use.

Question 2

What problem are you really trying to solve? Use your words... not Perl.

Something like: "The script is picking apart the output from an OpenVMS DIRECTORY command and the objective is to report the number of files and the oldest date, ordered by directory."

First question is WHY keep the array. Will the script 'walk' it again? If not, just process each line there and then in a for loop.

The regex seems to pick out a file name and a date. That's been done before. It is not hard, and can be simplified by trusting the OpenVMS directory format. Something like this reads better imho:

if ($new_element =~ m|](.*);\d+\s+(\d+)-(\w+)-(\d+)\s+(\d+):(\d+):(\d+)|)
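
To see what that pattern captures, here is a minimal sketch run against one made-up line. The sample line and the %parsed hash are my own assumptions for illustration; check them against what your DIRECTORY command really prints.

```perl
use strict;
use warnings;

# Hypothetical sample line; real DIRECTORY output may differ in detail.
my $new_element = 'DISK$USER:[SMITH.DATA]REPORT.TXT;3    12-JAN-2012 10:30:45.12';

my %parsed;
if ($new_element =~ m|](.*);\d+\s+(\d+)-(\w+)-(\d+)\s+(\d+):(\d+):(\d+)|) {
    # $1 is the file name after the "]", $2..$7 are the date and time fields.
    @parsed{qw(name day month year hour min sec)} = ($1, $2, $3, $4, $5, $6, $7);
    print "$parsed{name} dated $parsed{day}-$parsed{month}-$parsed{year}\n";
}
```

Note that the version number after the ";" is matched but not captured, so two versions of the same file would give the same name.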

: $hash_var{arr[0]} =

Hmmm, that suggests to me that a whole line from the array is used as a key value, with all 50+ spaces. So those 10,000 lines turn into 1,000,000+ bytes just for raw key bytes. A lot, but not crazy. Now, we know that the first word on the line MUST be unique, so why not exploit that: $hash_var{$1} = xxx if /(\S+)/;

The program may also want to exploit that the leading strings are highly repetitive, and substitute everything before the "]" with an ever-increasing directory number, maintained in a 'look-aside' array and/or hash.
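
A rough sketch of that directory-number idea, in case it helps. The names short_key, long_key, %dir_id and @dir_name are hypothetical, and the file specifications in the usage lines are made up.

```perl
use strict;
use warnings;

my (%dir_id, @dir_name);   # look-aside tables: path -> number, number -> path

# Turn a full file specification into a short key such as "0]REPORT.TXT;3".
sub short_key {
    my ($spec) = @_;
    my ($dir, $file) = $spec =~ /^(.*\])(\S+)/ or return $spec;
    unless (exists $dir_id{$dir}) {
        push @dir_name, $dir;          # first time we see this directory
        $dir_id{$dir} = $#dir_name;    # its number is its index in the array
    }
    return "$dir_id{$dir}]$file";
}

# Rebuild the full specification from a short key.
sub long_key {
    my ($key) = @_;
    my ($id, $file) = $key =~ /^(\d+)\](.*)/ or return $key;
    return $dir_name[$id] . $file;
}

my $key = short_key('DISK$USER:[SMITH.DATA]REPORT.TXT;3');
print "$key\n";
print long_key($key), "\n";
```

Each distinct directory string is then stored once, and every hash key only carries a small number in front of the file name.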

Personally I would drop /NOHEAD from the command, and use a regex to pick up the directories as they come by on their own lines.
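
Something along these lines, as a sketch only. The sample lines are my assumption of what the output looks like with the headings left in, and %count and $dir are names I made up. In the real script the loop would read from the command instead of from an array.

```perl
use strict;
use warnings;

# Hypothetical excerpt of DIRECTORY/DATE output with the headings kept.
my @lines = (
    'Directory DISK$USER:[SMITH.DATA]',
    '',
    'NOTES.TXT;1          3-MAR-2011 09:15:02.44',
    'REPORT.TXT;3        12-JAN-2012 10:30:45.12',
    '',
    'Total of 2 files.',
    '',
    'Directory DISK$USER:[SMITH.LOGS]',
    '',
    'RUN.LOG;7            1-FEB-2012 23:59:59.00',
    '',
    'Total of 1 file.',
);

my (%count, $dir);
for my $line (@lines) {
    if ($line =~ /^Directory\s+(\S+)/) {
        $dir = $1;                 # remember the current directory
    }
    elsif (defined $dir && $line =~ /^\S+;\d+\s+\d+-\w+-\d+/) {
        $count{$dir}++;            # a file line belongs to that directory
    }
}
print "$_: $count{$_} file(s)\n" for sort keys %count;
```

The directory string is then stored once per directory instead of once per file. Long file names may wrap onto a second line in real output, which this sketch does not handle.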

Or use substr or whatever... of course you'd need to construct the same kind of key again when you look an entry up later.

In the related topic, there is debugging output printed. Perhaps include the line number in the array for your own understanding?

Perl encounters "out of memory" in openvms system

Good luck! Hein