Question

I'm writing a script to read a large file (>10 GB) and append data from an array to the end of each line in that file. Here is my code:

   my $count=0;
   while(my $lines = <$FILE>){
        seek $FILE, length($lines), 1;
        print $FILE "\t", $array[$count];
        $count++;
        }

But I think I'm using seek incorrectly to find the end of each line, and I can't get my head around it. Can anyone please see what's wrong with this code?

Before processing:

my 1st line
my 2nd line
my 3rd line

After processing:

my 1st line data1
my 2nd line data2
my 3rd line data3

data1, data2 and data3 are the elements of @array.

Details on the code:

  • FILE is opened in +< mode (read/write)
  • FILE lines are tab delimited.
  • @array holds data1, data2, ...

Issues:

  • Moving the pointer to end of each line

Thanks,

Robin


Solution

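You can't do that. Seeking to a location in a file and then printing to it overwrites the data at that position; it does not insert anything. Note also that after reading a line the file position is already at the end of that line, so seeking forward by length($lines) relative to the current position skips into the following data.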
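To see the overwriting behaviour for yourself, here is a minimal sketch. The scratch file name seek_demo.txt is made up for this demonstration, and the handles are opened in :raw mode so the byte offsets are the same on every platform.

```perl
use strict;
use warnings;

# Hypothetical scratch file, used only for this demonstration.
my $path = 'seek_demo.txt';

# Create a small three-line file.
open my $out, '>:raw', $path or die "Cannot create $path: $!";
print {$out} "my 1st line\nmy 2nd line\nmy 3rd line\n";
close $out or die "Cannot close $path: $!";

# Reopen read/write, move to the newline that ends line 1, and print there.
open my $fh, '+<:raw', $path or die "Cannot open $path: $!";
seek($fh, length('my 1st line'), 0) or die "Cannot seek: $!";
print {$fh} "\tdata1";
close $fh or die "Cannot close $path: $!";

# Read the whole file back to inspect the damage, then clean up.
open my $in, '<:raw', $path or die "Cannot read $path: $!";
my $content = do { local $/; <$in> };
close $in;
unlink $path;

print $content;
```

The printed text replaces the newline and the first five characters of the second line instead of pushing them along, so the first two lines end up fused into one and the file stays the same size.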

I suggest you use Tie::File, which lets you access the contents of a file as an array, so appending to the end of a line of the file is done by simply adding a string to one of the elements of the array.

The code would look like this. Note that the line that creates @newdata is there just for testing. It creates an array that is the same length as the file, with lines like data1, data2 etc. as you have in your question.

You should test this on a smaller file first. Processing the 15GB file will take a while, and the program modifies the file in place, so any bug will destroy your data.

use strict;
use warnings;

use Tie::File;
use Fcntl 'O_RDWR';

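# Each element of @file is one line of 'myfile', without its line ending
tie my @file, 'Tie::File', 'myfile', mode => O_RDWR or die $!;

# Test data only: data1, data2, ... with one entry per line of the file
my @newdata = map sprintf('data%d', $_ + 1), 0 .. $#file;

# $line is an alias for the array element, so appending to it changes the file
my $i = 0;
for my $line (@file) {
  $line .= "\t" . $newdata[$i];
  ++$i;
}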

untie @file;
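
One way to follow the advice about testing on a smaller file is to build a throwaway three-line file and run the same approach on it. This is only a sketch: the file name tie_demo.txt and the contents of @array are assumptions taken from the example in the question, and the line-count check is an extra safeguard that is not in the code above.

```perl
use strict;
use warnings;

use Tie::File;
use Fcntl 'O_RDWR';

# Hypothetical scratch file; this is not the real data file.
my $path = 'tie_demo.txt';

# Build the three-line sample from the question.
open my $out, '>', $path or die "Cannot create $path: $!";
print {$out} "my 1st line\nmy 2nd line\nmy 3rd line\n";
close $out or die "Cannot close $path: $!";

# Sample data, as described in the question.
my @array = qw(data1 data2 data3);

tie my @file, 'Tie::File', $path, mode => O_RDWR
    or die "Cannot tie $path: $!";

# Stop before touching anything if the data does not line up with the file.
die "Line count mismatch\n" unless @file == @array;

# Append a tab and the matching array element to every line.
for my $i (0 .. $#file) {
    $file[$i] .= "\t" . $array[$i];
}

untie @file;

# Read the file back to check the result, then clean up.
open my $in, '<', $path or die "Cannot read $path: $!";
chomp(my @result = <$in>);
close $in;
unlink $path;

print "$_\n" for @result;
```

If the three printed lines end in a tab followed by data1, data2 and data3, the same loop can be pointed at a copy of the real file.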
Licensed under: CC-BY-SA with attribution