Question

I want to compare two files so I wrote the following code:

while($line1 = <FH1>){
     while($line2 = <FH2>){
         next if $line1 > $line2;
         last if $line1 < $line2;
     }
     next;
}

My question is: when the outer loop moves on to the next line of file1 and enters the inner loop again, will the inner while statement read from the first line of file2 again, or will it continue where it left off on the previous iteration of the outer loop?

Thanks


Solution

You should always use strict and use warnings at the start of all your programs and declare all variables at their point of first use. This applies especially when you are asking for help with your code.

Is all the data in your files numeric? If not, then enabling warnings would have told you that the < and > operators are for comparing numeric values rather than general strings (use lt and gt for strings).
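To illustrate the difference between the two families of operators, here is a minimal sketch. The compare_lines helper and its $numeric flag are hypothetical names made up for this example; they do not come from the question.

```perl
use strict;
use warnings;

# Compare two lines numerically or as strings; returns -1, 0 or 1.
# (Hypothetical helper, for illustration only.)
sub compare_lines {
    my ($left, $right, $numeric) = @_;
    chomp($left, $right);
    return $numeric ? $left <=> $right : $left cmp $right;
}

print compare_lines("10\n", "9\n", 1), "\n";   # numeric: 10 is greater than 9, prints 1
print compare_lines("10\n", "9\n", 0), "\n";   # string: "10" sorts before "9", prints -1
```

The same pair of lines compares in opposite directions depending on the operator, which is why it matters whether the data is numeric.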

Once a file has been read through completely - i.e. the second loop's while condition terminates - you can read no more data from the file unless you open it again or use seek to rewind to the beginning.
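A small sketch of that behaviour, using an in-memory filehandle as a stand-in for a real file (the count_remaining helper is a made-up name for this example):

```perl
use strict;
use warnings;
use Fcntl qw(:seek);    # exports SEEK_SET

# Count the lines remaining on a handle by reading it to the end.
sub count_remaining {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

my $data = "a\nb\nc\n";             # stand-in for a file on disk
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $first  = count_remaining($fh);  # reads every line
my $second = count_remaining($fh);  # handle is already at end of file
seek($fh, 0, SEEK_SET) or die "Cannot seek: $!";
my $third  = count_remaining($fh);  # reads every line again after rewinding

print "$first $second $third\n";    # prints "3 0 3"
```

The second count is zero because nothing moves the read position back; only the seek (or a fresh open) makes the data available again.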

In general it is better in these circumstances to read the smaller of the two files into an array and use the data from there. If both files are very large then something special must be done.
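As a rough sketch of the read-into-an-array approach, assuming the goal is to find lines of the larger file that also occur in the smaller one (the question does not say, so this is only a guess at the kind of comparison). The common_lines name and the sample data are invented for the example:

```perl
use strict;
use warnings;

# Return the lines of $big_fh that also occur in $small_fh.
sub common_lines {
    my ($small_fh, $big_fh) = @_;

    # Slurp the smaller file once; it is never read again.
    chomp(my @small = <$small_fh>);
    my %seen = map { $_ => 1 } @small;   # lookup table built from the array

    my @common;
    while (defined(my $line = <$big_fh>)) {
        chomp $line;
        push @common, $line if $seen{$line};
    }
    return @common;
}

my ($small_data, $big_data) = ("b\nd\n", "a\nb\nc\nd\ne\n");
open my $small_fh, '<', \$small_data or die "Cannot open: $!";
open my $big_fh,   '<', \$big_data   or die "Cannot open: $!";

print join(',', common_lines($small_fh, $big_fh)), "\n";   # prints "b,d"
```

Each file is read exactly once this way, so there is no need to rewind or reopen either handle.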

What sort of file comparison are you trying to do? Are you making sure that the two files are identical, or that all data in the second file appears in the first, or something else? Please give an example of your two data files so that we can help you better.

OTHER TIPS

The inner while loop never rewinds FH2, so unless last fires first it will consume all the content of the FH2 filehandle while the outer loop is still on the first line from the FH1 handle. If I can intuit what you want to accomplish, one way to go about it would be to read from both handles in the same statement:

while ( defined($line1 = <FH1>) && defined($line2 = <FH2>) ) {
    # 'lt' is for string comparison, '<' is for numbers
    if ($line1 lt $line2) {
        # print a warning?
        last;
    }
}

The inner loop will continue from its last known position in FH2. If you want it to restart from the beginning of the file, you need to put:

seek(FH2, 0, SEEK_SET);

before the inner while. Note the argument order: position first, then whence. SEEK_SET is exported by use Fcntl qw(:seek);
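To see the resume-where-it-left-off behaviour directly, here is a small trace using the same loop shape as the question. The in-memory data and the @trace array are assumptions made for the demonstration:

```perl
use strict;
use warnings;

my $data1 = "1\n2\n3\n";
my $data2 = "1\n2\n3\n4\n";
open my $fh1, '<', \$data1 or die "Cannot open: $!";
open my $fh2, '<', \$data2 or die "Cannot open: $!";

my @trace;    # records which lines of the second handle each outer pass read
while (defined(my $line1 = <$fh1>)) {
    chomp $line1;
    my @read;
    while (defined(my $line2 = <$fh2>)) {
        chomp $line2;
        push @read, $line2;
        last if $line1 < $line2;    # leave the inner loop, as in the question
    }
    push @trace, "$line1:" . join(',', @read);
}
print "$_\n" for @trace;
# prints:
# 1:1,2
# 2:3
# 3:4
```

Each outer pass picks up the second handle exactly where the previous pass stopped; no line of the second file is read twice.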

Documentation for seek is available in perldoc (run perldoc -f seek).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow