Question

I was trying to apply what I have learned about file and resource handling in C++: I would like to write a diff-like utility.

Here is my latest version:

#include <iostream>
#include <cstdlib>
#include <fstream>
#include <string>

int main(int argc, char* argv[])
{
  if(argc!=3)
  {
    std::cout << "error: 2 arguments required, now exiting ..." << std::endl;
    exit (EXIT_FAILURE);
  }

  std::ifstream file_1(argv[1]);
  std::ifstream file_2(argv[2]);

  if( file_1.fail() || file_2.fail() )
  {
    std::cout << "error: can't open files, now exiting ..." << std::endl;
    exit (EXIT_FAILURE);
  }

  std::string dummy_1;
  std::string dummy_2;

  while(!file_1.eof()) // dummy condition
  {
    std::getline(file_1,dummy_1);
    std::getline(file_2,dummy_2);
    std::cout << ((dummy_1==dummy_2) ? "= " : "# ") << dummy_1 << std::endl << "  " << dummy_2 << std::endl;
  }

  return(0);
}

These are my guidelines:

  • compare 2 files
  • the user must pass the names of these 2 files directly to the executable, and only these 2 arguments
  • to cover as much error handling as possible in C++
  • try to avoid platform specific steps or non-portable code

My actual problem is that I don't know how to improve my dummy condition effectively. For now the while loop just follows the length of the first file passed. I would obviously like to go all the way down both files, and to solve this without introducing overkill such as an extra loop to get and compare the lengths of the 2 files before doing the real comparison.

I also would like to know if my approach can be considered safe.

I could also accept answers proposing a solution with the Boost libraries, since they are quite portable and I already know that I will use them for other reasons.

Thanks.


Solution

I started by writing a rather long comment on @Loki Astari's answer, but it became long enough (and, IMO, a sufficiently cleaner way to do the job) that it makes more sense as an independent answer. In this case, you want something close to the standard loop, except that you keep reading as long as a read from either file succeeds. That being the case, @john is right: it's best to avoid using eof() as part of the loop condition.

std::string line1, line2;
static const char *prefixes[] = {"#  ", "=  "};


while (std::getline(file_1, line1) || std::getline(file_2, line2))
    std::cout << prefixes[line1==line2] << line1 << "\n   " << line2 << "\n";

Edit: @user1802174 raised a good point: as it was, the loop didn't actually read data in parallel at all. Since it used ||, which does short-circuit evaluation, whenever the read from the first file succeeded it didn't read anything from the second file. Fortunately, he was wrong about one thing: it is fairly easy to fix. At least in this case, + works fine, although we do have to explicitly cast the result of each getline to bool. I've also added a fix for the fact that upon failure, getline can leave the previous content of the string intact, so we need to explicitly clear the strings on every iteration of the loop to get the desired behavior.

while (line1.clear(), line2.clear(), 
      (bool)std::getline(file_1, line1) + (bool)std::getline(file_2, line2))
{
    std::cout << prefixes[line1==line2] << line1 << "\n   " << line2 << "\n";
}
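
To make the short-circuit problem concrete, here is a minimal sketch that records what each iteration sees under both conditions. The names LinePairs, read_with_or and read_with_plus are hypothetical helpers invented for this illustration, and the functions take std::istream& so that they can be tried on in-memory strings instead of files.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, for trying this without files
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: what each loop iteration sees as (line1, line2).
using LinePairs = std::vector<std::pair<std::string, std::string>>;

// Original condition: `||` skips the second read whenever the first succeeds.
LinePairs read_with_or(std::istream& in1, std::istream& in2)
{
    LinePairs seen;
    std::string line1, line2;
    while (line1.clear(), line2.clear(),
           std::getline(in1, line1) || std::getline(in2, line2))
    {
        seen.emplace_back(line1, line2);
    }
    return seen;
}

// Fixed condition: `+` does not short-circuit, so both reads always happen.
LinePairs read_with_plus(std::istream& in1, std::istream& in2)
{
    LinePairs seen;
    std::string line1, line2;
    while (line1.clear(), line2.clear(),
           (bool)std::getline(in1, line1) + (bool)std::getline(in2, line2))
    {
        seen.emplace_back(line1, line2);
    }
    return seen;
}
```

Fed "a\nb\n" and "x\ny\nz\n" through std::istringstream objects, the || version runs five iterations and never fills both strings at once, while the + version runs three iterations pairing a with x, b with y, and an empty line with z.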

This time I did a quick test. File 1:

line1
line 2

File 2:

line 1
line 2
line 3

result:

#  line1
   line 1
=  line 2
   line 2
#
   line 3

While obviously still not a full-blown diff utility, I think this is doing what was intended.

As in @Loki Astari's answer, this will basically act as if the file with fewer lines was padded with as many empty lines at the end as necessary to match the longer file.
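
For anyone who wants to try this without creating files, the same idea can be sketched as a function over arbitrary streams. This is only an illustration: compare_streams and compare_text are hypothetical names, and the loop uses two named bools instead of the + trick, which is a stylistic substitution rather than the code from this answer.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the line-by-line comparison of two input
// streams to `out`, padding the shorter input with empty lines.
void compare_streams(std::istream& in1, std::istream& in2, std::ostream& out)
{
    static const char* prefixes[] = {"#  ", "=  "};
    std::string line1, line2;

    for (;;)
    {
        // A failed read may leave the old content behind, so reset first.
        line1.clear();
        line2.clear();

        // Read from both streams unconditionally, then decide.
        const bool got1 = static_cast<bool>(std::getline(in1, line1));
        const bool got2 = static_cast<bool>(std::getline(in2, line2));
        if (!got1 && !got2)
            break;

        out << prefixes[line1 == line2] << line1 << "\n   " << line2 << "\n";
    }
}

// Hypothetical convenience wrapper for in-memory text, handy for quick tests.
std::string compare_text(const std::string& text1, const std::string& text2)
{
    std::istringstream in1(text1), in2(text2);
    std::ostringstream out;
    compare_streams(in1, in2, out);
    return out.str();
}
```

With real files, the two std::ifstream objects and std::cout can be passed to compare_streams directly; compare_text("line1\nline 2\n", "line 1\nline 2\nline 3\n") reproduces the quick test shown earlier.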

As an aside, also note the use of "\n" instead of std::endl. Along with inserting a new-line, std::endl also flushes the output buffer, which you almost certainly don't want in this case. Flushing the buffer still produces the correct results, but in many cases is likely to do so much more slowly.

Edit: As far as coding style goes, it probably is a bit better to write the loop as a for loop instead of a while:

for ( ; (bool)std::getline(file_1, line1) + (bool)std::getline(file_2, line2)
      ; line1.clear(), line2.clear())
{
    std::cout << prefixes[line1==line2] << line1 << "\n   " << line2 << "\n";
}

I personally see little real gain from using C++ style casts here. If I wanted to get away from using (bool), I'd probably use another well-known idiom (which, admittedly, many people also dislike):

for ( ; !!std::getline(file_1, line1) + !!std::getline(file_2, line2)
      ; line1.clear(), line2.clear())
{
    std::cout << prefixes[line1==line2] << line1 << "\n   " << line2 << "\n";
}

If somebody really objects to using a comma operator, this is easy to rewrite as:

while (!!std::getline(file_1, line1) + !!std::getline(file_2, line2))       
{
    std::cout << prefixes[line1==line2] << line1 << "\n   " << line2 << "\n";
    line1.clear();
    line2.clear();
}

Personally, I don't consider that an improvement, but others may disagree.

Other suggestions

As john pointed out, using eof() in the loop condition is usually wrong.

In this case, however, I think it is appropriate, but as a result you need to add some extra checks.

while(true)  // exit provided by break.
{
    std::string dummy_1;   // By declaring them here you force them to be 
    std::string dummy_2;   // reset each iteration.

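    // Because you are doing the read inside the loop
    // you need to check whether the reads worked.
    // Read both files unconditionally: combining the two calls with &&
    // would short-circuit and skip the second read whenever the first succeeds.
    const bool read_1 = static_cast<bool>(std::getline(file_1,dummy_1));
    const bool read_2 = static_cast<bool>(std::getline(file_2,dummy_2));
    if (!read_1 && !read_2)
    {
        // Only exit if both reads fail.
        break;
    }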

    // Got here if at least one read worked.
    // A failed read will result in an empty line for comparison.    
    std::cout << ((dummy_1==dummy_2) ? "= " : "# ") << dummy_1 << std::endl << "  " << dummy_2 << std::endl;
}

As usual, eof() is the wrong thing to use in the loop condition. This works, and stops as soon as either file runs out of lines:

while (std::getline(file_1, dummy_1) && std::getline(file_2, dummy_2))
{
    ...
}
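
As a small illustration of where this loop stops, the sketch below counts the line pairs it visits. The function names are hypothetical and the inputs are in-memory strings rather than files.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts how many line pairs the `&&` loop visits.
std::size_t count_line_pairs(std::istream& in1, std::istream& in2)
{
    std::string line1, line2;
    std::size_t pairs = 0;
    // The loop ends as soon as either read fails, so extra lines in the
    // longer input are never compared.
    while (std::getline(in1, line1) && std::getline(in2, line2))
        ++pairs;
    return pairs;
}

// Hypothetical wrapper so the loop can be tried on plain strings.
std::size_t count_line_pairs_in_text(const std::string& text1,
                                     const std::string& text2)
{
    std::istringstream in1(text1), in2(text2);
    return count_line_pairs(in1, in2);
}
```

For a two-line input and a three-line input the count is 2 regardless of argument order, which is why a separate check is needed afterwards to find out which file still has lines left.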

I suggest you read up on what eof() really does. It doesn't do what you think, but it will actually be useful in this program, because you can use it the proper way: to tell which of your two files has hit the end of file.

You can actually use eof() properly in this program to find out which of the two files hit end of file. I would probably write your loop something like this:

for (;;)
{
    getline(file_1, dummy_1);
    getline(file_2, dummy_2);
    if (file_1.eof() || file_2.eof())
        break;
    ...
}
if (file_1.eof() && file_2.eof())
{
    // both at end of file
}
else if (file_1.eof())
{
    // file 1 at end of file
}
else
{
    // file 2 at end of file
}

Notice though that the eof() test comes after the getline(), not before. That's how eof() is supposed to be used.
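
One possible variation on this idea, offered as a sketch rather than as the loop from this answer: check whether each read succeeded first, and consult eof() only after a failed read, so that a final line without a trailing newline is still processed and an I/O error is not mistaken for end of file. EndState and find_end_state are hypothetical names.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, for trying this without files
#include <string>

// Hypothetical result type for this illustration.
enum class EndState { BothEnded, FirstEnded, SecondEnded, ReadError };

// Walks both streams in lockstep and reports how the walk ended.
EndState find_end_state(std::istream& in1, std::istream& in2)
{
    std::string line1, line2;
    for (;;)
    {
        const bool got1 = static_cast<bool>(std::getline(in1, line1));
        const bool got2 = static_cast<bool>(std::getline(in2, line2));
        if (got1 && got2)
            continue;  // a real tool would compare line1 and line2 here

        // A read failed: only now does eof() tell end of file apart
        // from some other failure.
        if ((!got1 && !in1.eof()) || (!got2 && !in2.eof()))
            return EndState::ReadError;
        if (!got1 && !got2)
            return EndState::BothEnded;
        return got1 ? EndState::SecondEnded : EndState::FirstEnded;
    }
}
```

Called with streams over "a\nb" and "x\ny\nz\n", this returns EndState::FirstEnded, and the unterminated line "b" is still read as a normal line before the first stream reports end of file.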

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow