Stdio, cin and cout: Programs for use in unix pipes (like grep, sort, etc)

Question 1

The standard idiom if there are no options is:

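#include <fstream>
#include <iostream>
#include <istream>
#include <string>

int returnCode = 0;
char const* programName = "";       // set from argv[0] in main

void process( std::istream& in );   // does the actual work; defined elsewhere

void
processFile( std::string const& filename )
{
    if ( filename == "-" ) {
        process( std::cin );
    } else {
        std::ifstream in( filename.c_str() );
        if ( !in.is_open() ) {
            std::cerr << programName << ": cannot open " << filename << std::endl;
            returnCode = 1;
        } else {
            process( in );
        }
    }
}

int
main( int argc, char** argv )
{
    programName = argv[0];
    if ( argc == 1 ) {
        processFile( "-" );
    } else {
        for ( int i = 1; i != argc; ++ i ) {
            processFile( argv[i] );
        }
    }
    std::cout.flush();                      // make any write error visible now
    return std::cout ? returnCode : 2;
}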

There are many variants, however. I found myself doing this so often that I wrote a MultiFileInputStream class whose (template) constructor takes a pair of iterators; it then executes more or less the same code as the above. (All of the significant code is, as usual, in the corresponding streambuf.) Similarly, I have a class to parse out the options (which looks like an immutable std::vector<std::string> once the options have been parsed). So the above would become:

int
main( int argc, char** argv )
{
    CommandLine& args = CommandLine::instance();
    args.parse( argc, argv );
    MultiFileInputStream src( args.begin(), args.end() );
    process( src );
    return ProgramStatus::instance().returnCode();
}

(ProgramStatus is another useful class: it handles error output and the return code. When you call returnCode() on it, it also flushes std::cout and adjusts the return code accordingly.)

I'm sure that anyone writing Unix filter programs has developed similar classes.
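For readers who don't have such a class at hand, here is a rough sketch of what the streambuf behind a multi-file input stream might look like. It is not my actual MultiFileInputStream: the name MultiFileStreambuf, the failed() member and the one-character buffer are assumptions made to keep the example short. It reads the named files one after another, treats "-" (or an empty list) as standard input, and just records that a file could not be opened instead of reporting it.

```cpp
#include <cstddef>
#include <fstream>
#include <iostream>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical sketch: a streambuf that chains several input sources.
class MultiFileStreambuf : public std::streambuf
{
public:
    template< typename Iter >
    MultiFileStreambuf( Iter begin, Iter end )
        : names_( begin, end ), next_( 0 ), current_( 0 ),
          buffer_( 0 ), failed_( false )
    {
        if ( names_.empty() ) {
            names_.push_back( "-" );        // no arguments: read std::cin
        }
    }

    // True if at least one file could not be opened.
    bool failed() const { return failed_; }

protected:
    virtual int_type underflow()
    {
        for ( ;; ) {
            if ( current_ != 0 ) {
                int_type c = current_->sgetc();
                if ( c != traits_type::eof() ) {
                    // Move one character into our own one-char get area.
                    buffer_ = traits_type::to_char_type( c );
                    current_->sbumpc();
                    setg( &buffer_, &buffer_, &buffer_ + 1 );
                    return c;
                }
                current_ = 0;               // this source is exhausted
            }
            if ( next_ == names_.size() ) {
                return traits_type::eof();  // no more sources
            }
            std::string const& name = names_[ next_ ++ ];
            if ( name == "-" ) {
                current_ = std::cin.rdbuf();
            } else {
                file_.close();
                file_.open( name.c_str(), std::ios_base::in );
                if ( file_.is_open() ) {
                    current_ = &file_;
                } else {
                    failed_ = true;         // remember it, try the next name
                }
            }
        }
    }

private:
    std::vector< std::string > names_;
    std::size_t next_;
    std::streambuf* current_;
    std::filebuf file_;
    char buffer_;
    bool failed_;
};
```

In main, you would construct it with something like MultiFileStreambuf sb( argv + 1, argv + argc ), wrap it in a std::istream in( &sb ), and pass that to process. A real version would want a larger buffer and an error message naming the file that failed.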

With regard to question 2: sync_with_stdio is a static member of std::ios_base, so you can call it without an object: std::ios_base::sync_with_stdio( false );. I find this less misleading than calling it on a particular stream, since the call will affect all of the standard iostream objects. If the IO handling is a bottleneck, by all means do it, but most of the time, I don't bother. It's rare for such programs to need any sort of optimization. (Note that if you do call sync_with_stdio( false ), then you should not use any C style IO. But I can't see any reason to use C style IO anyway.)

With regard to question 3: error messages go to std::cerr, always. You also want to be sure to return a non-zero return code, even if the error wasn't fatal. Something like:

myprog file1 > tmp && mv tmp file1

is all too common, and if you have some problem, and don't generate the output, it's a disaster if you don't return a non-zero error code. (That's why I always flush and then check the status of std::cout. A long, long time ago, a user of my program did the above, with a very large file, and the disk was full. It wasn't quite as full afterwards. Since then: always flush std::cout, and check that it worked, before returning OK.)

Question 2

Are you sure you want to use C++? Most operating systems rely more on C and assembly than on C++. If you're going to write applications, then C++ could be a good choice, but the operating system itself and its utilities, shells and helper programs are usually written in C. You can look through the source of your Linux or BSD system to see how it handles pipes, standard input and standard output. If you think that C is something for you, you could read "The C Programming Language" by Kernighan and Ritchie; it has many examples of how to write a good C program that uses pipes, standard I/O and command-line arguments.