Question

I have the following program in C, which is intended to convert UNIX text files to Windows format (LF -> CR LF). The intended usage is addcr infile > outfile on the command line:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
    FILE *fp;
    char *buffer;
    int i, flen;
    if(argc<2)
    {
        printf("Usage: addcr filename\n");
        return 0;
    }
    fp=fopen(argv[1], "r");
    if(fp==NULL)
    {
        printf("Couldn't open %s.\n", argv[1]);
        return 0;
    }
    fseek(fp, 0, SEEK_END);
    flen=ftell(fp);
    rewind(fp);
    buffer=(char*)malloc(flen+1);
    fread(buffer, 1, flen, fp);
    fclose(fp);
    buffer[flen]=0;
    for(i=0;i < strlen(buffer);i++)
    {
        if(buffer[i]==0x10)
        {
            printf("%c", '\r');
        }
        printf("%c", buffer[i]);
    }

    free(buffer);
    return 0;
}

However, sometimes it prints out garbage at the end of the file contents, as indicated by comparing its output to the TYPE command:

C:\Temp>addcr sample.txt
He did not wear his scarlet coat,
                 For blood and wine are red,
               And blood and wine were on his hands
                 When they found him with the dead,
               The poor dead woman whom he loved,
                 And murdered in her bed.
Window
C:\Temp>type sample.txt
He did not wear his scarlet coat,
                 For blood and wine are red,
               And blood and wine were on his hands
                 When they found him with the dead,
               The poor dead woman whom he loved,
                 And murdered in her bed.

C:\Temp>

It appears to sometimes print out some unpredictable portion of a string in my Environment Variables. I have absolutely no clue what could be causing it. Does anyone know how to resolve this problem?


Solution

I think what's happening here is that the input file already has CRLF line delimiters and you have opened it in text mode. When you then call fread, the runtime translates each CR LF pair into a single line feed ('\n').

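The file size you obtained from ftell counts the raw bytes on disk, so it is larger than the translated text you read in by one byte per line, which is 6 bytes for this sample. fread fills only the translated length, so the last 6 bytes before position flen, where you terminate the buffer, are uninitialised. That is the garbage you see printed.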

fread will actually return the number of bytes read. You should pay attention to this value.

size_t bytes_read = fread(buffer, 1, flen, fp);

Try it. Output the value of bytes_read and the value of flen. I'll bet they're different. Also, you really don't have to terminate your buffer and use strlen to get the length. It's actually quite ugly to do that. You already know the length -- it's bytes_read. So use that in your loop.
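As a minimal sketch of that advice, the helper below takes its length from fread's return value rather than from ftell. The name read_all and its signature are made up for illustration and are not part of the original program.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole of `fp` into a malloc'd buffer.
   On success returns the buffer and stores in *len the number of bytes
   fread actually delivered, which is the length the caller should use.
   Returns NULL on failure. The caller frees the buffer. */
static char *read_all(FILE *fp, size_t *len)
{
    long size;
    char *buf;

    if (fseek(fp, 0, SEEK_END) != 0) return NULL;
    size = ftell(fp);            /* upper bound only; may exceed what fread returns */
    if (size < 0) return NULL;
    rewind(fp);

    buf = malloc((size_t)size + 1);
    if (buf == NULL) return NULL;

    *len = fread(buf, 1, (size_t)size, fp);
    if (ferror(fp)) { free(buf); return NULL; }
    buf[*len] = '\0';            /* terminate at the data actually read, not at size */
    return buf;
}
```

The output loop would then run for i from 0 up to the stored length, with no call to strlen.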

If you want to avoid this confusion, you should open the file in binary mode -- "rb", not "r".
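Putting these points together, here is one possible sketch of the conversion itself. It assumes both streams are opened in binary mode, loops over the count fread returns, and skips line feeds that already follow a carriage return. The function name lf_to_crlf and the chunk size are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out`, writing CR LF for every bare LF.
   Both streams are assumed to be in binary mode ("rb" / "wb"), so the
   C runtime performs no newline translation of its own.
   Returns 0 on success, -1 on an I/O error. */
static int lf_to_crlf(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n, i;
    int prev = 0;   /* last byte seen, carried across chunks */

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* Loop over the bytes fread actually delivered. */
        for (i = 0; i < n; i++) {
            if (buf[i] == '\n' && prev != '\r') {
                if (putc('\r', out) == EOF) return -1;
            }
            if (putc(buf[i], out) == EOF) return -1;
            prev = (unsigned char)buf[i];
        }
    }
    return ferror(in) ? -1 : 0;
}
```

A main function could open argv[1] with "rb" and an output file with "wb" and pass both in. Writing to stdout instead would only be correct on Windows if stdout were switched out of text mode first, since otherwise the runtime adds a second '\r'.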

OTHER TIPS

Since stdout works in text mode, if you're running this on a Windows OS you shouldn't be writing the '\r' explicitly. The runtime will translate '\n' to '\r' '\n' automatically (and do it in the proper order!).

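buffer does not have a \0 at the end of the data that was actually read, only at position flen, so strlen(buffer) will keep counting through the uninitialised bytes until it happens to find a \0. It can therefore return slightly more than the actual length of the text in buffer.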

Licensed under: CC-BY-SA with attribution