Question

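I thought stripping the null bytes out of a file would be an easy task. After a couple of failed tries, I fell back on the tried-and-true approach: write the filtered bytes to a temp file, then reopen it and copy it back over the original: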

#include <stdlib.h>
#include <stdio.h>
int main()
{
     FILE *f = fopen("main2.c","r");
     FILE *t = fopen("temp","w");
     int c;
     int count = 0;
     while((c = fgetc(f))!=EOF)
     {
          if(c)
          {
               fputc(c,t);
          }
          else
          {
               printf("null found\n");
          }
    }
    fclose(f);
    fclose(t);
    FILE *n = fopen("main2.c","w");
    FILE *w = fopen("temp","r");
    while((c=fgetc(w))!=EOF)
    {
          fputc(c,n);
    }
    fclose(n);
    fclose(w);
    return 0;
}

This just spits out a bunch of Chinese characters. Could the underlying character encoding be the issue? Or am I just a total noob here?

My hex editor won't let me copy and paste, and I don't know how else to post the file here in its original condition, so I have put a zipped copy on Google Docs. Let me know if you can't get it:

https://docs.google.com/open?id=0B4UPOuCR5uRGZzJQZUpVaktKYlk

EDIT: Wait, here it is via HxE Edit:

FF FE 23 00 69 00 6E 00 63 00 6C 00 75 00 64 00 65 00 20 00 3C 00 73 00 74 00 64 00
6C 00 69 00 62 00 2E 00 68 00 3E 00 0D 00 0A 00 23 00 69 00 6E 00 63 00 6C 00 75 00 64 00
65 00 20 00 3C 00 61 00 6C 00 6C 00 65 00 67 00 72 00 6F 00 2E 00 68 00 3E 00 0D 00 0A 00
23 00 69 00 6E 00 63 00 6C 00 75 00 64 00 65 00 20 00 22 00 6D 00 6F 00

Solution

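Odds are that you are removing NUL bytes because the input is UTF-16 text. If so, you must also remove the byte-order mark (BOM) at the start of the file. If the first two bytes are 0xFF, 0xFE, you have a little-endian UTF-16 file. Discard them! If you leave them in, any program that opens the output will still take it for UTF-16, so every pair of ASCII characters in your source will be treated as one combined 16-bit character code. Strangeness will ensue: for example, the bytes 0x23 0x69 ("#i") are read as the single code unit U+6923, which falls in the CJK ideograph range. That is where the Chinese characters come from.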

Likewise, if the first two bytes are 0xFE, 0xFF, the file is big-endian UTF-16 and you must delete those two bytes as well. Otherwise the file will again be read as 16-bit codes, only with the high byte first.
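The cleanup described above could look roughly like the sketch below. It assumes the file contents have already been read into memory in binary mode and that the text is pure ASCII stored as UTF-16. The function name strip_bom_and_nuls is made up for illustration and does not come from the original posts.

```c
#include <stddef.h>

/* Hypothetical helper (name invented for this example).
 * Copies src to dst, skipping a leading UTF-16 BOM (FF FE or FE FF)
 * and every zero byte. dst must have room for at least n bytes.
 * Returns the number of bytes written to dst.
 *
 * Assumption: the text is ASCII-only. For characters outside ASCII the
 * high byte is not zero, and dropping bytes like this would corrupt them. */
size_t strip_bom_and_nuls(const unsigned char *src, size_t n,
                          unsigned char *dst)
{
    size_t i = 0;
    size_t out = 0;

    /* Skip the byte-order mark if one is present. */
    if (n >= 2 &&
        ((src[0] == 0xFF && src[1] == 0xFE) ||   /* little-endian BOM */
         (src[0] == 0xFE && src[1] == 0xFF))) {  /* big-endian BOM */
        i = 2;
    }

    /* Keep every remaining byte that is not zero. */
    for (; i < n; i++) {
        if (src[i] != 0) {
            dst[out++] = src[i];
        }
    }
    return out;
}
```

Fed the first bytes of the hex dump in the question (FF FE 23 00 69 00 6E 00), this writes the three bytes "#in". A real converter for text that may contain non-ASCII characters would decode each 16-bit unit and re-encode it, for instance as UTF-8, rather than dropping bytes.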

OTHER TIPS

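Open the files in binary mode, so that the C library passes every byte through unchanged (on Windows, text mode translates line endings):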

 FILE *f = fopen("main2.c","rb");
 FILE *t = fopen("temp","wb");
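As a rough illustration of the same tip, here is a sketch that reads a whole file into memory through a stream opened with "rb", so the bytes arrive exactly as they are stored on disk. The helper name read_all_binary is hypothetical and the error handling is kept minimal.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name invented for this example).
 * Reads the whole file at path in binary mode.
 * On success returns a malloc'd buffer and stores its length in *len;
 * the caller must free() the buffer. Returns NULL on any failure. */
unsigned char *read_all_binary(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");   /* "rb": no newline translation */
    unsigned char *buf = NULL;
    size_t cap = 0, used = 0, got;

    if (!f) {
        return NULL;
    }
    for (;;) {
        if (used == cap) {
            /* Grow the buffer geometrically when it is full. */
            size_t new_cap = cap ? cap * 2 : 4096;
            unsigned char *tmp = realloc(buf, new_cap);
            if (!tmp) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        got = fread(buf + used, 1, cap - used, f);
        used += got;
        if (got == 0) {
            break;                 /* end of file or read error */
        }
    }
    if (ferror(f)) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *len = used;
    return buf;
}
```

The result should be written back through a stream opened with "wb" for the same reason.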
Licensed under: CC-BY-SA with attribution