Discrepancy with fgetc while reading a text file

https://stackoverflow.com/questions/19541137

c
fgetc

01-07-2022

Question

I'm just starting out with C and want to understand some of its behavior.

I have a text file, created with Notepad or directly from the shell with echo, on Windows.

When I run the program below, the output shows extra characters. What am I doing wrong? How can I safely read a text file character by character?

I'm using Code::Blocks with MinGW.

file.txt:

TEST

C program

void main()
{
   int i;
   FILE *fp;

   fp = fopen("file.txt","r");

   while ((i = fgetc(fp)) != EOF)
   {
      printf("%c",i);
   }
}

Output

■T E S T

Solution

Your code has issues, but the result is fine.

Your file is likely UTF-8 with (confusingly enough) a byte order mark at the beginning. Your program is (correctly) reading and printing the bytes of the BOM, which then appear in the output as strange characters before the actual text.
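One way to check this guess is to look at the first few bytes of the file. The following is only a sketch: `detect_bom`, `file_bom`, and the `bom_kind` enum are hypothetical names made up for illustration, not part of any library. It assumes the file is opened in binary mode and only distinguishes the UTF-8 and UTF-16 marks.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type for this sketch. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16LE, BOM_UTF16BE };

/* Classify the byte order mark, if any, at the start of a buffer.
   Note: a UTF-32LE mark (FF FE 00 00) would be reported as UTF-16LE here. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16BE;
    return BOM_NONE;
}

/* Read up to three bytes from the start of a file and classify them.
   Returns 0 on success, -1 if the file cannot be opened. */
int file_bom(const char *path, enum bom_kind *kind)
{
    unsigned char buf[3];
    size_t n;
    FILE *fp = fopen(path, "rb");   /* binary mode: no text translation */

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    *kind = detect_bom(buf, n);
    return 0;
}
```

Calling `file_bom("file.txt", &kind)` and inspecting `kind` would show whether the file starts with a UTF-8 or a UTF-16 mark.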

Of course, UTF-8 should never need a byte order mark (it's 8-bit bytes!), but that doesn't prevent some less clued-in programs from including one. Windows Notepad is first on the list of such programs.

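UPDATE: I didn't consider the spacing between your letters, which of course indicates 16-bit input. In UTF-16, each of these letters is stored as two bytes, one of which is zero, and the zero bytes show up as the gaps. That's your problem right there, then: your C code is not reading wide characters.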
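If the file really is 16-bit, one option is to read it two bytes at a time. The sketch below assumes the file is UTF-16 little-endian (an assumption based on the output shown in the question, not something confirmed there). The function names `read_utf16le_unit` and `utf16le_to_ascii` are hypothetical, and anything outside the ASCII range is simply replaced with `?`, so this is not a full UTF-16 decoder.

```c
#include <stdio.h>

/* Read one UTF-16LE code unit (two bytes, low byte first).
   Returns the unit (0..0xFFFF), or -1 at end of file or on a truncated unit. */
long read_utf16le_unit(FILE *fp)
{
    int lo, hi;

    lo = fgetc(fp);
    if (lo == EOF)
        return -1;
    hi = fgetc(fp);
    if (hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);
}

/* Copy the text of a UTF-16LE stream into out as ASCII.
   Non-ASCII code units (including each half of a surrogate pair) become '?'.
   Writes at most cap - 1 characters plus a terminating '\0'.
   Returns the number of characters written, not counting the terminator. */
size_t utf16le_to_ascii(FILE *in, char *out, size_t cap)
{
    size_t n = 0;
    long u;

    if (cap == 0)
        return 0;
    while (n + 1 < cap && (u = read_utf16le_unit(in)) != -1) {
        if (u == 0xFEFF)
            continue;               /* skip the BOM (U+FEFF) wherever it appears */
        out[n++] = (u < 0x80) ? (char)u : '?';
    }
    out[n] = '\0';
    return n;
}
```

The stream passed in should be opened with `fopen("file.txt", "rb")`, since binary mode keeps the C runtime on Windows from applying text-mode translation to the bytes.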

OTHER TIPS

Try this code

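#include <stdio.h>

int main(void)
{
   int i;
   FILE *fp;

   /* Open the file and stop early if it cannot be opened. */
   fp = fopen("file.txt","r");
   if (fp == NULL)
   {
      perror("file.txt");
      return 1;
   }

   /* Print the file byte by byte; fgetc returns EOF at end of file. */
   while ((i = fgetc(fp)) != EOF)
   {
      printf("%c",i);
   }
   fclose(fp);
   return 0;
}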

Licensed under: CC-BY-SA with attribution
