Question

I have some C code that should print the entire contents of a file. The program prints the first file just fine, but when it prints the second one I keep seeing a Unicode character where there definitely should not be one.

int c = fgetc(file);
putchar((!isprint(c) ? : c));

(wrapped in a while (!feof(file)) loop)
This should only print printable ASCII characters, unless I'm mistaken. Regardless, the first thing it prints is \357\277\275, which isn't ASCII and isn't printable.

The file contains only this: foo+bar.foo+t-bar.foo+completely fake

and it prints this: �foo+bar.foo+t-bar.foo+completely fake (with a newline between the strange character and the rest).

Simply printing it all (a la putchar(c)) puts the exact same character at the end of the line.

I've even tried using another file (by renaming the old and using a soft link to another), but I get the exact same results.

It also does this if the file is empty.

The file is completely plain text, created with vim, and there isn't anything special about it.

Here's the original code:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <errno.h>

int main(void)
{
    char *headp = "../include/header";
    char *listp = "../.piclist";
    FILE *head, *list;

    puts("Content-Type: text/html; charset=utf-8\nExpires: 0\n");

    puts("<!DOCTYPE html>\n<html lang='en'>\n<head>");
    puts("\t<title>Foo</title>");
    puts("\t<link rel='stylesheet' href='/css/main.css' />");
    puts("\t<link rel='stylesheet' href='/css/foo.css' />");
    puts("</head>\n<body>");

    head = fopen(headp, "r");
    if (errno) {
            perror("cannot open include/header");
            errno = 0;
    } else {
            while (!feof(head)) putchar(fgetc(head));
            putchar('\n');
    fclose(head);
    }

    list = fopen(listp, "r");
    if (errno) perror("cannot open .piclist");
    else {
    while (!feof(list)) {
            while (!feof(list)) {
                    int c = fgetc(list);
                    putchar((!isprint(c) ? : c));
            }
    }
    fclose(list);
    } /* else */

    fputs("\n<footer>\n\t<hr />\n\t<p>Copyright 2011-2012 the ", stdout);
    fputs("<a href='mailto:foo@example.com'> ", stdout);
    fputs("Foo Bar of Baz</a> of ", stdout);
    fputs("<a href='http://blah.org'>Blah United ", stdout);
    fputs("</a></p>\n</footer>\n</body>\n</html>\n",stdout);

    return 0;
}

Solution

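Don't use feof() (at least not the way you use it). It only becomes true after a read has already failed, so your loop runs one extra time and putchar() is handed EOF.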
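
To make the difference concrete, here is a minimal sketch of both loop shapes. The function names copy_with_feof and copy_until_eof are made up for illustration and are not part of the original program. On a three-byte file the first loop writes four bytes, and the extra one is EOF (-1) converted to 0xFF. That byte is not valid UTF-8, which is presumably why whatever displayed the output showed the replacement character U+FFFD (\357\277\275 in octal).

```c
#include <stdio.h>

/* Pattern from the question: feof() is only true after a read has
   already hit end-of-file, so the body runs once more than there are
   bytes and writes EOF (-1) converted to the byte 0xFF. */
long copy_with_feof(FILE *in, FILE *out)
{
    long written = 0;
    while (!feof(in)) {
        fputc(fgetc(in), out);
        written++;
    }
    return written;
}

/* Read first, then test the value fgetc() returned. */
long copy_until_eof(FILE *in, FILE *out)
{
    long written = 0;
    int ch;
    while ((ch = fgetc(in)) != EOF) {
        fputc(ch, out);
        written++;
    }
    return written;
}
```

On an empty file the first function still writes one byte, which matches the observation that the stray character also shows up when the file is empty.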

Also, your usage of errno is wrong. errno only holds a meaningful value after a call has signalled failure through its return value (usually NULL or -1).
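
As a small sketch of that rule, the helper below (open_for_reading is a hypothetical name, not something from the original code) tests the return value of fopen() and looks at errno only on failure. It assumes a POSIX-like system, where fopen() sets errno when it fails; the C standard itself does not promise that.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decide success from fopen()'s return value and
   consult errno only after the call has reported failure. */
FILE *open_for_reading(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* errno is meaningful here because fopen() just failed
           (guaranteed by POSIX, assumed for this sketch). */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    }
    return fp;
}
```

A caller would then write something like head = open_for_reading(headp); if (head) { ... fclose(head); } and never test errno directly.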

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <errno.h>

int main(void)
{
    char *headp = "../include/header";
    char *listp = "../.piclist";
    FILE *head, *list;
    int ch;

    puts("Content-Type: text/html; charset=utf-8\nExpires: 0\n");

    puts("<!DOCTYPE html>\n<html lang='en'>\n<head>");
    puts("\t<title>Warrenton Latin School | Gallery</title>");
    puts("\t<link rel='stylesheet' href='/css/main.css' />");
    puts("\t<link rel='stylesheet' href='/css/gallery.css' />");
    puts("</head>\n<body>");

    head = fopen(headp, "r");
    if (!head) {
            perror("cannot open include/header");
            errno = 0;
    } else {
        while (1) {
            ch = fgetc(head);
            if (ch == EOF) break;
            putchar(ch);
        }
        putchar('\n');
        fclose(head);
    }

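    list = fopen(listp, "r");
    if (!list) perror("cannot open .piclist");
    else {
        while (1) {
            ch = fgetc(list);
            if (ch == EOF) break;
            /* write only printable characters; skip everything else */
            if (isprint(ch)) putchar(ch);
        }
        /* close only a stream that was actually opened */
        fclose(list);
    }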

    fputs("\n<footer>\n\t<hr />\n\t<p>Copyright 2011-2012 the ", stdout);
    fputs("<a href='mailto:warrentonlatinschool@gmail.com'> ", stdout);
    fputs("Warrenton Latin School</a> co-op of ", stdout);
    fputs("<a href='http://warrentonumc.org'>Warrenton United ", stdout);
    fputs("Methodist Church</a></p>\n</footer>\n</body>\n</html>\n",stdout);

    return 0;
}

Other tips

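Setting aside possible mistakes in your code: in the default "C" locale on an ASCII system, isprint() returns nonzero only for 0x20 through 0x7E. The control characters 0x00 - 0x1f and 0x7f are not printable, and how bytes above 0x7f are classified depends on the current locale.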

Things like a UTF-8 BOM and other bytes outside 7-bit ASCII can therefore still reach the output in some locales, although how they are displayed depends on the encoding.
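
A quick way to check how isprint() classifies a range of byte values on your own system is a counting loop like the sketch below (count_printable is an illustrative name). It assumes the program has not called setlocale(), so the "C" locale is in effect, and that the character set is ASCII-based.

```c
#include <ctype.h>

/* Count the values in [lo, hi] that isprint() accepts in the current
   locale. Both bounds must lie in 0..UCHAR_MAX, because isprint() is
   only defined for EOF and values representable as unsigned char. */
int count_printable(int lo, int hi)
{
    int n = 0;
    for (int c = lo; c <= hi; c++) {
        if (isprint(c))
            n++;
    }
    return n;
}
```

Under those assumptions, count_printable(0x20, 0x7E) covers space through tilde, and count_printable(0x00, 0x1F) covers the control characters.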

When you leave the 2nd operand of ?: empty (a GNU extension, not standard C), the expression yields the value of the condition whenever the condition is nonzero. For non-printable characters, isprint(c) returns 0, so the condition of the ternary operator is !0, which equals 1. putchar() therefore writes the byte 0x01, a control character, instead of skipping the non-printable one.
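
The sketch below spells that out in standard C. question_expression reproduces what !isprint(c) ? : c evaluates to without relying on the extension, and keep_or_placeholder is one possible replacement; both names and the '?' placeholder are assumptions made for illustration, since the original code does not say what should be printed instead.

```c
#include <ctype.h>

/* Standard-C equivalent of the GNU expression `!isprint(c) ? : c`:
   yields 1 for non-printable input and c itself for printable input.
   c must be EOF or a value returned by fgetc(). */
int question_expression(int c)
{
    int not_printable = !isprint(c);
    return not_printable ? not_printable : c;
}

/* One possible fix with both branches written out: keep printable
   characters and substitute '?' (an arbitrary placeholder) otherwise. */
int keep_or_placeholder(int c)
{
    return isprint(c) ? c : '?';
}
```

If the goal is to drop non-printable characters rather than replace them, a plain if (isprint(c)) putchar(c); does the job without any conditional operator.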

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow