"hø"
is marked as UTF-8 when typed directly at the console. You can force it into the native encoding with enc2native, and the problem disappears; however, I am still working out why this is...
Encoding("hø")
# [1] "UTF-8"
.Internal( inspect( c( "a" , enc2native("hø") ) ) )
#@1081d60a0 16 STRSXP g0c2 [] (len=2, tl=0)
# @100af87d8 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
# @1081e3a08 09 CHARSXP g1c1 [MARK,gp=0x21] [cached] "hø"
enc2native("hø") %chin% names(df)
#[1] TRUE
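An alternative to converting only the lookup key is to normalise both sides to one encoding before matching. The following is a minimal base-R sketch under stated assumptions, not the original code: the data frame `df` and its column names are made up for illustration, `%in%` stands in for data.table's `%chin%`, and the key is written with a `\u` escape so the script does not depend on the encoding of the file it is saved in.

```r
# Hypothetical data frame with one non-ASCII column name ("hø").
df <- data.frame(a = 1:2, b = 3:4)
names(df) <- c("a", "h\u00f8")

key <- "h\u00f8"

# Inspect the declared encoding of the key and of the column names.
Encoding(key)
Encoding(names(df))

# Convert both sides to UTF-8 before matching, so the comparison
# does not depend on how each string happened to be marked.
found <- enc2utf8(key) %in% enc2utf8(names(df))
print(found)
```

Whether enc2native or enc2utf8 is the better choice probably depends on the locale and on where the column names came from, so treat this as something to try rather than a guaranteed fix.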
The Encoding help page has a lot of relevant information. I think this passage is the most relevant:
There are other ways for character strings to acquire a declared encoding apart from explicitly setting it (and these have changed as R has evolved). Functions scan, read.table, readLines, and parse have an encoding argument that is used to declare encodings, iconv declares encodings from its from argument, and console input in suitable locales is also declared. intToUtf8 declares its output as "UTF-8", and output text connections (see textConnection) are marked if running in a suitable locale. Under some circumstances (see its help page) source(encoding=) will mark encodings of character strings it outputs.
Update
It seems to me that any string made up only of characters from the basic ASCII set (character codes 0-127) gets an "unknown" encoding, and any string containing characters outside that set gets marked as "UTF-8" by default, including characters from the extended ASCII range (character codes 128-255).
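A quick way to check this observation is sketched below. It rests on one assumption that differs from the console case: the strings are written with `\u` escapes, which R stores as UTF-8, rather than typed as literal characters, whose marking depends on the locale. The variable names are just for illustration.

```r
ascii_only <- "abc"       # every character has a code below 128
extended   <- "h\u00f8"   # ø has code 248, in the 128-255 range
beyond_255 <- "\u4e2d"    # code 20013, well outside 0-255

# One declared encoding is reported per string.
enc <- Encoding(c(ascii_only, extended, beyond_255))
print(enc)

# Code points of each character, to confirm which range they fall in.
codes <- utf8ToInt(extended)
print(codes)
```

Encoding reports one mark per string rather than per character, so a single non-ASCII character is enough for the whole string to be marked.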