UTF-8 / Unicode Text Encoding with RPostgreSQL

Question 1

After exporting to R it's shown as: "StÃ©phane" (the é is encoded as Ã©)

Your R environment is decoding the text with a single-byte encoding such as latin-1 or windows-1252. This Python test shows that the UTF-8 bytes for é, decoded as if they were latin-1, produce the text you see:

>>> print("é".encode("utf-8").decode("latin-1"))
Ã©
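
The same check can be run in both directions. The following is a minimal Python 3 sketch; the helper names `mojibake` and `repair` are made up for illustration, and it assumes the wrong codec was latin-1, as in the test above:

```python
def mojibake(text: str, wrong: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled: str, wrong: str = "latin-1") -> str:
    """Reverse the wrong decode, then decode the bytes as UTF-8."""
    return garbled.encode(wrong).decode("utf-8")


garbled = mojibake("Stéphane")
print(garbled)          # the text as R displays it
print(repair(garbled))  # the original text, recovered
```

If `repair` returns the expected name, the bytes arriving from the database are valid UTF-8 and only the client-side decoding is wrong.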

Either run SET client_encoding = 'windows-1252' so that PostgreSQL converts the text to the encoding your client expects, or change the encoding your R environment uses. If R is running in a cmd.exe console, you'll need to change the console code page with the chcp command; otherwise the fix depends on whichever R runtime you are using.

Question 2

As Craig Ringer said, setting client_encoding to windows-1252 is probably not the best thing to do. Indeed, if the data you're retrieving contains a single exotic character, you're in trouble:

Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not Retrieve the result : ERROR: character 0xcca7 of encoding "UTF8" has no equivalent in "WIN1252" )
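
To see why that particular character fails, the bytes in the error message can be inspected. This is a small Python 3 sketch; the helper `fits_cp1252` is hypothetical, and Python's `cp1252` codec is assumed to be close enough to PostgreSQL's WIN1252 for this purpose:

```python
import unicodedata

# 0xcca7 from the error message is the UTF-8 byte sequence of one character.
raw = bytes.fromhex("cca7")
char = raw.decode("utf-8")
print(f"U+{ord(char):04X}", unicodedata.name(char))


def fits_cp1252(text: str) -> bool:
    """Return True if every character of text can be encoded in windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


print(fits_cp1252("Stéphane"))  # precomposed é exists in windows-1252
print(fits_cp1252("c" + char))  # a combining mark does not
```

The character is a combining mark, which a single-byte code page cannot represent, so the server-side conversion aborts the whole query.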

On the other hand, getting your R environment to use Unicode could be impossible (I have the same problem as you with Sys.setlocale... Same in this question too.).

A workaround is to manually declare UTF-8 encoding on all your data, using a function like this one:

set_utf8 <- function(x) {
  # Declare UTF-8 encoding on all character columns:
  chr <- sapply(x, is.character)
  x[, chr] <- lapply(x[, chr, drop = FALSE], `Encoding<-`, "UTF-8")
  # Same on column names:
  Encoding(names(x)) <- "UTF-8"
  x
}

You then have to wrap every query in this function:

set_utf8(dbGetQuery(con, "SELECT myvar FROM mytable"))

EDIT: Another possibility is to use RPostgres instead of RPostgreSQL. I tested it (with the same config as in your question), and as far as I can see all declared encodings are automatically set to UTF-8.

Question 3

If you pass RPostgres::Postgres() as the first argument to dbConnect(), you should normally not have any encoding problems.

I had the same problem, and with the following connection call my accented characters now display correctly.

dbConnect(RPostgres::Postgres(), user = "user", password = "psw", host = "host", port = 5432, dbname = "db_name")

Question 4

This fixes the mis-decoded accented characters on Windows, provided every character in the result exists in windows-1252. It must be executed before querying the database.

postgresqlpqExec(con, "SET client_encoding = 'windows-1252'")

(Drawn from the asker's misplaced self-answer, visible in the question's revision history.)

Question 5

Specify the encoding when you connect:

con <- dbConnect("...", encoding = "latin1")