Question

I'm trying to write an R client for the StackOverflow API. The output is gzipped. For example:

readLines("http://api.stackoverflow.com/0.9/stats/", warn=F)
[1] "\037‹\b"                                                                                                                                                                                                                                                                                         
[2] "\030\002úØÛy°óé½\036„iµXäË–[<üt—Zu[\\VmÎHî=ÜÛݹ×ýz’Í.äûû÷>ý´\a\177Ýh÷\017îÝÛÙwßÚáÿþ«¼þý\027ÅrÝæÔlgüÀëA±\017›ìŽï{M¤û.\020\037�Ë\"¿’\006³ì\032„Úß9¸ÿ`¼ç÷³*~ÿKêˆð¡\006v¦ð²ýô£�ñÃ�ì+ôU�_\026滽�]êt¼·?ÞûÈ4ù%\016~S0^>àe¶ÀG\037½n³éÛôKê缬®‚\016Êê¢úý×u‰fó¶]=º{·aΚŽ—y{·©î\026‹‹»h5^-/‚W1 |9[UŲõ^§�Ç"
[3] ":¬´¿1M\177ð\"0íö¹ñ…YÞLëbÕ*!~â\027\036§çU�®êê¢ÎˆµhòýæÅ´Zn\036S¶Z•ùv[­§óm´î�"                                                                                                                                                                                                                      
[4] "Í™t˪^d¥£·üÂ?¾ÿ\033'¿$ù\177"  

Is there a good way to gunzip this in R, short of writing the output to a file, gunzipping it, and reading it back in?


Solution

You could do:

conn <- gzcon(url("http://api.stackoverflow.com/0.9/stats/"))
data <- readLines(conn)
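
If the goal is to get the JSON body into an R object, a minimal sketch building on that connection might look like the following (parsing with the rjson package is an assumption; any JSON parser would do):

# gzcon() wraps the connection so readLines() sees decompressed text
conn <- gzcon(url("http://api.stackoverflow.com/0.9/stats/"))
json <- paste(readLines(conn, warn = FALSE), collapse = "")
close(conn)  # release the connection when finished

library(rjson)           # assumed JSON parser
stats <- fromJSON(json)  # parse the text into an R list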

OTHER TIPS

Try:

p <- gzcon(url("http://api.stackoverflow.com/0.9/stats/"))
readLines(p)

Ideally we should tell the server that we can handle gzipped content, find out from the HTTP headers whether the content is actually gzip-encoded, and then decompress only if it is. The RCurl package can do this:

library(RCurl)
getURL("http://api.stackoverflow.com/0.9/stats/",
       .opts = list(encoding = "identity,gzip"))
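
If you want to confirm from the response headers that the content really was gzip-encoded, one option is RCurl's basicHeaderGatherer(); this is a sketch, and the exact header names returned depend on the server:

library(RCurl)
h <- basicHeaderGatherer()
body <- getURL("http://api.stackoverflow.com/0.9/stats/",
               .opts = list(encoding = "identity,gzip"),
               headerfunction = h$update)
h$value()[["Content-Encoding"]]  # e.g. "gzip" if the server compressed the response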
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow