Question

There's the available.packages() function to list all packages available on CRAN. Is there a similar function to find all available vignettes? If not, how would I get a list of all vignettes and the packages they're associated with?

As a corner case to keep in mind, the data.table package has 3 vignettes associated with it.

EDIT: Per Andrie's response I realize I wasn't clear. I know about the vignette function for finding all the vignettes available locally; I'm after a way to get all the vignettes of all packages on CRAN.

Solution

I seem to recall looking at this in response to some SO question (can't find it now) and deciding that since the information isn't included in the output of available.packages(), nor in the result of applying readRDS to @CRAN/web/packages/packages.rds (a trick from Jeroen Ooms), I couldn't think of a non-scraping way to do it ...

Here's my scraper, applied to the first 100 packages (leading to 44 vignettes):

pkgs <- unname(available.packages()[, 1])[1:100]
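## assumes a single CRAN mirror is set in getOption("repos");
## paste0() has no sep argument, so it is dropped here
vindex_urls <- paste0(getOption("repos"), "/web/packages/", pkgs,
    "/vignettes/index.rds")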
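getf <- function(x) {
    ## I think there should be a way to do this directly
    ## with readRDS(url(...)) but I can't get it to work
    tmp <- tempfile(fileext = ".rds")
    on.exit(unlink(tmp))
    ## mode = "wb" keeps the binary .rds intact on Windows
    suppressWarnings(
        download.file(x, tmp, quiet = TRUE, mode = "wb"))
    readRDS(tmp)
}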
library(plyr)
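## llply (not ldply) returns a list with one element per package,
## NULL where there is no vignette index, so vv stays aligned with pkgs
vv <- llply(vindex_urls,
            .progress = "text",
            function(x) {
                z <- try(getf(x), silent = TRUE)
                if (inherits(z, "try-error")) NULL else z
            })
## prepend the package name to each non-empty vignette index
tmpf <- function(x, n) {
    if (is.null(x) || nrow(x) == 0) NULL else data.frame(pkg = n, x)
}
vframe <- do.call(rbind, mapply(tmpf, vv, pkgs, SIMPLIFY = FALSE))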
rownames(vframe) <- NULL
head(vframe[,c("pkg","Title")])

There may be ways to clean this up or make it more compact, but it seems to work OK. Your scrape-once, update-occasionally strategy seems reasonable. Or, if you wanted, you could scrape daily (or weekly, or whatever seems reasonable) and save/post the results somewhere publicly accessible, then include a function with that URL hard-coded in the package ... or even create a nicely formatted HTML table, with links, that the whole world could use (and then add Viagra ads to the page, and $$PROFIT$$ ...)
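
The scrape-once, update-occasionally idea could be sketched roughly as below. This is only an illustration: cached_vignettes, scrape_fun, cache_file and max_age_days are hypothetical names, and scrape_fun stands in for whatever code actually builds the vignette table.

```r
## Hypothetical helper: reuse a cached vignette table unless the cache
## file is older than max_age_days. scrape_fun is any zero-argument
## function that returns the freshly scraped table.
cached_vignettes <- function(scrape_fun,
                             cache_file = file.path(tempdir(), "cran_vignettes.rds"),
                             max_age_days = 7) {
    if (file.exists(cache_file)) {
        age <- difftime(Sys.time(), file.mtime(cache_file), units = "days")
        # Cache is still fresh: return it without scraping again
        if (as.numeric(age) < max_age_days) return(readRDS(cache_file))
    }
    res <- scrape_fun()          # slow step: hit the network
    saveRDS(res, cache_file)     # refresh the cache for next time
    res
}
```

For real use, cache_file would point somewhere more permanent than tempdir(), which is cleared at the end of the R session.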

edit: wrapped both the download and the readRDS in a function, so I can wrap the whole thing in try
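
On reading the .rds directly from a URL: as far as I know, the usual trick is to wrap the url() connection in gzcon(), since saveRDS() gzip-compresses by default. A minimal sketch (read_rds_url is a hypothetical name, not part of any package):

```r
## Read a remote .rds file straight from a URL, with no temporary file.
read_rds_url <- function(u) {
    con <- gzcon(url(u, open = "rb"))  # gzcon() handles the gzip layer
    on.exit(close(con))                # always release the connection
    readRDS(con)
}
```

Something like read_rds_url(vindex_urls[1]) could then replace getf(), still wrapped in try() because packages without vignettes have no index.rds.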

OTHER TIPS

The functions vignette() and browseVignettes() list all vignettes of packages installed on your machine.

vignette(package="data.table")

Vignettes in package ‘data.table’:

datatable-faq                         Frequently asked questions (source, pdf)
datatable-intro                       Quick introduction (source, pdf)
datatable-timings                     Timings of common tasks (source, pdf)
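
If you want the local listing as data rather than printed output, the object returned by vignette() carries a results matrix. A small sketch (local_vignettes is just an illustrative name):

```r
## Collect the vignettes of all locally installed packages in a data frame.
## vignette() returns a "packageIQR" object whose $results component is a
## character matrix with columns Package, LibPath, Item and Title.
local_vignettes <- as.data.frame(vignette(all = TRUE)$results,
                                 stringsAsFactors = FALSE)
head(local_vignettes[, c("Package", "Item", "Title")])
```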

browseVignettes() is especially helpful since it creates a web page with hyperlinks:

browseVignettes(package="data.table")

Vignettes found by browseVignettes(package = "data.table")

Vignettes in package data.table

Frequently asked questions - PDF  R  LaTeX/noweb 
Quick introduction - PDF  R  LaTeX/noweb 
Timings of common tasks - PDF  R  LaTeX/noweb 
Licensed under: CC-BY-SA with attribution