Question

I want to go through a package and find out which authors are listed in each function's help file.

I looked for a function to extract elements from R's help files, but could not find one. The closest I could find is this post, from Noam Ross.

Does such a function exist? (If not, I guess I'll hack Noam's code to parse the Rd file and locate the specific element I'm interested in.)

Thanks, Tal.

Potential code example:

get_field_from_r_help(topic = "lm", field = "Description")
# output:

‘lm’ is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although ‘aov’ may provide a more convenient interface for these).


Solution

This document by Duncan Murdoch on parsing Rd files will be helpful, as will this SO post.

From these, you could probably try something like the following:

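getauthors <- function(package) {
    # Rd_db() returns the parsed Rd objects for every help file in the package
    db <- tools::Rd_db(package)
    authors <- lapply(db, function(x) {
        # tag of each top-level section, e.g. "\\title" or "\\author"
        # (RdTags is an unexported helper in tools, hence the :::)
        tags <- tools:::RdTags(x)
        if ("\\author" %in% tags) {
            # return a crazy list of results
            # out <- x[which(tags == "\\author")]
            # return something a little cleaner
            paste(unlist(x[which(tags == "\\author")]), collapse = "")
        } else {
            NULL
        }
    })
    gsub("\n", "", unlist(authors))  # further cleanup; unlist() drops the NULLs
}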

We can then run this on a package or two:

> getauthors("knitr")
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/eclipse_theme.Rd 
                                                                                                                     "  Ramnath Vaidyanathan" 
                                                                                         d:/RCompile/CRANpkg/local/3.0/knitr/man/image_uri.Rd 
                                                                                                                    "  Wush Wu and Yihui Xie" 
                                                                                      d:/RCompile/CRANpkg/local/3.0/knitr/man/imgur_upload.Rd 
                                                                              "  Yihui Xie, adapted from the imguR package by Aaron  Statham" 
                                                                                          d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2pdf.Rd 
                                                                                         "  Ramnath Vaidyanathan, Alex Zvoleff and Yihui Xie" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2wp.Rd 
                                                                                                          "  William K. Morris and Yihui Xie" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/knit_theme.Rd 
                                                                                                       "  Ramnath Vaidyanathan and Yihui Xie" 
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/knitr-package.Rd 
                                                                                                            "  Yihui Xie <http://yihui.name>" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/read_chunk.Rd 
                      "  Yihui Xie; the idea of the second approach came from  Peter Ruckdeschel (author of the SweaveListingUtils  package)" 
                                                                                       d:/RCompile/CRANpkg/local/3.0/knitr/man/read_rforge.Rd 
                                                                                                          "  Yihui Xie and Peter Ruckdeschel" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/rst2pdf.Rd 
                                                                                                               "  Alex Zvoleff and Yihui Xie" 
                                                                                              d:/RCompile/CRANpkg/local/3.0/knitr/man/spin.Rd 
"  Yihui Xie, with the original idea from Richard FitzJohn  (who named it as sowsear() which meant to make a  silk purse out of a sow's ear)" 

And maybe tools:

> getauthors("tools")
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/bibstyle.Rd 
                                                                "  Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/checkPoFiles.Rd 
                                                                "  Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/checkRd.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/getDepList.Rd 
                                                                   " Jeff Gentry " 
                      D:/murdoch/recent/R64-3.0/src/library/tools/man/HTMLlinks.Rd 
                                                    "Duncan Murdoch, Brian Ripley" 
            D:/murdoch/recent/R64-3.0/src/library/tools/man/installFoundDepends.Rd 
                                                                     "Jeff Gentry" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/makeLazyLoading.Rd 
                                                   "Luke Tierney and Brian Ripley" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/parse_Rd.Rd 
                                                                " Duncan Murdoch " 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/parseLatex.Rd 
                                                                  "Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2HTML.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2txt_options.Rd 
                                                                  "Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/RdTextFilter.Rd 
                                                                "  Duncan Murdoch" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/SweaveTeXFilter.Rd 
                                                                  "Duncan Murdoch" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/texi2dvi.Rd 
                     "  Originally Achim Zeileis but largely rewritten by R-core." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/tools-package.Rd 
"  Kurt Hornik and Friedrich Leisch  Maintainer: R Core Team R-core@r-project.org" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteDepends.Rd 
                                                                   " Jeff Gentry " 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteEngine.Rd 
                                            "Duncan Murdoch and Henrik Bengtsson." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/writePACKAGES.Rd 
                                                        "  Uwe Ligges and R-core."

Some help files have no author field. Those entries are simply dropped by the unlist call at the end of getauthors, but the code could be modified slightly to keep them, for example by returning NA for those files instead.
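A rough sketch of that modification, in case it is useful. The name getauthors_all is made up for illustration, and reading the "Rd_tag" attribute directly is my substitute for the unexported tools:::RdTags; the rest follows the same logic as getauthors above.

```r
# Sketch: like getauthors(), but keeps one entry per Rd file and
# returns NA where a file has no \author section.
# getauthors_all is a hypothetical name, not part of any package.
getauthors_all <- function(package) {
    db <- tools::Rd_db(package)
    vapply(db, function(rd) {
        # each top-level section stores its macro name in the "Rd_tag" attribute
        tags <- vapply(rd, function(el) {
            tag <- attr(el, "Rd_tag")
            if (is.null(tag)) "" else tag
        }, character(1))
        hits <- which(tags == "\\author")
        if (length(hits) == 0L) {
            return(NA_character_)
        }
        # flatten the section to plain text and collapse runs of whitespace
        txt <- paste(unlist(rd[hits]), collapse = "")
        trimws(gsub("\\s+", " ", txt))
    }, character(1))
}

res <- getauthors_all("tools")
```

Because vapply always returns one string per help file, the result lines up with the names of the Rd database, and is.na(res) picks out the files without an author field.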

Also, further parsing is going to become a little bit difficult because package authors seem to use this field in very different ways. There's only one author field in devtools. There are a bunch in car, each of which contains an email address. Etc, etc. But this gets you to the available info, which you should be able to work with further.
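As one example of that further work, email addresses can be pulled out of the author strings with a regular expression. This is only a sketch: extract_emails is a made-up helper name, the pattern is a loose approximation rather than a full address grammar, and the two sample strings are copied from the qdap results shown in the other answer on this page.

```r
# Sketch: pull anything that looks like an email address out of author strings.
# extract_emails is a hypothetical helper; the pattern is deliberately loose.
extract_emails <- function(authors) {
    pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"
    # gregexpr() finds every match per string; regmatches() extracts them
    regmatches(authors, gregexpr(pattern, authors))
}

authors <- c(
    "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>.",
    "Josh O'Brien, Justin Haynes and Tyler Rinker"
)
emails <- extract_emails(authors)
```

The result is a list with one element per input string, so entries without an address come back as empty character vectors rather than being dropped.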

Note: My previous version of this answer provided a solution if you have the full path of an Rd file, but didn't work if you were trying to do this for an installed package. Following Tyler's advice, I've worked out a more complete solution.
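The same Rd_db approach can probably be stretched to cover the general case from the question, i.e. any field for a given topic. Below is a minimal sketch under a few assumptions: get_rd_field is a made-up name, the topic is matched against the \alias entries of each help file, and the field is given as the Rd section name (e.g. "description"). Markup such as \code{} is simply flattened to its text, so the result will not match the rendered help page exactly.

```r
# Sketch: return the text of one section (e.g. "description") for a help topic.
# get_rd_field is a hypothetical helper, not an existing R function.
get_rd_field <- function(package, topic, field) {
    db <- tools::Rd_db(package)
    # flattened text of every top-level section of `rd` tagged `tag`
    section_text <- function(rd, tag) {
        tags <- vapply(rd, function(el) {
            tag_name <- attr(el, "Rd_tag")
            if (is.null(tag_name)) "" else tag_name
        }, character(1))
        vapply(rd[tags == tag],
               function(el) paste(unlist(el), collapse = ""),
               character(1))
    }
    wanted <- paste0("\\", tolower(field))
    for (rd in db) {
        # a help file documents the topic if the topic is one of its aliases
        aliases <- trimws(section_text(rd, "\\alias"))
        if (topic %in% aliases) {
            txt <- paste(section_text(rd, wanted), collapse = " ")
            txt <- trimws(gsub("\\s+", " ", txt))
            return(if (nzchar(txt)) txt else NA_character_)
        }
    }
    NA_character_  # topic not found in this package
}

desc <- get_rd_field("stats", "lm", "Description")
```

Unlike help(), this needs the package name up front, which keeps the lookup to a single Rd database instead of searching every attached package.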

OTHER TIPS

This is my approach using some suggestions made by others:

package <- "qdap"
funs <- unclass(lsf.str(envir = asNamespace(package)))

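out <- sapply(funs, function(x) {
    # render the help page for x as plain text, one element per line
    # (help() has to be able to find the topic, so the package should be attached)
    x <- try(capture.output(tools:::Rd2txt(utils:::.getHelpFile(
        as.character(help(x, help_type = "text"))))))
    # in the text output the section heading is underlined with "_\b" sequences
    Auth_lines <- grep("_\bA_\bu_\bt_\bh_\bo_\br(_\bs):", x, fixed = TRUE)
    if (identical(Auth_lines, integer(0))) {
        return(NA)
    }
    # the author text sits two lines below the heading; trim surrounding whitespace
    gsub("^\\s+|\\s+$", "", x[Auth_lines + 2])
})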

## To look at just the ones with author fields:
out[!sapply(out, is.na)]

## > out[!sapply(out, is.na)]
##                                                         beg2char 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                         bracketX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                    bracketXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         char2end 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                 cm_df.transcript 
## "DWin, Gavin Simpson and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                            gantt 
##           "DigEmAll (<URL: stackoverflow.com>) and Tyler Rinker" 
##                                                       gantt_wrap 
##     "Andrie de Vries and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             genX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        genXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             hash 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         name2sex 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                  read.transcript 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                      sentCombine 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        sentSplit 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                              TOT 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                          v.outer 
##   "Vincent Zoonekynd and Tyler Rinker <tyler.rinker@gmail.com>." 
Licensed under: CC-BY-SA with attribution