Question

I am writing a number of related documents in R Markdown that I will be compiling into a website through Jekyll. In the course of doing this, I have run into a problem:

Some of the Rmd files I am using call on other Rmd files as child documents. When I render with knitr, the resulting document contains the yaml front-matter from the parent and child documents. An example is given below.

As yet, I don't see any way to specify only portions of a child document when that document is an Rmd. Does anyone know of a method by which I could strip the yaml out of the child documents when they are read into the parent Rmd during knit()?

I'd be happy to consider answers outside of R, preferably something I can embed in a rakefile. I'd prefer not to alter the child documents permanently, though, so stripping out the yaml can't be permanent. Lastly, the yaml varies in length from file to file, so I'm guessing that any solution needs to be able to locate the beginning and end of the yaml by regex/grep/sed/etc.

EXAMPLE:

%%%% Parent_Doc.rmd %%%%

 ---
 title: parent doc
 layout: default 
 etc: etc
 ---
 This is the parent...

 ```{r child import, child="./child_doc."}
 ```

%%%% child_doc.rmd %%%%

 ---
 title: child doc
 layout: default 
 etc: etc
 ---

 lorem ipsum etc

%%%% output.md %%%%

 ---
 title: parent doc
 layout: default 
 etc: etc
 ---
 This is the parent...
 ---
 title: child doc
 layout: default 
 etc: etc
 ---

 lorem ipsum etc

%%%% Ideal Output.md %%%%

 ---
 title: parent doc
 layout: default 
 etc: etc
 ---
 This is the parent...

 lorem ipsum etc

Solution

In the meantime, maybe the following will work for you. It is an ugly and inefficient workaround (I am new to knitr and am not a real programmer), but it achieves what I believe you want to do.

I had written a function for a similar personal use that includes the following relevant bit; the original is in Spanish, so I've translated it some below:

extraction <- function(matter, escape = FALSE, ruta = ".", patron) {

  # Stop early if the yaml package is missing (require() would only warn)
  library(yaml)

  # Gather together directory of documents to be processed

  doc_list <- list.files(
    path = ruta,
    pattern = patron,
    full.names = TRUE
    )

  # Extract desired contents

  lapply(
    X = doc_list,
    FUN = function(i) {
      raw_contents <- readLines(con = i, encoding = "UTF-8")

      switch(
        EXPR = matter,

        # !YAML (e.g., HTML)

        "no_yaml" = {

          # Collapse the file into a single string
          contents <- paste(raw_contents, collapse = "\n")

          if (isTRUE(escape)) {
            # Wrap the text in an XML text node, which escapes markup characters
            XML::xmlTextNode(value = contents)
          } else {
            contents
          }

        },

        # YAML header and Rmd contents

        "rmd" = {
          # Match only lines that consist solely of a YAML delimiter
          yaml_pattern <- "^(-{3}|[.]{3})\\s*$"
          limits_yaml <- grep(pattern = yaml_pattern, x = raw_contents)[1:2]
          indices_yaml <- seq(
            from = limits_yaml[1] + 1,
            to = limits_yaml[2] - 1
            )
          # Parse the header as one block so that nested keys survive
          yaml <- yaml::yaml.load(
            string = paste(raw_contents[indices_yaml], collapse = "\n")
            )
          indices_rmd <- seq(
            from = limits_yaml[2] + 1,
            to = length(x = raw_contents)
            )
          rmd <- paste(raw_contents[indices_rmd], collapse = "\n")
          c(yaml, "contents" = rmd)
        },

        # Anything else (just in case)

        {
          stop("Matter not extractable")
        }

      )

    }
    )

}
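
The header-stripping step of the "rmd" branch can also be isolated in base R. The following is a minimal sketch, not part of the function above: the name split_front_matter is hypothetical, and it assumes that the front matter, if present, opens with --- on the very first line.

```r
# Hypothetical helper: split the lines of an Rmd file into header and body.
# Assumes the front matter, if present, opens with '---' on the first line
# and is closed by a line consisting only of '---' or '...'.
split_front_matter <- function(lines) {
  has_header <- length(lines) > 0 && grepl("^---\\s*$", lines[1])
  if (has_header) {
    closing <- grep("^(---|\\.\\.\\.)\\s*$", lines[-1])
    if (length(closing) > 0) {
      end <- closing[1] + 1  # index of the closing delimiter in `lines`
      return(list(
        header = lines[seq_len(end - 2) + 1],
        body = lines[seq_len(length(lines) - end) + end]
      ))
    }
  }
  # No recognizable front matter: treat everything as body
  list(header = character(0), body = lines)
}

child <- c("---", "title: child doc", "layout: default", "---", "", "lorem ipsum etc")
parts <- split_front_matter(child)
cat(parts$body, sep = "\n")
```

Because only the in-memory copy is modified, the child file on disk keeps its yaml, and a file without front matter is passed through unchanged.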

Say my main Rmd document main.Rmd lives in my_directory and my child documents, 01-abstract.Rmd, 02-intro.Rmd, ..., 06-conclusion.Rmd are housed in ./sections; note that for my amateur function it is best to have the child documents saved in the order they will be summoned into the main document (see below). I have my function extraction.R in ./assets. Here is the structure of my example directory:

.
+--assets
|  +--extraction.R
+--sections
|  +--01-abstract.Rmd
|  +--02-intro.Rmd
|  +--03-methods.Rmd
|  +--04-results.Rmd
|  +--05-discussion.Rmd
|  +--06-conclusion.Rmd
+--stats
|  +--analysis.R
+--main.Rmd

In main.Rmd I import my child documents from ./sections:

---
title: Main
author: me
date: Today
output:
  html_document
---

```{r, 'setup', include = FALSE}
knitr::opts_chunk$set(autodep = TRUE)
knitr::dep_auto()
```

```{r, 'import_children', cache = TRUE, include = FALSE}
source('./assets/extraction.R')
rmd <- extraction(
  matter = 'rmd',
  ruta = './sections',
  patron = "\\.Rmd$"  # list.files() expects a regular expression, not a glob
  )
```

# Abstract

```{r, 'abstract', echo = FALSE, results = 'asis'}
cat(x = rmd[[1]][["contents"]], sep = "\n")
```

# Introduction

```{r, 'intro', echo = FALSE, results = 'asis'}
cat(x = rmd[[2]][["contents"]], sep = "\n")
```

# Methods

```{r, 'methods', echo = FALSE, results = 'asis'}
cat(x = rmd[[3]][["contents"]], sep = "\n")
```

# Results

```{r, 'results', echo = FALSE, results = 'asis'}
cat(x = rmd[[4]][["contents"]], sep = "\n")
```

# Discussion

```{r, 'discussion', echo = FALSE, results = 'asis'}
cat(x = rmd[[5]][["contents"]], sep = "\n")
```

# Conclusion

```{r, 'conclusion', echo = FALSE, results = 'asis'}
cat(x = rmd[[6]][["contents"]], sep = "\n")
```

# References

I then knit this document, and only the contents of my child documents are incorporated into it, e.g.:

---
title: Main
author: me
date: Today
output:
  html_document
---





# Abstract


This is **Child Doc 1**, my abstract.

# Introduction


This is **Child Doc 2**, my introduction.

- Point 1
- Point 2
- Point *n*

# Methods


This is **Child Doc 3**, my "Methods" section.

|    method 1   |    method 2   |   method *n*   |
|---------------|---------------|----------------|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|

# Results


This is **Child Doc 4**, my "Results" section.

## Result 1

```{r}
library(knitr)
```

```{r, 'analysis', cache = FALSE}
source(file = '../stats/analysis.R')
```

# Discussion


This is **Child Doc 5**, where the results are discussed.

# Conclusion


This is **Child Doc 6**, where I state my conclusions.

# References

The foregoing document is the knitted version of main.Rmd, i.e., main.md. Note under ## Result 1 that in my child document, 04-results.Rmd, I sourced an external R script, ./stats/analysis.R, which is now incorporated as a new knitr chunk in my knitted document; consequently, I now need to knit the document again.

When child documents also include chunks, instead of knitting into .md I would knit the main document into another .Rmd as many times as I have chunks nested, e.g., continuing the example above:

  1. Using knit(input = './main.Rmd', output = './main_2.Rmd'), instead of knitting main.Rmd into main.md, I would knit it into another .Rmd so as to be able to knit the resulting file containing the newly imported chunks, e.g., my R script analysis.R above.
  2. I can now knit my main_2.Rmd into main.md or render it as main.html via rmarkdown::render(input = './main_2.Rmd', output_file = './main.html').
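
As a possible alternative to the two-stage knit, knitr::knit_child() accepts a text argument, so the extracted contents might be knitted in place during the first pass. This is only a sketch of a different technique, not what I did above; insert_child is a hypothetical name, and the function is meant to be called from a chunk of main.Rmd that sets results = 'asis' and echo = FALSE.

```r
# Hypothetical wrapper around knitr::knit_child(); meant to be called from
# a chunk of main.Rmd that sets results = 'asis' and echo = FALSE.
insert_child <- function(contents) {
  # knit_child() evaluates any chunks found in `contents` during the same
  # knit, so an intermediate main_2.Rmd should not be needed
  out <- knitr::knit_child(text = contents, quiet = TRUE)
  cat(out, sep = "\n")
  invisible(out)
}

# In main.Rmd, e.g.: insert_child(rmd[[4]][["contents"]])
```

Named chunks would then have to be unique across the main and child documents, since knitr rejects duplicate chunk labels by default.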

Note: in the example above of main.md, the path to my R script is ../stats/analysis.R. This is the path relative to the child document that sourced it, ./sections/04-results.Rmd. Once I import the child document into the main document located at the root of my_directory, i.e., ./main.md or ./main_2.Rmd, the path becomes wrong; I therefore must correct it manually to ./stats/analysis.R before the next knit.
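
The manual correction could probably be automated with a plain string replacement on the extracted contents. A minimal sketch, assuming every affected path starts with the same prefix; repoint_paths and its default arguments are hypothetical and only cover the example above.

```r
# Hypothetical helper: rewrite a path prefix that was relative to the child
# document so that it is relative to the main document instead.
repoint_paths <- function(contents, from = "../stats/", to = "./stats/") {
  # fixed = TRUE treats `from` as a literal string, not a regular expression
  gsub(pattern = from, replacement = to, x = contents, fixed = TRUE)
}

chunk <- "source(file = '../stats/analysis.R')"
repoint_paths(chunk)
# [1] "source(file = './stats/analysis.R')"
```

This could be applied to rmd[[4]][["contents"]] before it is written into the main document, which leaves the child file itself untouched.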

I mentioned above that it is best to have the child documents saved in the same order that they are imported into the main document. This is because extraction() stores the contents of all the files it is given in an unnamed list, so in main.Rmd I must access each file by position; e.g., rmd[[5]][["contents"]] refers to the child document ./sections/05-discussion.Rmd. Consider:

> str(rmd)
List of 6
 $ :List of 4
  ..$ title     : chr "child doc 1"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 1**, my abstract."
 $ :List of 4
  ..$ title     : chr "child doc 2"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 2**, my introduction.\n\n- Point 1\n- Point 2\n- Point *n*"
 $ :List of 4
  ..$ title     : chr "child doc 3"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 3**, my \"Methods\" section.\n\n| method 1 | method 2 | method *n* |\n|--------------|--------------|----"| __truncated__
 $ :List of 4
  ..$ title     : chr "child doc 4"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 4**, my \"Results\" section.\n\n## Result 1\n\n```{r}\nlibrary(knitr)\n```\n\n```{r, cache = FALSE}\nsour"| __truncated__
 $ :List of 4
  ..$ title     : chr "child doc 5"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 5**, where the results are discussed."
 $ :List of 4
  ..$ title     : chr "child doc 6"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 6**, where I state my conclusions."

So, extraction() here is actually storing both the R Markdown contents of the specified child documents, as well as their YAML, in case you had a use for this as well (I myself do).
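
If indexing by position is too fragile, the list could be named after the files it came from. A minimal sketch with stand-in data: doc_list and rmd below are shortened mock-ups of the objects used above, not real output.

```r
# Stand-ins for the file list and the result of extraction()
doc_list <- c("./sections/01-abstract.Rmd", "./sections/05-discussion.Rmd")
rmd <- list(
  list(title = "child doc 1", contents = "\nThis is **Child Doc 1**, my abstract."),
  list(title = "child doc 5", contents = "\nThis is **Child Doc 5**, where the results are discussed.")
)

# Name each element after its file, dropping directory and extension
names(rmd) <- tools::file_path_sans_ext(basename(doc_list))

cat(rmd[["05-discussion"]][["contents"]], sep = "\n")
```

Inside extraction(), the same naming step would go on the result of lapply() before it is returned, after which the order of the files on disk no longer matters.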

Licensed under: CC-BY-SA with attribution