Strip YAML from child docs in knitr

Question

In the meantime, the following may work for you. It is an ugly and inefficient workaround (I am new to knitr and am not a real programmer), but it achieves what I believe you want to do.

I had written a function for a similar personal use that includes the following relevant bit. The original is in Spanish, so I have translated most of it below (the arguments ruta and patron, i.e., "path" and "pattern", keep their Spanish names):

extraction <- function(matter, escape = FALSE, ruta = ".", patron) {

  require(yaml)

  # Gather together directory of documents to be processed

  doc_list <- list.files(
    path = ruta,
    pattern = patron,
    full.names = TRUE
    )

  # Extract desired contents

  lapply(
    X = doc_list,
    FUN = function(i) {
      raw_contents <- readLines(con = i, encoding = "UTF-8")

      switch(
        EXPR = matter,

        # !YAML (e.g., HTML)

        "no_yaml" = {

          if (escape == FALSE) {

            paste(raw_contents, sep = "", collapse = "\n")

          } else if (escape == TRUE) {

            require(XML)
            to_be_escaped <- paste(raw_contents, sep = "", collapse = "\n")
            xmlTextNode(value = to_be_escaped)

          }

        },

        # YAML header and Rmd contents

        "rmd" = {
          # Header limits: lines that consist solely of --- or ...
          yaml_pattern <- "^(-{3}|[.]{3})[[:space:]]*$"
          limits_yaml <- grep(pattern = yaml_pattern, x = raw_contents)[1:2]
          indices_yaml <- seq(
            from = limits_yaml[1] + 1,
            to = limits_yaml[2] - 1
            )
          # Parse the header in one go so nested or multi-line values survive
          yaml <- yaml.load(
            string = paste(raw_contents[indices_yaml], collapse = "\n")
            )
          indices_rmd <- seq(
            from = limits_yaml[2] + 1,
            to = length(x = raw_contents)
            )
          rmd <- paste(raw_contents[indices_rmd], sep = "", collapse = "\n")
          c(yaml, "contents" = rmd)
        },

        # Anything else (just in case)

        {
          stop("Matter not extractable")
        }

      )

    }
    )

}
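
The header-splitting step inside the "rmd" branch can be tried on its own, without any files. What follows is only a minimal sketch in base R: split_front_matter() and example_lines are hypothetical names of mine, not part of extraction(), and the sketch assumes the header, if present, starts on the first line of the file.

```r
# Hypothetical helper, not part of extraction(): split the lines of one Rmd
# file into its YAML header and its body.
split_front_matter <- function(lines) {
  # Delimiter lines consist solely of --- or ... (trailing spaces allowed)
  limits <- grep(pattern = "^(-{3}|[.]{3})[[:space:]]*$", x = lines)
  # No header unless the first line opens one and a closing line follows
  if (length(limits) < 2 || limits[1] != 1) {
    return(list(header = character(0), body = lines))
  }
  list(
    header = lines[seq_len(limits[2] - 1)[-1]],  # lines between the limits
    body = lines[-seq_len(limits[2])]            # everything after the header
  )
}

# Made-up child document, held in memory instead of read with readLines()
example_lines <- c(
  "---",
  "title: child doc 1",
  "layout: default",
  "---",
  "",
  "This is **Child Doc 1**, my abstract."
)

parts <- split_front_matter(example_lines)
parts$header  # the two YAML lines, ready for yaml::yaml.load()
parts$body    # the R Markdown contents without the header
```

Unlike extraction(), this sketch returns the input unchanged when it finds no header, which may or may not be what you want.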

Say my main Rmd document main.Rmd lives in my_directory and my child documents, 01-abstract.Rmd, 02-intro.Rmd, ..., 06-conclusion.Rmd, are housed in ./sections. Note that for my amateur function it is best to name the child documents so that they sort in the order in which they will be pulled into the main document, since list.files() returns them in alphabetical order (see below). I have my function extraction.R in ./assets. Here is the structure of my example directory:

.
+--assets
|  +--extraction.R
+--sections
|  +--01-abstract.Rmd
|  +--02-intro.Rmd
|  +--03-methods.Rmd
|  +--04-results.Rmd
|  +--05-discussion.Rmd
|  +--06-conclusion.Rmd
+--stats
|  +--analysis.R
+--main.Rmd

In main.Rmd I import my child documents from ./sections:

---
title: Main
author: me
date: Today
output:
  html_document
---

```{r, 'setup', include = FALSE}
knitr::opts_chunk$set(autodep = TRUE)
knitr::dep_auto()
```

```{r, 'import_children', cache = TRUE, include = FALSE}
source('./assets/extraction.R')
rmd <- extraction(
  matter = 'rmd',
  ruta = './sections',
  patron = "\\.Rmd$"  # list.files() expects a regular expression, not a glob
  )
```

# Abstract

```{r, 'abstract', echo = FALSE, results = 'asis'}
cat(x = rmd[[1]][["contents"]], sep = "\n")
```

# Introduction

```{r, 'intro', echo = FALSE, results = 'asis'}
cat(x = rmd[[2]][["contents"]], sep = "\n")
```

# Methods

```{r, 'methods', echo = FALSE, results = 'asis'}
cat(x = rmd[[3]][["contents"]], sep = "\n")
```

# Results

```{r, 'results', echo = FALSE, results = 'asis'}
cat(x = rmd[[4]][["contents"]], sep = "\n")
```

# Discussion

```{r, 'discussion', echo = FALSE, results = 'asis'}
cat(x = rmd[[5]][["contents"]], sep = "\n")
```

# Conclusion

```{r, 'conclusion', echo = FALSE, results = 'asis'}
cat(x = rmd[[6]][["contents"]], sep = "\n")
```

# References

I then knit this document, and only the contents of my child documents (not their YAML headers) are incorporated into it, e.g.:

---
title: Main
author: me
date: Today
output:
  html_document
---





# Abstract


This is **Child Doc 1**, my abstract.

# Introduction


This is **Child Doc 2**, my introduction.

- Point 1
- Point 2
- Point *n*

# Methods


This is **Child Doc 3**, my "Methods" section.

|    method 1   |    method 2   |   method *n*   |
|---------------|---------------|----------------|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|
| fffffffffffff | fffffffffffff | fffffffffffff d|

# Results


This is **Child Doc 4**, my "Results" section.

## Result 1

```{r}
library(knitr)
```

```{r, 'analysis', cache = FALSE}
source(file = '../stats/analysis.R')
```

# Discussion


This is **Child Doc 5**, where the results are discussed.

# Conclusion


This is **Child Doc 6**, where I state my conclusions.

# References

The foregoing document is the knitted version of main.Rmd, i.e., main.md. Note under ## Result 1 that in my child document, 04-results.Rmd, I sourced an external R script, ./stats/analysis.R, which is now incorporated as a new knitr chunk in my knitted document; consequently, I now need to knit the document again.

When child documents also include chunks, I knit the main document into another .Rmd rather than into .md, once for each level of chunk nesting. Continuing the example above:

With knit(input = './main.Rmd', output = './main_2.Rmd'), I knit main.Rmd into a second .Rmd instead of main.md, so that I can then knit the resulting file, which contains the newly imported chunks (e.g., the one that sources my R script analysis.R above).
I can now knit main_2.Rmd into main.md or render it as main.html via rmarkdown::render(input = './main_2.Rmd', output_file = './main.html').

Note: in the example above of main.md, the path to my R script is ../stats/analysis.R. This is the path relative to the child document that sourced it, ./sections/04-results.Rmd. Once I import the child document into the main document located at the root of my_directory, i.e., ./main.md or ./main_2.Rmd, the path becomes wrong; I therefore must correct it manually to ./stats/analysis.R before the next knit.
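
If correcting the path by hand becomes tedious, the substitution could be done when the contents are printed. This is only a sketch under assumptions: fix_relative_paths() is a hypothetical helper of mine, and it performs a blunt literal replacement of one prefix, so it suits only simple layouts like the example directory above.

```r
# Hypothetical helper: rewrite a path prefix that was relative to ./sections
# so that it is relative to the directory of the main document instead.
fix_relative_paths <- function(contents, from = "../stats/", to = "./stats/") {
  # fixed = TRUE treats `from` as a literal string, so its dots are not
  # interpreted as regular-expression wildcards
  gsub(pattern = from, replacement = to, x = contents, fixed = TRUE)
}

# Made-up stand-in for one line of rmd[[4]][["contents"]]
child <- "source(file = '../stats/analysis.R')"
fixed <- fix_relative_paths(child)
cat(fixed, sep = "\n")
```

In main.Rmd, the 'results' chunk would then call cat(x = fix_relative_paths(rmd[[4]][["contents"]]), sep = "\n") instead of printing the contents directly.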

I mentioned above that it is best to save the child documents in the same order in which they are imported into the main document. This is because my simple function extraction() stores the contents of all the files it is given in an unnamed list, so I must access each file in main.Rmd by position; e.g., rmd[[5]][["contents"]] refers to the child document ./sections/05-discussion.Rmd. Consider:

> str(rmd)
List of 6
 $ :List of 4
  ..$ title     : chr "child doc 1"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 1**, my abstract."
 $ :List of 4
  ..$ title     : chr "child doc 2"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 2**, my introduction.\n\n- Point 1\n- Point 2\n- Point *n*"
 $ :List of 4
  ..$ title     : chr "child doc 3"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 3**, my \"Methods\" section.\n\n| method 1 | method 2 | method *n* |\n|--------------|--------------|----"| __truncated__
 $ :List of 4
  ..$ title     : chr "child doc 4"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 4**, my \"Results\" section.\n\n## Result 1\n\n```{r}\nlibrary(knitr)\n```\n\n```{r, cache = FALSE}\nsour"| __truncated__
 $ :List of 4
  ..$ title     : chr "child doc 5"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 5**, where the results are discussed."
 $ :List of 4
  ..$ title     : chr "child doc 6"
  ..$ layout    : chr "default"
  ..$ etc       : chr "etc"
  ..$ contents: chr "\nThis is **Child Doc 6**, where I state my conclusions."

So extraction() here actually stores both the R Markdown contents of the specified child documents and their YAML, in case you have a use for the latter as well (I do).
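
One way to stop depending on position might be to name the list after the files it came from. The sketch below uses small made-up stand-ins for doc_list and rmd (two documents, with fewer fields than the real output) so that it runs without any files. In practice doc_list would come from the same list.files() call that extraction() makes.

```r
# Stand-in for list.files(path = "./sections", pattern = "\\.Rmd$",
# full.names = TRUE); only two of the six files, to keep the example short
doc_list <- c("./sections/01-abstract.Rmd", "./sections/05-discussion.Rmd")

# Stand-in for the list returned by extraction(matter = "rmd", ...)
rmd <- list(
  list(title = "child doc 1",
       contents = "\nThis is **Child Doc 1**, my abstract."),
  list(title = "child doc 5",
       contents = "\nThis is **Child Doc 5**, where the results are discussed.")
)

# Name each element after its file, minus directory and extension
names(rmd) <- tools::file_path_sans_ext(basename(doc_list))

# Access by name instead of by position
cat(rmd[["05-discussion"]][["contents"]], sep = "\n")

# The YAML fields stay available too, e.g. every title at once
titles <- vapply(rmd, function(doc) doc[["title"]], character(1))
titles
```

With names in place, inserting or reordering child documents would no longer shift the indices used in main.Rmd.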