Question

I'd like to know how to compile multiple pandoc Markdown files into one output document, where each input file has its own title block.

E.g. suppose I have two files:

ch1.md:

% Chapter 1
% John Doe
% 1 Jan 2014
Here is chapter 1.

ch2.md:

% Chapter 2
% Jane Smith
% 3 Jan 2014
Here is chapter 2.

Typically with multiple input files you can compile them by providing them to pandoc:

pandoc ch1.md ch2.md --standalone -o output.html

However, pandoc concatenates the input files before compiling, so only the first title block (from ch1.md) is styled appropriately. I would like every title block to be styled appropriately (e.g. in HTML, the first line of the title block is styled with <h1 class="title">, the second with <h2 class="author">, and so on).

(Note: I have also tried compiling each chapter as standalone separately, then concatenating these together using pandoc. This removes the title styling for chapters after 1, though keeps styling for the authors/date).

Why? So that I can:

  • compile each chapter in its own separate document and the author/title/date is marked up appropriately
  • compile the entire document together and author/title/date is marked up appropriately for each chapter (can use the --chapters option, which newer pandoc versions replace with --top-level-division=chapter)

I could just specify the title with '#' (h1), author with '##' (h2), and date with '###' (h3) in each chapter file directly, but then pandoc doesn't "know" what the title/author/date of my document are, so (e.g.) if I compile to LaTeX it won't use the \date{} or \author{} tags appropriately.


Solution

I wrote a pandoc filter that, when run on each individual chapter's file, inserts the title block as headings (level 1 for title, level 2 for author, level 3 for date). This is what the HTML writer does.

This lets you run pandoc on each chapter individually (to produce the pandoc'd output plus the formatted title block), and then run pandoc on all the chapters together to compile the single document.

The filter is here on gist (I take no responsibility for malfunctioning code, etc): https://gist.github.com/mathematicalcoffee/e4f25350449e6004014f

You could modify it if you wanted the title block formatted differently. For example, as written, the author and date appear in the table of contents because they are headings, which is not quite right (but that's a different problem, as it happens with the default HTML writer too).
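To make the idea concrete, here is a minimal sketch of such a filter in Python. It is not the code from the gist above (that one is written in Haskell). It assumes the JSON layout used by pandoc 1.18 and later (a top-level object with "meta" and "blocks"), it only handles title blocks whose fields are plain inline text, and the file name chapter.py and all function names are made up for illustration. Removing the fields from the metadata after turning them into headings is also a choice of this sketch.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: turn the title block into headings."""
import json
import sys


def meta_inlines(value):
    """Return the inline list of a MetaInlines/MetaString value, else []."""
    if value is None:
        return []
    if value["t"] == "MetaInlines":
        return value["c"]
    if value["t"] == "MetaString":
        return [{"t": "Str", "c": value["c"]}]
    return []  # other metadata shapes are ignored in this sketch


def header(level, inlines):
    """Build a pandoc Header block with empty attributes."""
    return {"t": "Header", "c": [level, ["", [], []], inlines]}


def title_block_to_headers(doc):
    """Prepend title/author/date headings to the document body.

    Assumption of this sketch: the three fields are removed from the
    metadata once they have been turned into headings.
    """
    meta = doc.get("meta", {})
    new_blocks = []

    title = meta_inlines(meta.pop("title", None))
    if title:
        new_blocks.append(header(1, title))

    authors = meta.pop("author", None)
    if authors is not None:
        # Several authors arrive as a MetaList; a single one may not.
        items = authors["c"] if authors["t"] == "MetaList" else [authors]
        for item in items:
            inlines = meta_inlines(item)
            if inlines:
                new_blocks.append(header(2, inlines))

    date = meta_inlines(meta.pop("date", None))
    if date:
        new_blocks.append(header(3, date))

    doc["blocks"] = new_blocks + doc.get("blocks", [])
    return doc


def main():
    # pandoc pipes the document to the filter as JSON on stdin.
    raw = "" if sys.stdin.isatty() else sys.stdin.read()
    if raw.strip():
        json.dump(title_block_to_headers(json.loads(raw)), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved as an executable chapter.py, it would be passed to pandoc with --filter ./chapter.py in the same way as the Haskell filter.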

My workflow is now something like this:

FORMAT=latex  # as understood by -t <format> in pandoc
FLAGS=--toc   # other flags for pandoc, --smart, etc
OUT=pdf       # output extension
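# $FLAGS is left unquoted on purpose so that several flags split into words
for f in Chapter*.md; do
    pandoc $FLAGS -t "$FORMAT" --filter ./chapter.hs "$f"
    echo ""   # blank line between chapters
done | pandoc $FLAGS --standalone -o "thesis.$OUT"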

where I've made the filter executable (chmod +x chapter.hs) and it's in the current directory.

(I additionally have a title.txt that I put at the front, containing the title block for the entire thesis, as opposed to each chapter's title block.)
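For anyone who prefers to drive this from a script, the same loop could be sketched in Python roughly as follows. This is an illustrative port, not part of the original workflow: the function names are invented, the defaults simply mirror the FORMAT, FLAGS and OUT variables above, and it assumes pandoc is on the PATH.

```python
import subprocess


def chapter_command(path, fmt="latex", flags=("--toc",),
                    filter_path="./chapter.hs"):
    """Return the pandoc argument list that converts one chapter."""
    return ["pandoc", *flags, "-t", fmt, "--filter", filter_path, path]


def build(chapters, out="thesis.pdf", fmt="latex", flags=("--toc",)):
    """Convert each chapter, then compile the joined output in one run."""
    parts = [
        subprocess.run(chapter_command(c, fmt, flags), check=True,
                       capture_output=True, text=True).stdout
        for c in chapters
    ]
    # A blank line between chapters, like the `echo ""` in the shell loop.
    subprocess.run(["pandoc", *flags, "--standalone", "-o", out],
                   input="\n\n".join(parts), check=True, text=True)
```

Calling build(sorted(glob.glob("Chapter*.md"))) after importing glob would then process the chapters in name order.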

I received some help from the pandoc google group which was great.

OTHER TIPS

You can't do this with the % title blocks, but you can do it with the new YAML title blocks.

Start each document like this:

---
title:  Chapter One
author:  Me
date: June 4
...

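When the documents are concatenated together, the first value set takes precedence over the others, so subsequent YAML lines that set the same field (e.g. "title:") are ignored. (See the README under "Extension: yaml_metadata_block".) Note that this describes pandoc at the time of writing; newer versions take the value from the later block instead, so check the manual for the version you have.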
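The "first value wins" rule described above can be pictured with a small Python model. This only illustrates the merge rule as stated in this answer; it is not pandoc's own code, and the second chapter's values are made up.

```python
def merge_metadata(blocks):
    """Combine metadata blocks, keeping the first value seen per field."""
    merged = {}
    for block in blocks:
        for key, value in block.items():
            merged.setdefault(key, value)  # later duplicates are ignored
    return merged


ch1 = {"title": "Chapter One", "author": "Me", "date": "June 4"}
ch2 = {"title": "Chapter Two", "author": "You"}  # hypothetical second file

print(merge_metadata([ch1, ch2]))
# {'title': 'Chapter One', 'author': 'Me', 'date': 'June 4'}
```

In this model the second chapter's title and author never reach the merged metadata, because both fields were already set by the first block.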

Licensed under: CC-BY-SA with attribution