I've been excited about writing my research in Rmarkdown for something like five years now, and have written basically all my academic work with it. The biggest project was my dissertation thesis, a monograph written entirely in Rmd. When writing something that large, or, actually, anything you might want to split into different sections, you should definitely check out a recent development in the Rmarkdown universe, namely, the Bookdown package. In this post, I'll go through my basic setup of a new bookdown project.
1. Setting up the project folder
I start with a fresh new folder with the following structure:
bookdown/
├── 01-intro.Rmd
├── aux
│ ├── preamble.tex
│ ├── refs.bib
│ └── utaltl.csl
├── _bookdown.yml
└── _output.yml
Bookdown will try to get all the rmd files in the project folder and turn them into a single document. To make sure you have the right order of files in the final output, it's safest to prefix the files with numbers.
The configuration of bookdown is done through the two yaml
files, _bookdown.yml
and'_output.yml
AND with a yaml block
at the beginning of the first file of the project (in this case, 01-intro.Rmd
).
Basic configuration
In a project I recently started, this is what the yaml block
in 01-intro.Rmd
looks like:
latexBackend: linguex
exampleRefFormat: "{}"
title: "Last year but not yesterday? Explaining differences in the locations of Finnish and Russian time adverbials using comparable corpora"
author: Juho Härme
bibliography: aux/refs.bib
link-citations: true
The options latexBackend and exampleRefFormat are specific to my projects: they have to do with using linguistic examples in the text, which I do with my own fork of a pandoc filter called pangloss. These lines should be left out if such a filter is not used.
Bibliography management and different outputs, output modification using cls
or bib(la)tex and all those kinds of factors definitely deserve
a separate post. For simplicity's sake, here I only
use the default settings and simply define the location of the .bib
file containing my references in the format of a bibtex database.
The file _bookdown.yaml
is pretty simple, in this
project I only have one line:
delete_merged_file: true
Output configuration
When writing Rmarkdown files, you can specify multiple output
formats in _output.yml
and if you compile your
book with the render_book function,
without additional arguments, all those output formats will be produced.
Here, I have specified two output formats, html and pdf (produced with latex). The html is something I use when sketching and building the article / book. At that stage I usually have the pdf parts commented out. When the work is getting closer to be finished, I tend to switch to the pdf output.
Here's what my _output.yml
looks like
bookdown::gitbook:
bookdown::pdf_book:
toc: true
toc_depth: 6
includes:
in_header: aux/preamble.tex
latex_engine: xelatex
keep_tex: true
pandoc_args:
- --filter
- pangloss
- --filter
- pandoc_avm
Auxiliary files
As can be seen in the folder structure above, I use a separate folder called aux to store
my auxiliary files in. These include the bibliography database, possibly some additional pandoc
filters, cls files and the like. One especially important file is the preamble.tex
, which
loads all the needed latex packages and adjusts the final document's layout.
Compiling the book
I will probably write something about how to do this in Rstudio later on. Here's how the compilation of the book happens in R terminal.
First, make sure to (install and) load the bookdown library
library(bookdown)
Then, just run the r render_book
function by specifying at least one file to be included
in the book:
render_book("01-intro.Rmd")
I actually tend to have a separate .Rprofile
file
which includes all the libraries that need to be loaded
etc. Place the file in the project's directory and
you'll have you workspace set up the way you want it.
After the compilation, you'll see that
the final output will end up in a folder called _book
which also includes a whole bunch
of auxiliary files, especially for the gitbook type of html format.
Here's the structure of my bookdown folder
after the compilation process:
bookdown/
├── 01-intro.Rmd
├── aux
│ ├── preamble.tex
│ ├── refs.bib
│ └── utaltl.csl
├── _book
│ ├── a-real-section.html
│ ├── introduction.html
│ ├── libs
│ │ ├── gitbook-2.6.7
│ │ │ ├── css
│ │ │ │ ├── fontawesome
│ │ │ │ │ └── fontawesome-webfont.ttf
│ │ │ │ ├── plugin-bookdown.css
│ │ │ │ ├── plugin-fontsettings.css
│ │ │ │ ├── plugin-highlight.css
│ │ │ │ ├── plugin-search.css
│ │ │ │ └── style.css
│ │ │ └── js
│ │ │ ├── app.min.js
│ │ │ ├── jquery.highlight.js
│ │ │ ├── lunr.js
│ │ │ ├── plugin-bookdown.js
│ │ │ ├── plugin-fontsettings.js
│ │ │ ├── plugin-search.js
│ │ │ └── plugin-sharing.js
│ │ └── jquery-2.2.3
│ │ └── jquery.min.js
│ ├── _main.pdf
│ ├── _main.tex
│ └── search_index.json
├── _bookdown.yml
└── _output.yml
The actual pdf output is the file _book/_main.pdf
. For the html
output, just pick the name of the first html file, in this case
introduction.html
.