Unfortunately, I haven’t had as much time to make blog posts in the past year or so. I started taking classes as part of Georgia Tech’s Online Master of Science in Analytics (OMSA) program last summer (2018) while continuing to work full-time, so extra time to code and write hasn’t been abundant for me.
Anyways, I figured I would share one neat thing I learned as a consequence of taking classes: writing compact “cheat sheets” with `{rmarkdown}`. [1]
Writing with `{rmarkdown}` is fairly straightforward, mostly thanks to an abundance of freely available learning resources like R Markdown: The Definitive Guide, and using CSS to customize your R Markdown output to your liking is not too difficult either. (By the way, huge shout-out to Yihui Xie and everyone else who has contributed to the development of the `{rmarkdown}` package.)

My objective was to make an extremely compact PDF that minimizes all white space. [2] Despite my knowledge of CSS, I had a hard time getting an output that I liked purely from CSS, so I looked online to see if I could find some good LaTeX templates. (After all, I would be knitting the R Markdown document to PDF, and LaTeX would be incorporated via the equations on the cheat sheet.) Some templates I found worked fine but weren’t completely to my liking. [3]
In my search for an “ideal” template, I stumbled upon a small tidbit in the very last portion of the PDF chapter of the R Markdown book stating “You can also replace the underlying pandoc template using the template option”. 🤔
At first, I was a bit intimidated by the idea of writing my own template. (“I have to write my own template from scratch using a framework (LaTeX) that I’ve hardly even touched before now! 😨”) But alas, the task became less intimidating when I realized that I could use the tried-and-true method of copying-pasting-modifying from Stack Overflow!
The Template
Using the template from this Stack Overflow post [4] as a basis, I ended up creating a relatively minimal template. For the curious reader, see this GitHub repo for the latest version of my template. It also includes an example cheat sheet.
The “gist” of my template is shown below.
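As a rough sketch of the idea (not the exact contents of the repo's template), a stripped-down pandoc LaTeX template might look like the following. The three-column layout, the margin size, and the package list are illustrative assumptions; pandoc substitutes the converted document content wherever `$body$` appears.

```latex
% Minimal pandoc LaTeX template (illustrative sketch; values are assumptions)
\documentclass[10pt]{article}
\usepackage[margin=0.5in]{geometry} % small margins to save space
\usepackage{multicol}               % provides the multicols environment
\usepackage{amsmath}                % equation support
\usepackage{graphicx}               % needed if the document includes images
\usepackage{hyperref}               % pandoc may emit \hypertarget and \href

% pandoc emits \tightlist inside compact lists, so it must be defined
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}

\pagestyle{empty}                   % no page numbers

\begin{document}
\begin{multicols}{3}
$body$
\end{multicols}
\end{document}
```

A template this small leaves out pieces of pandoc's default template such as `$highlighting-macros$`, so a document that echoes syntax-highlighted code chunks would likely need those added back.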
The key for me was to understand how pandoc variables like `$body$` are used as placeholders for user-supplied content. (I know I haven’t mentioned pandoc up to this point, but suffice it to say that it, along with the R package `{knitr}`, is what powers the `{rmarkdown}` package.)
The `multicols` environment shown in the snippet above is also noteworthy. This LaTeX environment (from the `multicol` package) provides the functionality I wanted most for my cheat sheet: multiple columns of content!

I should point out that there are `in_header`, `before_body`, and `after_body` YAML options for customizing PDF output with `{rmarkdown}`. [5] These options are probably sufficient for most people’s customization needs (so using a custom template would not be necessary). But for me personally, the appeal of having “complete” control of my output by using a template convinced me to forgo these options. [6]
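For reference, these three options sit under `includes` in the YAML header. A minimal sketch, where the three file names are hypothetical placeholders:

```yaml
---
output:
  pdf_document:
    includes:
      in_header: preamble.tex   # hypothetical file: extra LaTeX for the preamble
      before_body: before.tex   # hypothetical file: LaTeX placed before the content
      after_body: after.tex     # hypothetical file: LaTeX placed after the content
---
```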
Usage
So, exactly how do you use a custom template with `{rmarkdown}`? It’s as simple as specifying the path to your template file with the `template` option in the YAML header of your R Markdown document. [7]
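A minimal header along these lines might look as follows. The title and the template file name are assumptions made for illustration; the path is resolved relative to the R Markdown document.

```yaml
---
title: "Cheat Sheet"
output:
  pdf_document:
    template: template.tex   # hypothetical path to the custom template
---
```

Note that a title only appears in the output if the template itself uses the `$title$` variable.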
Why This Way?
Before I was using RStudio and `{rmarkdown}` to write my cheat sheets, I tried out a couple of LaTeX editors. [8] First, I tried the very popular Overleaf. It is well known and commonly used because it is web-based, allows the user to collaborate in real-time, and provides real-time previewing. [9] However, there was just something that felt “clunky” about the editor, and the ambiguity over package versions and usage was bothersome to me. [10] The other editor I tried for some time was TeXworks (with the pdftex distribution). Using the “Typeset” command to generate my PDF output on an ad-hoc basis seemed to me to be a satisfactory workflow, but, among other things, I felt limited by the customization offered by TeXworks. [11]
And so I turned to RStudio and `{rmarkdown}` and didn’t look back. While learning how to create a custom template was a (minor) inconvenience, it has paid off in a number of ways:
- I can use a familiar editor: RStudio.
- I can use a familiar workflow: writing in an R Markdown document and `knit`ting to create my desired output.
- Because I’m using `{rmarkdown}`, I can use `{rmarkdown}` functionality that is not available when solely writing in LaTeX.
This last point is huge. The whole world of markdown syntax is valid! For example, I can add emphasis to text with markdown’s `**` and `__` tokens (instead of LaTeX’s more “verbose” syntax); I can use `#` to define section headers (which I just think is super “elegant”); and I can use HTML comments to comment out multiple lines of text. (Note that native LaTeX only has a single-line comment token, `%`. [12]) Additionally, beyond just the markdown functionality, I can include `R` code thanks to the added layer of functionality offered by `{rmarkdown}`.
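To make this concrete, here is a small made-up fragment of an R Markdown document that combines these features. The content is purely illustrative and is not taken from my actual cheat sheets.

```markdown
# Probability

**Bayes' rule** relates __conditional__ probabilities:

$$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$$

<!--
Everything in this HTML comment,
across multiple lines, is left out of the output.
-->

The built-in `mtcars` data set has `r nrow(mtcars)` rows.
```

The last line uses inline R code, which is evaluated when the document is knit.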
The one big thing that I feel like I “sacrificed” by moving to RStudio and `{rmarkdown}` is the live preview feature that comes with Overleaf (and can be emulated with some configuration in other LaTeX editors). Nonetheless, I feel like I get a reasonable facsimile of this feature with RStudio’s functionality for inline previews of equations. [13] Below are examples of the preview capabilities for both single- and multi-line equations.
What Works for Me May Not Work For You
Although what I’ve described in this post has been working well for me (and I’d encourage others to try it out), I don’t claim it to be the “best” solution for all of your cheat sheet needs. [14] If you’ve got a workflow that works for you, that’s great! Keep using it! Be pragmatic.
1. For those unfamiliar with the concept of a cheat sheet, there’s no malice in it, despite what the moniker implies. From my experience, it is relatively common for teachers to let students use self-created note sheets (i.e. cheat sheets) for aid with taking exams.
2. in order to maximize the amount of space used for content, of course
3. One of the ones that I really liked was this one. However, it’s a bit more complex than I wanted. (This one implements a “structure” in which one “main” tex file references several others with the `\input` LaTeX command.)
4. which was super helpful for a LaTeX noob like me because it has comments explaining what specific lines/sections are doing
5. See the PDF chapter of the R Markdown book for some guidance with these.
6. I’m sure I could create a perfectly fine cheat sheet using just these options, or even re-create the output that I have achieved with my template.
7. You can specify other options as well, such as `keep_tex: true` to keep the intermediate .tex file, or an alternative LaTeX engine with `latex_engine`.
8. and there are lots of them out there
9. The live preview feature is probably my favorite of all.
10. Others may view the hands-off approach to package management as an advantage of using Overleaf.
11. Perhaps this is the fault of my own. Perhaps all the customization that I would like exists and I just have not discovered how to enable it.
12. I realize that you can define custom commands or use a package to create multi-line comments in LaTeX, but that ruins the point that I’m trying to make 😊.
13. See the “Show equation and image previews” option in Tools > Global Options… > R Markdown.
14. I wouldn’t be surprised if I find a better workflow for myself in the future.
Complete, neat and thorough documentation of our research is something that we probably all aim to achieve. In the wet lab, lab notebooks are essential and some labs are migrating to online versions like LabArchives. (Update 27-9-17: LabArchives now supports Markdown Syntax!) For bioinformaticians, documentation of code commonly goes on GitHub. However, as a biologically-trained student entering the realm of bioinformatics it was not always clear how best to document my analyses, as this most often involved using commands to run other people’s code rather than writing my own. I moved to writing short bash scripts to run different tools, but there was still an awful lot I wanted to write down regarding what I was learning as I went, not to mention the importance of recording everything that went wrong.
In the past
Initially, I was using LabArchives for this documentation, where I would copy over the commands I had run and attach output files or screenshots of output. This was okay for a while; it had the benefit of timestamped entries and I highly recommend it for wet lab experiments, protocols and as a place to start for documenting your computational stuff. However, I felt that it became a bit messy and difficult to navigate as I was exploring different methods - particularly when I started doing analysis in R as well.
I arrived at R Markdown when I was beginning to learn R. I began by using R scripts to document my R code, but soon realised that I wanted to write more about what I was doing than looked neat amongst all the `######`. I also needed to continue documenting the work I was doing on the command line, and wanted to keep everything in one place. I now use R Markdown daily to document almost every command I type. I find it a fantastic way to keep track of the different things I may be doing, including running some commands on my data, installing and learning how to use a new tool, or working through errors or problems in my data. I use a separate R Markdown document for each separate thing and name them appropriately so it is easy for me to revisit.
What is R Markdown?
My best explanation of R Markdown is that it is my primary tool for keeping thorough, reproducible documentation of my bioinformatics analyses. My poor explanation of what it actually is is that it’s a tool to generate nice-looking reports with plain text (the markdown part) and embedded code (the R part). Importantly, the document will not be created unless the R code functions, because it runs the code as it creates your report. This means you have documentation containing the exact code that works; no typos. You can also have code blocks with other languages like `bash`. knitr is the important package that supports text written in R Markdown and turns it into something that looks nice, so this is referred to as “knitting” the document. Commonly, you will `knit` your document into an HTML page or, if you have `pandoc` and a LaTeX distribution (such as MiKTeX) installed, into a PDF.
LaTeX produces the same pleasant-looking documents and is something I had intended to learn; however, I have found R Markdown/knitr much more straightforward. The downside is that you have less control over how printable pages will look.
Get started with R Markdown (on Windows)
The R Markdown website has some great tutorials, but I will admit that I haven’t watched these - I just got stuck into it. It may look tricky to learn, as the font in the plain text window looks like `code`, which may turn you off if you are unfamiliar with the command line. However, I promise that it is super easy to use and once you know the basics you just focus on the writing. Here is how I set up R Markdown as I use it each day on my Windows machine:
1. Install R and RStudio.
If you want to be able to produce R Markdown documents as PDFs, you will also need to install MiKTeX but note that you need the complete installation by using the Net Installer. (I found that the MiKTeX download would interrupt and it cannot be resumed; downloading and installing proTeXt worked better for me.)
2. Open RStudio and select to create a new R Markdown document.
3. Hit yes to install all of the R packages required for R Markdown.
In my case, `knitr` and `rmarkdown` didn’t install this way. If this happens, in the console window of RStudio type:
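The standard way to install both packages from CRAN, which is presumably what is needed here, is:

```r
# Install the two packages from CRAN if RStudio's prompt did not
install.packages(c("knitr", "rmarkdown"))
```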
Then you will need to select to open a new R Markdown document again.
4. Name your document and select the output type - as it says, you can select HTML and then later convert to PDF (if you have MiKTeX installed).
5. Load any R packages at the top
If you will be writing R code that requires you to load installed packages, you can write them at the top (underneath `knitr::opts_chunk$set(echo = TRUE)`) as `library(packagename)` like so:
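For instance, the setup chunk might contain something like this (the package names are placeholders, chosen only for illustration):

```r
knitr::opts_chunk$set(echo = TRUE)

# Example packages only; load whichever ones your analysis needs
library(ggplot2)
library(dplyr)
```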
6. Start writing!
There is a neat cheat sheet which should cover everything you need. It can be a bit much to look at, so here is a list of the syntax I use most frequently (these are in the “Pandoc’s Markdown” section):
- Headers. Begin a line with `#` and you have a level 1 header - add more `#`s for subheadings and beyond. The font size will decrease accordingly.
- Bullet points. Begin a line with `*` followed by a space to create a bullet point. Sub-points are done with `+` after a tab indent.
- Numbered lists. As simple as beginning a line with `1.` followed by a space.
- Bold and italic. Use two asterisks around the text `**like so**` for bold and `*one*` for italic.
- Inserting pictures. Do so with `![]()`, where you put the description of the picture within the square brackets and the path to the image file in parentheses.
For the code blocks where I write lines of R code, I begin a code block with three backticks followed by `{r}`. The following lines contain the code, then the block ends with three backticks on their own line. For bash code, where I record the commands I run in bash scripts and for command line tools, `{r}` becomes `bash`. In both types of code you can use `#` to begin a comment line, which is useful for providing information on what the code is doing without having to break it up into small blocks with text around them.
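Put together, a short section of a document might look like the sketch below. The heading, command, and file names are made up for illustration.

````markdown
## Align reads

Plain text describing what I am about to do and why.

```bash
# map the reads to the reference (command recorded for reference)
bwa mem reference.fa reads.fq > aligned.sam
```

```{r}
# summarise a built-in data set
summary(cars)
```
````

A block opened with plain `bash` is only displayed in the output, whereas one opened with `{bash}` would also be executed by knitr when the document is knit.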
7. Knit the document.
You can create an HTML output (viewable in an internet browser) or a PDF document. Sometimes, I don’t knit the document at all - you don’t have to.
To Knit, just hit knit! You can choose the output type if you use the little arrow beside it.
Easy! There is much more to learn, but the above features are what I most commonly use. My habit is to write the outline for what I’m planning to do and why (like a protocol), then run the commands (do the experiment) before copying them into the document and writing about what happened (writing in a lab notebook).
In fact, it’s quite reasonable to use R Markdown as a lab notebook if you do a lot of computational work; see how Tim Stuart does this. As he describes, if you do lots of analysis, having several separate R Markdown documents can also get messy so there is an advantage to going a step further to keep them all in the one place.
I hope that this was a straightforward introduction to R Markdown and that you will consider using it or something similar to document your bioinformatics analysis if you don’t do so already (or if `history > log.txt` is as far as you go). It’s good for yourself, your papers (I’m building a habit of providing a link in the paper to all my documented analysis, which I host on GitHub Pages) and your supervisors or collaborators if they would like to see exactly what you’ve done, which can be useful for all involved.
If you have any comments or questions, please email me or find me on Twitter.