R Markdown Guide: Code Chunks, Knitting & Reports
Data scientists and statistical analysts often need to present research findings alongside the code that produced them. R Markdown serves this need by combining narrative text, executable R code, and plots in a single file.
This 2,000-word guide teaches you how to format a YAML metadata header, write code chunks, and compile an R Markdown PDF or HTML document.
1. Core Architecture of R Markdown
An R Markdown file (extension .Rmd) combines three main parts:
- YAML Metadata Header: Declared at the top of the file and enclosed in triple dashes (`---`). It sets document parameters such as the title and output format.
- Markdown Syntax: Pandoc's Markdown syntax, used for headings, lists, bold text, and hyperlinks.
- Code Chunks: Execution blocks that run R, Python, or SQL code and print results or plots directly inside the document.
This structure enables reproducible research. By keeping analysis code and narrative text in a single document, other researchers can run the file and verify the results. This approach reduces errors compared to copy-pasting numbers from spreadsheets into word processors.
In practice, R Markdown is widely used for reporting in research groups, business intelligence teams, and academic labs. The source file is plain text, so it works well with version control systems like Git.
2. Formatting YAML Headers
The YAML header defines document metadata. It must sit at the top of the file. Below is a table detailing common YAML keys:
| YAML Key | Description | Example Value |
|---|---|---|
| title | The title of the generated document | "Annual Sales Analysis" |
| author | The name of the author | "John Smith" |
| date | Publishing date | "2026-06-11" |
| output | Target format (HTML, PDF, Word) | pdf_document or html_document |
YAML is indentation-sensitive. You must use spaces (never tabs) to align sub-properties. For example, to add a floating table of contents to an HTML document:
```yaml
---
title: "Quarterly Analysis"
output:
  html_document:
    toc: true
    toc_float: true
    theme: united
---
```
Incorrect indentation can cause the compiler to fail during knitting.
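Because tab characters are easy to miss in an editor, a quick scan of the header can catch them before knitting. The following is a minimal sketch in base R; `find_yaml_tabs()` is a hypothetical helper written for this guide, not part of knitr or rmarkdown, and it assumes the header is delimited by the first two `---` lines.

```r
# Hypothetical helper: return the line numbers inside the YAML header
# (between the first two '---' lines) whose indentation contains a tab.
find_yaml_tabs <- function(rmd_lines) {
  delims <- which(grepl("^---\\s*$", rmd_lines))
  if (length(delims) < 2 || delims[1] != 1) {
    stop("No YAML header found at the top of the file")
  }
  # Indices of the lines strictly between the two delimiters
  header_idx <- delims[1] + seq_len(delims[2] - delims[1] - 1)
  # Keep lines whose leading whitespace includes a tab character
  header_idx[grepl("^ *\t", rmd_lines[header_idx])]
}

# Illustrative input: line 5 is indented with a tab instead of spaces
example_lines <- c(
  "---",
  "title: \"Quarterly Analysis\"",
  "output:",
  "  html_document:",
  "\ttoc: true",
  "---",
  "Body text"
)

tab_lines <- find_yaml_tabs(example_lines)
print(tab_lines)
```

For a real file, the input would typically come from `readLines()` applied to the `.Rmd` path.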
3. Fenced Code Chunks & Options
Code chunks execute code and display the results. Declare a code chunk using three backticks followed by curly braces containing the language key:
```{r chunk-name, echo=FALSE, warning=FALSE}
# R code runs here
summary(cars)
```
You can configure chunk behavior using options separated by commas:
- echo=TRUE / FALSE: Controls whether the raw code is displayed in the final document. Use `FALSE` for reports intended for business stakeholders.
- eval=TRUE / FALSE: Controls whether the code inside the chunk is executed. Use `FALSE` to show example code without running it.
- warning=TRUE / FALSE: Controls whether warning messages appear in the final report. Use `FALSE` to hide them and keep the layout clean.
- message=TRUE / FALSE: Controls whether package loading logs are displayed.
- fig.width / fig.height: Sets the dimensions of generated plots in inches.
In addition to R, you can execute code in other languages (such as Python or SQL) by specifying the language key in the curly braces: {python} or {sql, connection=db}.
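Repeating the same options on every chunk becomes tedious in longer reports. One common approach, sketched here, is to set document-wide defaults with `knitr::opts_chunk$set()` in a setup chunk near the top of the file; the specific values are illustrative assumptions, not recommendations from this guide.

```r
library(knitr)

# Set defaults for every chunk that follows in the document.
# Options written in an individual chunk header still override these.
opts_chunk$set(
  echo = FALSE,     # hide source code in the output
  warning = FALSE,  # hide warnings
  message = FALSE,  # hide messages such as package startup notes
  fig.width = 6,    # plot width in inches
  fig.height = 4    # plot height in inches
)
```

A single chunk can then opt back in, for example by adding `echo=TRUE` to its own header.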
4. Inline Code Execution
Sometimes you need to insert variables directly inside a sentence. R Markdown supports inline code execution using single backticks starting with the letter `r`:
We analyzed `r nrow(dataset)` samples, and the average value was `r mean(dataset$value)`.
When the document is knitted, these tags are replaced with the computed values. For example: *"We analyzed 150 samples, and the average value was 42.5."*
This feature ensures your narrative text matches your analysis data. If the dataset updates, knitting the document again automatically updates all inline numbers.
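An unrounded mean can print with many decimal places, so it often helps to format the value inside the inline expression or in a chunk beforehand. The sketch below uses a small made-up `dataset` (an assumption for illustration only) to show what the two inline expressions would evaluate to.

```r
# Made-up data for illustration; a real report would load its own dataset
dataset <- data.frame(value = c(40, 42.5, 45))

# Values the inline expressions would insert into the sentence
n_samples <- nrow(dataset)
avg_value <- round(mean(dataset$value), 1)

# Equivalent of the knitted sentence, built with sprintf for comparison
sentence <- sprintf(
  "We analyzed %d samples, and the average value was %.1f.",
  n_samples, avg_value
)
print(sentence)
```

In the `.Rmd` source, the same rounding can go directly inside the inline tag, for example by wrapping `mean(dataset$value)` in `round(..., 1)`.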
5. Knitting Reports (HTML, PDF, Word)
To compile the report, click the **Knit** button in RStudio. Under the hood, RStudio calls `rmarkdown::render()`, which runs these steps:
- The **knitr** package runs all R code chunks and generates a standard markdown (.md) document containing the outputs.
- The **Pandoc** converter parses the markdown file and compiles it into your target format (HTML, PDF, or Word).
- If compiling to a PDF document, Pandoc first writes a LaTeX file, which a TeX distribution on your system then compiles into the final PDF.
The separation of knitr and Pandoc is an important aspect of R Markdown's design. Knitr processes the code chunks, while Pandoc handles the final styling and document compilation. This separation allows you to export the same source file to multiple output formats.
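The same pipeline can be started from the R console without the Knit button. This is a minimal sketch that assumes the rmarkdown package and Pandoc are installed; the file name `demo-report.Rmd` and its contents are hypothetical and written to a temporary directory so the example is self-contained.

```r
library(rmarkdown)

# Build a tiny R Markdown file in a temporary directory (hypothetical content)
fence <- strrep("`", 3)
rmd_path <- file.path(tempdir(), "demo-report.Rmd")
writeLines(c(
  "---",
  "title: \"Demo Report\"",
  "output: html_document",
  "---",
  "",
  "A short summary of the built-in cars dataset:",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# render() runs knitr and then Pandoc, and returns the output file path
html_path <- render(rmd_path, output_format = "html_document", quiet = TRUE)
print(basename(html_path))
```

Passing a different format name such as `"word_document"` to `output_format` retargets the same source file without touching the analysis code.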
By defining specific parameters in your YAML header, you can configure how Pandoc compiles the document. For instance, you can specify custom CSS stylesheets for HTML outputs, or add LaTeX styling packages for PDF generation. This setup gives you control over the page layout, font selections, headers, and footers without changing the underlying analysis code.
For example, you can add a custom LaTeX package in your YAML header to change page geometry margins:
```yaml
---
title: "Quarterly Analysis"
output: pdf_document
header-includes:
  - \usepackage{geometry}
  - \geometry{margin=1in}
---
```
6. Common Pitfalls & Troubleshooting
When writing R Markdown files, researchers often encounter these common compilation errors:
- Indentation Errors in YAML: Using tabs instead of spaces or misaligning keys will cause the YAML header to fail to parse. If your document does not compile, check the header's indentation first.
- Missing TeX Distribution: Compiling to PDF requires a LaTeX engine. If LaTeX is not installed, knitting will fail. We recommend installing TinyTeX from R to resolve this: `tinytex::install_tinytex()`.
- Variables Defined Out of Order: Knitr runs chunks sequentially from top to bottom. If a chunk references a variable defined in a later chunk, the build will fail. Load libraries and define variables before the chunks that use them.
- Directory Path Issues: By default, knitr runs code chunks relative to the directory containing the `.Rmd` file. If your code references external data files using relative paths, verify that the files exist in the same folder or configure the working directory settings.
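Some of these pitfalls can be checked before knitting. The sketch below is a rough pre-flight check in base R; `preflight_check()` and the file names are hypothetical, and looking for `pdflatex` on the PATH is only an approximation of whether a LaTeX engine is available.

```r
# Hypothetical helper: collect problems that commonly break knitting.
# Returns a character vector of messages (empty if nothing was found).
preflight_check <- function(rmd_path, data_files = character(), need_pdf = FALSE) {
  if (!file.exists(rmd_path)) {
    return(sprintf("R Markdown file not found: %s", rmd_path))
  }
  problems <- character()
  # knitr resolves relative paths from the directory containing the .Rmd file
  rmd_dir <- dirname(normalizePath(rmd_path))
  for (f in data_files) {
    if (!file.exists(file.path(rmd_dir, f))) {
      problems <- c(problems, sprintf("Missing data file: %s", f))
    }
  }
  # Rough check only: a LaTeX engine may exist under another name or location
  if (need_pdf && !nzchar(Sys.which("pdflatex"))) {
    problems <- c(problems, "pdflatex not found on PATH")
  }
  problems
}

# Illustrative setup: one data file exists next to the report, one does not
demo_dir <- file.path(tempdir(), "preflight-demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_rmd <- file.path(demo_dir, "report.Rmd")
writeLines(c("---", "title: \"Demo\"", "---"), demo_rmd)
writeLines(c("x", "1"), file.path(demo_dir, "sales.csv"))

issues <- preflight_check(demo_rmd, data_files = c("sales.csv", "missing.csv"))
print(issues)
```

If the data lives outside the report's folder, knitr's `opts_knit$set(root.dir = ...)` option is one way to change the directory that chunks run in.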
To prevent errors, run chunks individually in your editor before knitting the entire document. This practice helps isolate code errors quickly.