In today's episode of Code Club, Pat Schloss discusses and demonstrates the tradeoffs between putting compute steps in an R Markdown document versus in a separate R script to speed up rendering of the document. Basically, where should the heavy lifting of your analysis be done? Compute steps are often slow and get in the way of efficiently writing and formatting a document. An effective alternative is to run them in a separate script that writes out data, which can then be loaded into the R Markdown document or into another script that generates a plot.
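To make the idea concrete, here is a minimal sketch of the pattern in base R. It is not the code from the episode: the function name `slow_summary`, the output file name, and the toy computation are all hypothetical stand-ins, and the episode itself may use different tools for reading and writing data.

```r
# Sketch only: the function name, file name, and toy computation below are
# hypothetical and are not taken from the episode.

# --- Part 1: what would live in a standalone R script ---
slow_summary <- function(n_reps = 100, n_obs = 1000) {
  # Stand-in for an expensive step: the mean of many random samples
  set.seed(1)
  data.frame(
    replicate = seq_len(n_reps),
    mean_value = vapply(
      seq_len(n_reps),
      function(i) mean(rnorm(n_obs)),
      numeric(1)
    )
  )
}

# Write the result to a tab-separated file (a temp directory is used here
# so the sketch runs anywhere; a real project would use a project path)
output_file <- file.path(tempdir(), "slow_summary.tsv")
result <- slow_summary()
write.table(result, output_file, sep = "\t", quote = FALSE, row.names = FALSE)

# --- Part 2: what a chunk in the R Markdown document would do instead ---
# Reading the saved file is fast, so rendering no longer repeats the slow step
loaded <- read.delim(output_file)
summary(loaded$mean_value)
```

With this split, the slow step only needs to be rerun when the script or its inputs change, while the document can be re-rendered as often as needed while writing and formatting.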
This episode is part of a larger arc of episodes investigating the sensitivity and specificity of amplicon sequence variants (ASVs), also known as exact sequence variants (ESVs). ASVs are growing in popularity for analyzing microbial communities using 16S rRNA gene sequences. Proponents think that they should supplant operational taxonomic units (OTUs). What do you think? Pat demonstrates these concepts by live coding at the command line interface using GitHub Flow, Make, and RStudio.
The accompanying blog post can be found at http://www.riffomonas.org/code_club/2...
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: http://www.riffomonas.org/minimalR/
General data: http://www.riffomonas.org/generalR/
0:00 Introduction
3:21 Outlining threshold problem
8:22 Creating function
14:36 Replicating function and processing data
18:33 Converting into an executable script
26:28 Importing and processing data in R Markdown
34:46 Conclusion