Using the archive R package to read and write tar.gz and other archive files (CC250)

Published: 22 September 2022
on channel: Riffomonas Project

The archive R package allows you to read and write tar.gz and other archive files from within R, so you don't have to run command-line tools to do the extraction. This is convenient because it lets you incorporate these functions into your dplyr pipelines. In this episode Pat demonstrates how to use the functions from the package and then attempts to extract 122,000 files from a large archive. The overall goal of this project is to highlight reproducible research practices using a number of tools. The specific output from this project will be a map-based visual that shows the level of drought across the globe.
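As a rough illustration of the workflow described above, here is a minimal sketch, not the code from the episode. It assumes the archive and readr packages are installed, and it uses R's built-in mtcars and iris datasets and the file name example.tar.gz purely as stand-ins.

```r
library(archive)
library(readr)

# Work in a temporary directory so the sketch leaves no files behind.
old_wd <- setwd(tempdir())

# Write two small example files (built-in datasets, for illustration only).
write_csv(mtcars, "mtcars.csv")
write_csv(iris, "iris.csv")

# Bundle both files into a gzip-compressed tar archive.
# The archive format is inferred from the .tar.gz extension.
archive_write_files("example.tar.gz", c("mtcars.csv", "iris.csv"))

# List the archive's contents without extracting anything.
contents <- archive("example.tar.gz")
print(contents)

# Read one file straight out of the archive through a connection.
mtcars_from_archive <- read_csv(
  archive_read("example.tar.gz", "mtcars.csv"),
  show_col_types = FALSE
)

setwd(old_wd)
```

Because archive_read() returns a connection, the result can be piped directly into dplyr verbs without first unpacking the archive to disk.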

You can find my blog post for this episode at https://riffomonas.org/code_club/2022....

#archive #R #tar #Rstats

Support Riffomonas by becoming a Patreon member!
  / riffomonas  

Want more practice on the concepts covered in Code Club? You can sign up for my weekly newsletter at https://shop.riffomonas.org/youtube to get practice problems, tips, and insights.

If you're interested in taking an upcoming 3-day R workshop, be sure to check out our schedule at https://riffomonas.org/workshops/

You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: https://www.riffomonas.org/minimalR/
General data: https://www.riffomonas.org/generalR/

0:00 Introduction
3:39 Creating compressed archives from within R
8:00 Inspecting the contents of an archive with R
9:36 Reading from compressed archives with R
12:11 Reading multiple files from the same archive
14:00 Using archive to read from large archive
18:22 Estimating runtime for 122,000 files
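
The last chapters deal with reading many files from one archive and estimating how long 122,000 files would take. One simple way to get such an estimate is to time a small sample and extrapolate linearly. The sketch below is an assumption about how that could be done, not the method used in the video: the archive stations.tar.gz, the station_XX.csv file names, and the sample of 20 files are all hypothetical stand-ins, and linear scaling may not hold for a real archive.

```r
library(archive)
library(readr)

old_wd <- setwd(tempdir())

# Build a small stand-in archive of 20 tiny CSV files (hypothetical names).
file_names <- sprintf("station_%02d.csv", 1:20)
for (f in file_names) write_csv(head(mtcars), f)
archive_write_files("stations.tar.gz", file_names)

# Get the paths stored in the archive.
paths <- archive("stations.tar.gz")$path

# Read every file from the archive, one connection per file, and time it.
elapsed <- system.time(
  tables <- lapply(paths, function(p) {
    read_csv(archive_read("stations.tar.gz", p), show_col_types = FALSE)
  })
)[["elapsed"]]

# Extrapolate linearly from the sample to 122,000 files.
seconds_per_file <- elapsed / length(paths)
estimated_hours <- seconds_per_file * 122000 / 3600

setwd(old_wd)
```

The per-file time measured on a tiny archive will likely understate the cost on a large one, so an estimate like this is best treated as a lower bound.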