Along with ggplot2, the dplyr R package is the foundation of the tidyverse. In this episode of Code Club, Pat shows how to use dplyr to clean and join data generated from the #mothur software package. He will cover select, rename, rename_all, mutate, separate, pivot_longer, str_replace, str_replace_all, group_by, summarize, inner_join, anti_join, and more. In this overview, you'll get a sense of how powerful dplyr is for working with data.
Pat will use RStudio and functions from #dplyr and the rest of the tidyverse, further demonstrating the power of #R. The accompanying blog post can be found at https://www.riffomonas.org/code_club/2021-05-07-dplyr-overview.
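To give a flavor of the workflow, here is a minimal sketch of the clean, join, and summarize steps on toy data. The tibbles, column names, and values below are made up for illustration and are not the actual mothur files used in the episode; the real code is in the blog post.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical metadata: one row per sample (names and values are invented).
metadata <- tibble(
  Sample_ID = c("S1", "S2"),
  Disease_Stat = c("Case", "Control")
) %>%
  rename_all(tolower) %>%        # columns become sample_id, disease_stat
  rename(group = sample_id)      # match the key used in the count table

# Hypothetical wide OTU count table, reshaped to one row per sample/OTU pair.
otu_counts <- tibble(
  Group = c("S1", "S2"),
  Otu001 = c(30, 10),
  Otu002 = c(70, 90)
) %>%
  rename(group = Group) %>%
  pivot_longer(-group, names_to = "otu", values_to = "count")

# Hypothetical taxonomy strings with confidence scores in parentheses.
taxonomy <- tibble(
  OTU = c("Otu001", "Otu002"),
  Taxonomy = c("Bacteria(100);Firmicutes(100);",
               "Bacteria(100);Bacteroidetes(98);")
) %>%
  rename_all(tolower) %>%
  mutate(taxonomy = str_replace_all(taxonomy, "\\(\\d+\\)", ""),  # drop scores
         taxonomy = str_replace(taxonomy, ";$", "")) %>%          # drop trailing ;
  separate(taxonomy, into = c("kingdom", "phylum"), sep = ";")

# Samples in the count table that have no metadata (none in this toy data).
unmatched <- anti_join(otu_counts, metadata, by = "group")

# Join everything, compute relative abundance per sample, then sum by phylum.
rel_abund <- inner_join(metadata, otu_counts, by = "group") %>%
  inner_join(taxonomy, by = "otu") %>%
  group_by(group) %>%
  mutate(rel_abund = count / sum(count)) %>%
  ungroup() %>%
  group_by(group, disease_stat, phylum) %>%
  summarize(rel_abund = sum(rel_abund), .groups = "drop")

print(rel_abund)
```

In this sketch, separate and pivot_longer come from tidyr and the str_replace functions come from stringr, which is why all three packages are loaded (library(tidyverse) would load them together).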
Do you have a figure that you would like critiqued or would like help improving? Let me know and I'd be happy to arrange a guest appearance!
If you're interested in taking an upcoming 3-day R workshop, email me at [email protected]!
R: https://r-project.org
RStudio: https://rstudio.com
Raw data: https://github.com/riffomonas/raw_data/releases/latest
Workshops: https://www.mothur.org/wiki/workshops
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: https://www.riffomonas.org/minimalR/
General data: https://www.riffomonas.org/generalR/
0:00 Overview
6:02 Cleaning up metadata
8:26 Cleaning up OTU counts table
11:39 Cleaning up taxonomy data
17:54 Joining data frames
21:05 Calculating relative abundances
23:17 Tidying by taxonomy
24:53 Conclusion