DuckDB and duckplyr: An in-process database management system in your R script - Gabor Szarnyas
Abstract: DuckDB is an open-source analytical database management system with clients for several languages, including R. DuckDB offers the functionality of a database system, including high performance, persistence and full SQL support. At the same time, DuckDB has a small footprint and does not depend on an external server: in R, it is deployed simply by calling library(duckdb). DuckDB also integrates deeply with R libraries such as dplyr through the duckplyr package. In this talk, I explain how DuckDB achieves its high performance, demonstrate it in a live demo and showcase its R integrations.
Resources mentioned in the talk:
- DuckDB: An in-process SQL OLAP database management system https://duckdb.org/
- {duckplyr}: A drop-in replacement for dplyr, powered by DuckDB for performance https://duckplyr.tidyverse.org/
- Parquet https://parquet.apache.org/
- DuckDB Extension Template https://github.com/duckdb/extension-template
- DuckDB Labs https://duckdblabs.com/
- DuckDB Foundation https://duckdb.org/foundation/
- MotherDuck https://motherduck.com/
- DuckDB Discord channel https://discord.duckdb.org/
Presented at the 2024 R/Pharma Conference