DuckDB’s become a favourite data-handling tool of mine, simply because it does so many small things well. It can read and write a huge number of data formats; it can infer schemas automatically when you just want to move quickly; and it can interface with most languages, run like lightning on the desktop, or be embedded in a webpage. I’m a huge fan.
But I’m not nearly as knowledgeable as this week’s two fans, Simon Aubury and Ned Letcher, who’ve just written a book on the many ways you can use DuckDB and the hidden tips and tricks that help you make the most of it. So in this episode we’re taking a practical look at DuckDB: what problems it can solve at work, and how to start getting the most out of it.
0:00 Intro
4:12 What is DuckDB used for?
6:50 DuckDB for data wrangling
10:31 DuckDB’s support for Parquet
13:42 Parquet’s Predicate Pushdown
17:58 HTTP Range Requests
19:54 DuckDB as a Deploy-Anywhere Database
21:47 BYO Query Engine
29:14 DuckDB’s Extensions
32:20 Working with R
39:44 Is the Cloud Age Breaking Databases into Pieces?
41:26 DuckDB and Python
44:53 Embedding DuckDB with WASM
51:47 A Few Speculative Use-Cases
53:15 DuckDB for Edge Processing
56:32 The Reality of Co-Writing a Book
1:06:35 Outro