From Core to Custom: Unlocking new possibilities with DuckDB Extensions
2025/02/10
Understanding DuckDB Extensions: Core and Community Options
DuckDB extensions are powerful add-ons that expand the database's functionality without modifying its core codebase. Extensions fall into two main categories: core extensions maintained by the DuckDB team, and community extensions developed by users.
Core Extensions and Autoload Magic
Core extensions like spatial, httpfs, and parquet are maintained and distributed by the DuckDB team. Previously, users needed to manually install and load each extension. The autoload feature, introduced around version 0.9, streamlines this process: DuckDB now detects when a known core extension is needed and loads it transparently.
For example, when querying data from an S3 bucket, DuckDB automatically loads the httpfs extension without requiring manual intervention. This user-first philosophy removes friction from the workflow while maintaining transparency by notifying users which extensions are being loaded.
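A minimal sketch of the two workflows, as they might be run in the DuckDB shell. The S3 path is a placeholder rather than a real dataset, and the exact notice DuckDB shows when it autoloads an extension may differ by version and client.

```sql
-- Explicit workflow: install once, then load in each session.
INSTALL spatial;
LOAD spatial;

-- Inspect which extensions are installed and loaded in this session.
SELECT extension_name, installed, loaded
FROM duckdb_extensions()
WHERE extension_name IN ('spatial', 'httpfs', 'parquet');

-- Autoload workflow: no INSTALL or LOAD statement is written.
-- Reading from an s3:// path lets DuckDB fetch and load httpfs on demand.
-- 's3://my-bucket/data.parquet' is a hypothetical path used for illustration.
SELECT count(*) FROM read_parquet('s3://my-bucket/data.parquet');
```

Autoloading is governed by the autoinstall_known_extensions and autoload_known_extensions settings, so it can be switched off in environments where implicit downloads are not wanted.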
Community Extensions: Opening the Ecosystem
The community extension repository represents a significant shift in how users can contribute to DuckDB. Instead of navigating the complex process of submitting pull requests to the core repository, developers can now create standalone extensions that integrate seamlessly with DuckDB.
To install a community extension, users simply specify:
INSTALL extension_name FROM community;
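Installing only downloads the extension; it still has to be loaded into the session before its functions are available. A short sketch, assuming the ClickHouse SQL extension discussed in this episode is published under the name chsql (an assumption here; check the community extensions listing for the exact name):

```sql
-- Download the extension from the community repository (one-time step).
-- 'chsql' is the assumed published name of the ClickHouse SQL extension.
INSTALL chsql FROM community;

-- Load it into the current session.
LOAD chsql;

-- Confirm that it is loaded and which repository it was installed from.
SELECT extension_name, loaded, installed_from
FROM duckdb_extensions()
WHERE extension_name = 'chsql';
```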
Building Your Own Extensions
Creating extensions has become remarkably accessible through multiple templates:
- C++ Template: The most comprehensive option, supporting scalar functions, table functions, secrets, and more
- C Template: The newer standard that will likely become the preferred approach
- Rust Template: Currently supports table functions, which is enough for extensions that only need to expose data as a table
The process is straightforward:
- Clone an extension template
- Implement your functionality
- Submit metadata to the community repository
- Automated CI/CD handles building for all platforms
Real-World Extension Examples
Several community extensions demonstrate the ecosystem's potential:
- ClickHouse SQL: Implements over 100 ClickHouse-compatible SQL functions as macros
- pcap: Reads Wireshark packet capture files as tables
- Google Sheets: Provides comprehensive integration including authentication and special data types
- Avro: One of the most downloaded community extensions
The C API Revolution
The new C API represents a paradigm shift for extension development. Despite its name, the C API actually enables developers to create extensions in any language that supports Foreign Function Interface (FFI) - including Python, Go, Zig, and V.
This approach offers several advantages:
- Cross-version compatibility (extensions may work across DuckDB versions)
- Language flexibility without sacrificing performance
- Simplified development process
Future Directions
The extension ecosystem is rapidly evolving with several exciting developments:
- Arrow Flight Integration: Enabling high-performance data transfer between DuckDB instances
- HTTP Server Extensions: Creating REST APIs for DuckDB databases
- Cross-loading Capabilities: Extensions becoming more portable across versions
The community extension model serves as an incubator for innovative features. Popular community extensions may eventually graduate to core extensions, as seen with other successful open-source projects.
Getting Started with Extension Development
For developers interested in creating extensions:
- Start with SQL-only extensions using macros - the simplest entry point
- Browse existing community extensions for inspiration and examples
- Join the DuckDB Discord or GitHub Discussions to connect with other developers
- Use the automated build pipeline - no need to compile for every platform locally
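Before touching a template, the SQL-only route can be prototyped directly in a session, since such an extension is essentially a bundle of macros. The macro names below are made up for illustration and are not taken from any published extension.

```sql
-- Scalar macro: a ClickHouse-style helper expressed in plain DuckDB SQL.
-- 'to_string_ch' is a hypothetical name.
CREATE OR REPLACE MACRO to_string_ch(x) AS CAST(x AS VARCHAR);

-- Table macro: returns a relation instead of a single value.
-- 'first_n' is a hypothetical name.
CREATE OR REPLACE MACRO first_n(n) AS TABLE
    SELECT range AS i FROM range(n);

-- Call them like built-in functions.
SELECT to_string_ch(42) AS s;
SELECT * FROM first_n(3);
```

Once the macros behave as intended, the same definitions are what an SQL-only extension would package and make loadable.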
The extension system democratizes contribution to DuckDB, allowing developers to add functionality without the overhead of core development requirements. Whether solving specific use cases or experimenting with new ideas, extensions provide a powerful way to extend DuckDB's capabilities while maintaining its lean, efficient core.
Transcript
0:30 Hello everybody, and welcome to another episode of Quack & Code, where we chat, you know, and code, and chat about code. Today the topic is DuckDB extensions,
0:46 and we're going to understand where we're coming from with DuckDB extensions. As a reminder, we're also going to do a recap of the basics, so if you're new to DuckDB or to the extension world,
1:01 don't worry, this is also for you. Then we'll see what's possible today and what has been built by the DuckDB team,
1:10 but most of all by the community around extensions. On that, I have someone from the community who has been building maybe too many extensions over the weekends. We're going to talk about that in a sec. Lorenzo, whom I'm going to add now. Lorenzo, welcome to Quack
1:31 & Code. Hey Mehdi, thank you for having me today, it's lovely to be here. So, we were just chatting before the live stream started, and you mentioned you've been using DuckDB for a while. But first, tell us a bit about your profile, your origin, and when did you,
1:56 I would say, not discover DuckDB, but really start to think, okay, this is definitely something interesting to work on? Absolutely. So, my name is Lorenzo and I'm an Italian in Amsterdam. I've been living up here for close to 20
2:15 years, but I don't speak Dutch, so I'm one of those expats who stay within the bubble. My company here is specialized in observability, so that's our real job. What we do on a daily basis is turn network traffic and telephone traffic into data. It doesn't matter where
2:39 it comes from, it converges into a database. Historically I come from, let's say, the ClickHouse pond, so that's where I've spent the majority of my time. Then at some point DuckDB came into my life and it changed everything. Slowly, little by little, at the beginning seeing it like, hey, here's an alternative that I don't
2:59 want to look into, and then slowly being sucked into the vortex, I became a quacker, sure enough. Today much of my day is spent migrating my old pipelines and rebuilding our products with DuckDB. That's really the gist of everything. As part of the journey, of course, I ended up building a
3:20 bunch of side components and extensions and so on and so forth. But that's the gist: I'm building to build. Nice. And can you tell us a bit about the context in which you are using DuckDB? Because it's such a versatile tool. There are people who are actually interested in DuckDB in the
3:42 browser and the WebAssembly world, others more in terms of pure ETL pipelines. So what's your main usage today? Right, so we're essentially using DuckDB to bring telecoms and their analytics into the current millennium. If you know anything about telecoms, you know they're pretty conservative, so they're always a few steps behind,
4:07 and when they join, they're never happy with generic solutions. So part of our mission is exactly using DuckDB to bridge this gap. We're using it to deliver, let's say, modern data and analytics pipelines for good old telecoms. Yeah, it's interesting you mention "if you know anything", because I started my career actually in telecom. So
4:30 I saw your profile, yeah. It's been really a while, but yeah, I was actually working on a logging system for a trading room, so it was a particularly critical environment. And I totally hear you: when I got my hands out of this world, yeah, it's tools and technology that are a bit
4:55 like in parallel, I feel. Not behind, but on specific usages and use cases. Where everybody now, as an end consumer, uses WhatsApp and Google Meet and so on to communicate, there is an entire world over there, you know, still with hardware, right? That's so true, that is so true. And the interconnection
5:17 between those two worlds, the old one and the new one, is always complex, so that's where we kick in. Nice. So first, maybe a
5:30 recap for people: what is a DuckDB extension? And actually, I
5:37 didn't plan this, but that's why it's a live demo. I'm going to quickly share my screen for a sec and just,
5:53 let's go, just launching a DuckDB shell.
5:57 So, for people who don't
6:02 know, there are two kinds of extensions, right? How do you call the two parts: one is community and the other is core, I guess? Core extensions, yeah. So the core extensions are listed, basically, here: if you call duckdb_extensions(), it's a table function. So I
6:27 just launched a DuckDB session in the CLI, right, and these are the core extensions, which are basically maintained and supported by the DuckDB team. And I think what's interesting, we were just talking about that, is that before, I feel, the average user starting to use DuckDB
6:50 couldn't get around extensions. They needed to understand and know the principle, right? Because to install an extension, let's say I do spatial here, I'm going to install the extension, so it downloads the extension, and now I can load it into my DuckDB session. And if I run the query again, you see that spatial is now set to true, right?
7:16 And that was basically the mechanism before for everything. Like, if you needed to read a file over, you know, S3, you needed
7:25 it. I think I have, let me see... all right, so I have for example this one, just a public data set. For that, before, to make it work you had to install the httpfs extension, but here it is actually loaded, right? The magic of
7:54 autoload. Yeah. So what do you think about this autoload mechanism? Do you have any opinion, whether you like it or don't like it? I love it, I love it. I think it speaks of the DuckDB philosophy, which always places the user first, right? You see it in a lot of aspects, and I
8:12 think this is one of them, and it speaks loud: innovation doesn't get in the way. So if an extension has dependencies in order to load, then you have the option to have DuckDB auto-resolve and autoload those extensions in the background for you, and you don't have to waste a second even figuring it out. But
8:30 it's transparent, so as you're showing, it tells you, right? Nothing is a mystery. It just does everything it can not to get in the way, so the focus is still running the queries and being fast. And it's beautiful how they managed to squeeze in such a complex thing as loadable extensions. I mean,
8:49 now we're talking about it as if it's just something, yeah, but it's one of the most complex things ever to realize and implement, and they did it for us. So I think it speaks super loud, and I love features such as autoload and auto-find because they just effectively
9:04 save human time. No, I think it's great. For me, the challenge as an educator is just that now I kind of need to explain the thing and then say, but most of the time you actually don't care, except if you're using community extensions and so on. But it's true,
9:22 that's the magic, that's the simplicity of DuckDB. Because here we mentioned the httpfs extension, but actually there is also the parquet extension, which has been loaded, right? So you can see it here, and if I go up... oh yeah, it
9:39 was already there, probably because my session was open. But yes, the point is the autoload feature came, I
9:50 think, in 0.9, right? So a bit more than a year ago, or somewhere between 0.9 and 1.1,
9:58 1.0, and it just enables an even more
10:04 seamless workflow. But that works for core extensions. And so now,
10:13 that's kind of the topic today. So now we've seen the basics around extensions. I'm just coming back to my notes. So what have you actually been working on? Because there is the possibility to create your own extension, and you started to create a couple of them. Why so
10:39 many? Well, before that I tried to make a PR. My initial goal, which is still sitting there, was to add write support for the httpfs
10:53 extension, core extension let's call it, which now lives in its own repository, so it has since been moved out of the main codebase. I just wanted to add write support so that you could take a database file and actually write it somewhere, which is quite niche, but it's part of
11:10 what I wanted to do. So I made the PR, and as part of that I had to learn how to get into the codebase. DuckDB is very elegant and it's made by really good people, so as much as it's easy to read the codebase, it's quite intimidating to go and put your
11:26 hands there, because you're dealing with some of the best. If you're not at that level, that's already the first intimidation. Second, when you do something selfish like I was doing, you've got to meet the quorum, right? DuckDB is not bloated, that's the beauty of it. I mean, 30, what is it, 33
11:45 megabytes? Amazing. Every day I wake up and I still wonder how they managed to squeeze all of that stuff into it. Just a side note: I haven't looked, but we could look at the ratio of the
12:03 top contributors. I know that on projects that start to be large, like DuckDB, the main contributors actually remove more code than they add. Those are the great ones, those are the good ones. I think it's not the case yet here; I think it happens when you have an explosion of
12:26 contributors. On Apache Spark, for example, I know that Matei has a ratio which is negative, which totally makes sense. It's exactly your point: if you have code that's going into the core, you need to open the scope, not just your own scope,
12:48 right? And the requirements are really hard. Yeah, so that was my lesson, right? After doing the PR, I figured that it's going to be really, really difficult to implement my hacks and ideas as PRs into DuckDB. They don't deserve to be there. But luckily,
13:10 you turn the page and you discover that they already thought of that. That's how I got into even looking into extensions, and right when I was doing that, community extensions came out. So I was there the first day when it appeared, and I jumped right on it. When it appeared
13:28 in front of me I said, okay, I'm going to spend the next couple of months just diving as deep as I can into it, and that's really how I started. The first job was just taking the stuff that I was trying to do in the PR and moving it into an extension. Then I got carried away, and
13:42 today I have like a dozen going on. But the first one was, I think, chsql. So actually, thank you MotherDuck: it was inspired by Alex, who made the SQL-only extension template, and that's really my first step into the topic. So I made an SQL-only extension, extending DuckDB with macros, essentially,
14:05 but instead of loading them the manual way, CREATE MACRO and whatever, you basically stuff them into an extension. I think this is the entry point for anyone. If you want to discover how to make a DuckDB extension, take a macro and do an SQL-only extension as the first step, because it's
14:22 just going to show you how the mechanism works without any of the complexity of the code. Essentially you write SQL and then just make it loadable. So that was the first one, chsql. Then... Can we have a quick look at this one? Sure. So this is your organization, Quackscience, right? Just,
14:46 this is the excuse, right? So I love
14:50 open source and I love community work. I don't like working by myself; I love to have other crazy people around that have crazy ideas. So Quackscience is basically a container for crazy people like me who want to invest, or waste, time together, the
15:08 hard way. That's a good way to summarize how open source can be sometimes, you know. Oh yeah, yeah. For every good idea there are at least 20 or 30 that get thrown out the window, but that's how I love experiencing open source. So yeah, this is a container, so you'll find here all of
15:27 our extensions, but also a few that have been contributed and moved over by other parties. So, the ClickHouse one: if you just write "clickhouse" there should be two of them that appear. Is it this one? No,
15:43 just type "clickhouse". Yeah, there you go. So yes, that's the first one, ClickHouse SQL. This is essentially a macro extension that implements a little over 100 ClickHouse macro
15:59 functions. It basically reimplements some of the SQL syntax that you find in ClickHouse in DuckDB, and this is what I used to get started, because 100% of my pipelines were made for ClickHouse at the time and I just wanted to port them over without rewriting all of it and without forcing
16:19 all of my guys to learn all the new stuff. And this is how it started. Yeah. So first, just as a
16:28 reminder: if you want to install
16:33 and load a community extension, which is the case here, you have to specify the keywords FROM community when it is
16:43 published to, you know, the
16:48 community repository. And there is the website, I'm not sharing it, let me share it. There's a link, I think, in this repository: if you go up, the link of the repo should take you right there. But yeah, this is another miracle by the DuckDB Labs team. Not only did they give us community extensions, but they
17:09 gave us a way to publish them and pipelines to build them. So literally the developer doesn't really have to do that much: you just write the code and then you submit it to the community repository, where there's a fantastic action chain builder that essentially tries to build the extension for all platforms, architectures, or whatever. So
17:28 it also helps you find out if you forgot something, if there's an incompatibility or whatever. You don't have to do it locally. For instance, I only develop for Linux, and then when I go into the build pipeline I find out if the Mac version, the Windows version, whatever, have their own
17:44 little needs. But it reduces the job to 1% of what you would normally have to do if you had to try it locally. So, literally a miracle of technology; they couldn't make it any easier. Yeah. And by the way, this was a thing I'd been requesting for a while. I mean, it's been there for a
18:04 while, but when I was starting to use DuckDB 0.7, there were already extensions. And maybe, just to come back to the timeline: we talked about how core extensions became autoloaded, and then they created this community extension repository where you can submit and there is, you know, CI. So the way
18:30 it works is that you have... maybe I can show the community extension repository, right? Yep. Yeah, let's
18:43 actually show this example. So once you have your extension ready, this is basically what you provide, right? Correct, correct. Just metadata: a pointer to the revision and the remote Git repository where you have your code, documentation, readme. It's all in one, and essentially here you specify the version that you're building for and the platforms that you might want to exclude.
19:07 It's all in one, so one single file does the entire job. Yeah. And so, for
19:14 example, this extension, um,
19:21 template. So this is a Rust one. DuckDB
19:26 also provides an extension template, basically, to kind of get you there. And
19:34 for that, the only thing you need to do, if we go, I think, to
19:44 the code: here you have a simple example in this
19:52 extension template to show you how you can start with an extension. It's basically just a quack function here, right? Right. And yeah, you use the template, clone this one, and you are ready to go and build on your own. Then when you're ready, you go to the community extension repository and you
20:15 submit the PR
20:22 for the metadata. That's roughly right? Yeah, that's correct. Essentially there are actually three active templates: we have the C++ extension template, the new C extension template, which is the one that I guess everything is going to be built on top of in the future, and then there's even a Rust one. So one can even choose
20:44 whatever they're more familiar with or less intimidated by, and they all offer the quack function. Actually, if you look on GitHub there's also a quack function in Go, there's a quack function in Zig, and a few more that I don't remember now, so people have been quite active. But let's say these are the official ones that you can actually
21:03 publish to the community repository. The only difference is basically the build pipeline, and I think over the next months we're going to see a lot of those builds. So I expect that the range of languages that are going to be allowed into the community repository is going to grow exponentially. Yeah, so that's an interesting thing also
21:23 to kind of look back on: it
21:28 used to be that building an extension was basically mostly in C++, right? Up until now, yeah,
21:38 yeah, built in C++. There is the Rust template, but there are, you know, a few rough edges. Can you tell us a bit about what's... Absolutely. I took the risk there, so I was talking to Sam and so on. So the Rust template, and shout out to
21:57 Sam, by the way. Absolutely, and Carlo.
22:01 Yeah, both of those guys are the reason why all of this exists and why we can work on it, so you guys are amazing. The Rust extension template is a little bit behind; I think it's catching up pretty quickly, but it essentially only offers a table function today. The C++ one has a
22:21 bunch of examples: you can do a scalar function, a table function, create secrets. You can literally do everything you want; all you have to do is look into the original code. The Rust one is not quite there yet, but it's good for some stuff. So I have two or three extensions in Rust where I
22:40 only needed a table function. One is for reading pcap files. Another one implements continuous profiling, so it's a Pyroscope extension for DuckDB that you can just load up, and it starts tracing whatever you do and sending it to either a local or remote pprof file. And one that's done for, well, a native
23:02 ClickHouse client. So basically a table function has a native binary ClickHouse client which we can use to move data quickly from that platform. For those I didn't need anything else, so it's actually usable if the scope of the extension you have in mind is essentially a table function, which is a lot. I mean, it sounds like it's nothing,
23:20 but I would say 90% of things are, you know, a table function. Yeah. So here is just an example of your Rust extension, which is for reading pcap. What kind of file is pcap, actually? Oh yeah, it's a network
23:42 packet capture. Wireshark? Yeah, Wireshark. Yeah, I know, of course. So with this you can essentially read a Wireshark file and display captured packets as if they were in a table, and of course you can take stuff out of them. And it is still heavily used in the telecom world, right? Oh yeah, absolutely. In
24:04 the world in general; telecom is probably not even 10% of its usage. Network security, everything. So yeah, that's one of the things that we needed, and I think it took me less than a weekend to make. And mind you, I don't know any Rust, so this is down to just how simple it is. You
24:25 basically find the right crate, you make it work, you bang your head against the screen for a few hours, and there you have it. So, totally not intimidating, and this is a message that I want to give out: don't be intimidated, this is accessible to anyone, because the scaffolding is so good that
24:44 you can afford to spend the entire time just learning the one thing that you want to do with it, without being forced into learning or understanding DuckDB all at once. You will understand it progressively as you try things. So literally anybody with an AI sidekick and a good idea can make an extension. Yeah, and so
25:08 yeah, that's a fair point. And there was a question, actually, during DuckCon: you know, Python is not my jam, how can we contribute more easily to extensions? And that's a really fair point. It's true that DuckDB is built in C++, and in data especially, a lot of
25:31 people know their way around Python. Yeah, sure, it's not efficient, but I think when you do simple things, that's where the power is. Now you have AI to help you do simple things. But what's the step further? We talked about the Rust template, which kind of
25:52 simplifies things too: you don't need C++ or C code, no DuckDB build required. But now there is the C API. Can you tell us about that? Well, I don't know if I'm the best person, but I can tell you from my perspective. I think this is
26:16 where it stops being experimental. Until now, the C++ one was kind of the native way, but not good for other languages; the Rust one was kind of a discovery direction, I think; and this one, I think, is what we're going to see everything being built on top of. So this is
26:35 where we actually allow any other language that can do FFI to build an extension. Already you can see the Zig one, I think, makes heavy use of this new template and this new library. So this is where I think it's going. For us, maybe bad news: we're going to have to rewrite
26:57 a lot of our extensions and port them to the new way, but for users it's all going to be for the best. And I believe this is potentially also going to mean that we can have cross-loading. Up until now, builds of extensions have been locked to the version they're built for, so an extension for
27:16 1.2.0 is not going to work on 1.1.4 and vice versa. I think from now on, with 1.2.0 and this template, this potentially becomes even more portable, so more like the Postgres extensions or whatever. So it's all for the best, and I think it shows the commitment of DuckDB Labs to delivering
27:38 this as a long-term solution. Yeah. So to recap, I think what we're going to see, basically, is wrappers around the C API with other bindings for people to create extensions, right? Simply because there are a lot of existing libraries that integrate well with the C API, no matter the language
28:02 you're using, right? Exactly, that's where we go. And mind you, many people think, "but I don't want to do it in C". Actually, the C one means that you don't have to do it in C. It's funny how it's reversed, right? Yeah, that's a really good callout.
28:23 Right, so for the listeners: this doesn't mean that you have to work in C at all. It just means that you can use whatever you want, because the C binding makes it possible. So again,
28:36 one of my plans is to make an extension in Vlang, for instance, which is a nice
28:42 programming language that's made here in Amsterdam as well. Super rare, super weird, but when you can make it in V, it means that the integration is fantastic, so that's going to be the testing bridge for me. Nice, testing on a real language. Yeah, cool. So that is great. This is still really
29:04 recent, right? Absolutely, fresh ink. So we're all new. By the way, I think this is also something that's inviting: everybody's new, this is new stuff. It's not one of those things where you walk in and feel like you're years behind. At most you're a few weeks behind the
29:21 next guy. So everybody's an expert and everybody's a newbie at the same time, which makes it easier. It's not intimidating like if you go and try to do the same in some other projects, where you're probably going to give up before you even see the results. Here it's the
29:38 opposite: first you get to play with it and get some results, and then you can make it better and better. And I have to say, you're part of it and part of the reason, but our community with DuckDB is super strong. People come in with the right attitude. We're all builders, we're all
29:55 in love with using it anyway, so that's a really good vibe. It doesn't feel hard to find another crazy guy who's going to help you figure out that little thing, especially, for instance, on Discord. I think both the MotherDuck one and the DuckDB Labs one are so
30:13 full of amazing people who are always ready to jump in with a good idea, a word, an example, a link, a repository. It's endless. So yeah. No,
30:23 but I think how I see it, at least from the data analytics point of view, is that we had a lot of
30:33 closed-source databases, I would say, where they have specific functions and so on, but it's not really open, where you can easily contribute. And as you mentioned, it's hard to contribute to the DuckDB core. From a responsibility point of view, we talked about that:
30:54 once the PR is merged into a repository like DuckDB, it's not your problem anymore, it's theirs. That's why the level of approval can be painful. But at the same time you don't want to prevent, as you mentioned, hackers who have small ideas that are
31:15 sometimes valuable. Another one I can give a shout-out to is the Google Sheets extension. Oh, Archie! Yeah, by Archie. Let me tell you this, everybody has to know: if you want to build a community extension and you want a full example, this is it. This is the one extension that you load and it's got
31:37 everything: secrets, connections, external links, authentication, special data types, special handlers, scalar and table functions. It's everything. If you take this as a reference, you can build any extension that you can think of. So big ups to Archie for building something like that, and I think Alex participates as well, so
32:01you know quite an amazing cooperation there no but I think yeah that's that's the point is that it it is like sometimes you I remember when you started uh the identification was not there or not as it is like now with the secrets um and it's just you you you can start really Scrappy but that's so you
32:22 know, evolve over time, which you cannot afford if you push something to the core. Right, right, right. Absolutely. And so there is a separation of ownership, which is great from the DuckDB Foundation, but for the community, the community extension part is that they can contribute to DuckDB by, you
32:43 know, creating a separate extension. And we can see also the DuckDB core starting to be split, with some stuff moved to extensions, which makes it, even if it's a core extension, much easier to contribute to than the core itself. And I would add something: all of this without preventing you from plugging into the cloud, so I can make my own
33:07 little extension here at home, load it up, load the MotherDuck extension, and I'm playing with real data. So yeah, even if you keep it local, I mean, now we're talking community extensions, but even if you don't publish it, let's say that you make an extension for yourself, for your company, for your own purpose, it still plugs into the full ecosystem, which
33:26 is quite amazing, right? So if you have your production datasets on MotherDuck and you develop your little extension, magically you're already there. All you have to do is load basically two extensions in your own ecosystem and off you go, which is something that I don't think exists anywhere else. Yeah, yeah. So I think this is
33:46 really interesting for the future of DuckDB and how the community is contributing. And by the way, I couldn't find the page again on the documentation website. Sorry, Gabor, I don't want to point at you. I think it's one... I've seen this page, I'm not sure if it's,
34:12 no, it's not in the code of conduct. For sure it's probably somewhere on the internet, but there is a page where they mention their guidelines to contribute to DuckDB, and they say, first, can you do it within an extension? Right, right. Yeah, I think it's on the GitHub repository, in
34:33 the DuckDB one, yeah, yeah. But yeah, you're absolutely right. Yeah. And I think again it shows how smart they are, because the DuckDB Labs team is as smart as it is realistic, I think. So they know that quality has a cost and you cannot just do everything all the time for everyone. So I think this is
34:53 also a way to show that they're not going to slow the user down, even when their own capacity is limited, with this approach. Yeah, yeah. So no, I think that's an interesting model. And just to reflect back on the data world, I think, you know, we talked about
35:13 proprietary services, but we had open-source services, like I think Kafka works pretty well, where you had Kafka connectors and a lot of companies basically created their own connector. And I think also the last thing I want to mention is that your community extension could be a core extension one day, you
35:34 know, who knows. I think they serve also as an incubator where basically, if something gets a lot of popularity, suddenly they move the ownership. I'm not saying this is going to happen, but we've seen it happening. I remember with Kafka, actually, the BigQuery extension was, you
35:55 know, a company extension, like people building it for internal usage, open-sourcing it, and then, oh, by the way, actually a lot of people are requesting this. I think sometimes even making something imperfect helps in the same way, right? So making a good extension that proves the point that the community wants
36:18 something is, even in a worst-case scenario, going to push somebody else to make a better version of it, or somebody in the core team to find a better solution, right? So it's all about proving the point of having a need out there, and then somebody smart is going to make the best version, right?
36:36 Yeah. And I think, what do you think, I think there are the stats, I don't remember, what is the most downloaded community extension? I think the Google Sheets one is probably the first one. I'm not sure. There's a community site that's tracking all of
37:01 the statistics, extension by extension. I forgot the name of it, but yeah, the Google Sheets one is one of the most popular for sure. But I think the Avro one right now is the real rocket. So if you look at those statistics, that one I think is at the very top. It just came out and I think
37:21 there were a lot of people waiting for it, so probably at the very top is that one. Yeah, and yes, wow, you were right. So look,
37:34 there you go, and that's not the right screen, so let me share. So actually DuckDB
37:44 has its download stats available over here. Oh, you do need the proper way. Oh yeah. So you can see the last week's downloads, that's the community extensions, and so we have Avro exactly crushing it, oh yeah, ahead of the
38:07 second. And then, what is block talk, is it you? I'm sorry, what is it? Do you know what extension block dock is? I don't know, I don't know this one. Okay, now that's interesting. So yeah, it's definitely interesting to see the growth of each extension. They can
38:39 serve, as we talked, as an incubator, or give ideas to other people to do it better. I think with the BigQuery one there were two people, actually, that started the extension, I remember, and just, you know, one got a bit better approach, and so that works. It's like, exactly, oh yeah, it's exactly
38:59 as you said at the beginning: sometimes you do great work together, sometimes you waste a bit of your time, but at the end, if everybody is getting somewhere, that's the most important, right? I think this is next for the Postgres DuckDB extensions as well. I think right now there's a huge
39:17 variety of approaches to merging DuckDB and Postgres, so I'm also waiting on that scenario to see what merges, what wins. It's very exciting. Yeah.
39:28 So what do you think is the extension which is missing? Could you give some ideas to people? Where do you usually get your ideas for extensions? I mean, aside from "I need it", let's say people are just interested to play around. At least a couple come from
39:52 somebody asking a question on Discord or in the community, saying "ah, I wish I could", and then you're like, oh right, why couldn't we? Like for instance the webmacro extension was one such thing. So there was a bunch of people exchanging macros, and then you have to paste everything and it's hard to manage,
40:09 so I said, oh, why don't we make a function that just loads it from a gist and makes it super easy to do? Or the cron job extension: there was no cron job extension, only one that could calculate things, and somebody was discussing it on Discord and it was made over a couple of hours. So yeah, for me, I
40:27 mean, again, it's all about fun, filling a gap, helping someone, and in general building blocks. So I like the Lego approach that this creates, where people can come in, they can either find the extension they want or they can use it as a template to make their own. So as long as
40:49 creativity can run wild, then it's working. Another place that people can also look if they want to get inspired is the GitHub discussions. There are a lot of things happening. So you mentioned the Discord, right? The Discord is kind of like, I would
41:09 say, daily things that could happen, but if you're not watching everything, every thread, GitHub is actually also a great place to have a more structured place. So true, so true. And here, as you can see, one of the top requests is XML. I have seen at least three XML
41:33 extensions on GitHub coming, so that's three separate developers making one right now, today, and I can't wait to see which one makes it to the community repo first. Yeah, that's the thing, like, who is first. But yeah, so that also should give you a couple of ideas, because there is a lot of discussion about specific
41:56 feature requests, but sometimes some feature can just be part of an extension. And on top of that, the DuckDB team is going here and there sometimes to give their thoughts or their view. I think, it's not like a topic of external tables,
42:20 but here there was discussion about how to handle catalogs and so on, and you see that's a really long thread. So I think there is a lot of nice information. You have to sort it out, but here, just for information, even for any other GitHub repository, if there is a discussion on
42:40 GitHub which is active, you just do "Top: All" and that's going to rank basically on the top-voted discussions, and you can do it, you see, for example, in the past month we have IN filter pushdown here, Parquet evolution, yeah, other stuff. What are you the most excited for? So we talked
43:03 about the C API for extensions
43:09 that enables other programming languages. What else, what is the next thing that you're most interested in within the extension ecosystem? Flight and the airport extension. So, I don't know, for sure you saw the presentation. Yeah, I was there. Yeah. So I'm trying to be one of the first to
43:35 add the other side of it. So the airport extension gives you a client, and what we're trying to do super, super fast is to add the server part. The HTTP server extension already adds basically an HTTP API to a DuckDB instance so that you
43:56 can query it from the outside, and now it has to be done with Flight. So that's the next challenge, to make the Flight server part so that you can use Flight to talk and insert and read data between instances with concurrency, with parallelism, with all of the nice things that everybody wants. And then of
44:17 course it's going to be Parquet files on the other side. So that's where my attention is right now, adding Flight everywhere where it makes sense. Yeah, no, but I think it's an interesting way. So yes, just the TL;DR is that basically you have a web server as kind of an intermediary,
44:39 right? Yeah, essentially it's a gRPC service, really, on top of HTTP/2 ideally,
44:52 that we just use basically to interact with the API using Arrow as the native format. So it's the most optimal, let's say, from both being cheap on compute and memory; we don't have to modify, let's say, the format as we move it from memory to disk and back. It has all of the advantages that we
45:11 know, and especially if you're working with Parquet files as the target for storage, it's like magic. So it takes a lot of the overhead away, and I think this is going to be the most popular topic this year, Flight extensions on DuckDB. That's my prediction. So cool, let's see, let's talk
45:30 again in a year then, to double-check this. I think, yeah, for me, what I found interesting is that I'm wondering if we could create an extension in the same line as the macro one that you mentioned, but really a bit more generic, so that anyone that wants to add basically a SQL function, you
45:53 know, has to do a PR. I think it's a bit the... Let's do it on a live stream. Yeah, no, we could do that, but I was wondering basically, because I think that's the beauty of an open-source database, is that you could have recurrent functions just public and people optimizing them. Like I'm
46:16 just... and there are specific functions also, or I would say SQL
46:23 structures that you're going to use, for example, for deduplication, right? There are multiple ways to do it, but if you could just have a simple function that you call and you give the key you want to deduplicate on, these kinds of things, and behind it is just pure SQL. So that's... I think you're
46:44 right, I mean, all it would take is a community repository where to do these things. I have kind of a mock of this thing using GitHub issues, where you can basically open an issue with a function and then load it, but it would take something better. But the idea is all there, and I
47:04 would love to hear more and to make it happen together if there... Yeah, so we can take a step forward. But for the audience, so basically you understand now, hopefully, how powerful extensions in DuckDB are. It is the first step for you if you want to contribute to
47:25 DuckDB. It's easy because basically you're on your island, you're on your playground, no one is going to disturb you or block you. There are people to support you, either on the DuckDB repository, on the discussions, or on the Discord, and you have tons of community extensions that we show on the community
47:48 extension website, so you can actually also go there, let me just highlight this for a moment. So if you go there, you're going to have examples of different community extensions already built. But again, the future is looking promising with the C API, so this is just going to make it easier for you, and without any excuse, to
48:14 contribute back to the things. Lorenzo, thank you again for joining. Is there anything you wanted to add or plug to the audience? No, just thank you so much, and I just want to make myself available, so if you need a starting point or anything, I'm
48:33 open, so you can find me on all the various platforms, on GitHub or on the Discord of both MotherDuck and DuckDB. So for anybody that's interested, feel free to reach out. And Mehdi, thank you so much for this great opportunity to talk about this. No, what fun. So I'll put your link in there and it will be also
48:52 in the YouTube description, your contact. Have a great end of day and I'll see you in the next one. Thank you so much.
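As a recap of the workflow discussed in this conversation, here is a minimal SQL sketch of installing and loading a few of the community extensions mentioned above. The extension names (`gsheets`, `webmacro`, `cronjob`) are assumed to match their current listings in the community repository, and the statements need network access on first install, so treat this as an illustration rather than a verified script.

```sql
-- Install and load community extensions mentioned in the conversation.
-- The names below are assumptions based on the community repository listings.
INSTALL gsheets FROM community;
LOAD gsheets;

INSTALL webmacro FROM community;
LOAD webmacro;

INSTALL cronjob FROM community;
LOAD cronjob;

-- Check which extensions are installed and loaded in this session.
SELECT extension_name, installed, loaded
FROM duckdb_extensions()
WHERE installed;
```

A locally built, unpublished extension can be loaded in a similar way by passing its file path to `LOAD`, which typically requires starting DuckDB with unsigned extensions allowed.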