A new paradigm for data visualization with just SQL + Markdown
2024/09/24
Traditional GUI-based Business Intelligence (BI) tools are excellent for creating initial dashboards, but their drag-and-drop workflows often create significant long-term challenges. As data assets grow, maintenance becomes a bottleneck, version control is nearly impossible, and customization is limited. By one practitioner's estimate in the conversation below, as much as 90% of a BI professional's time can go to maintenance rather than new analysis. A modern approach, known as "BI as code," addresses these challenges by applying software engineering principles to analytics. This article explores how to use Evidence.dev, an open-source framework, and DuckDB to build maintainable, high-performance data apps using just SQL and Markdown. This combination frees data teams from tedious maintenance and unlocks deeper customization, transforming brittle dashboards into robust data applications.
Why Traditional BI Workflows Break Down at Scale
The core issue with many BI tools is that the final output, a dashboard, is a software asset that is not treated like one. The creation process involves manual clicks and configurations within a proprietary user interface, which is not only tedious but also error-prone. When a metric definition changes or a new data source is added, an analyst must manually update every affected report. A code-based workflow fundamentally changes this dynamic. By defining reports and visualizations in code, teams can track every change in Git, providing a complete history and the ability to revert to previous versions. New logic can be submitted through pull requests for peer review, and teams can integrate automated testing into a CI/CD pipeline to ensure updates do not break existing reports. This approach makes it simple to find and replace logic across an entire project, drastically reducing maintenance time and bringing the rigor of production software to the world of analytics.
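As a rough illustration of the find-and-replace point, here is a minimal Python sketch. It assumes reports live as Markdown files under a single pages directory; the function name `replace_in_pages` and the sample metric expression are hypothetical and not part of any tool's API.

```python
from pathlib import Path


def replace_in_pages(pages_dir: Path, old: str, new: str) -> list[Path]:
    """Replace every occurrence of `old` with `new` in Markdown files under pages_dir.

    Returns the files that changed, in sorted order, so the edit can be
    reviewed as a single pull request.
    """
    changed = []
    for page in sorted(pages_dir.rglob("*.md")):
        text = page.read_text(encoding="utf-8")
        if old in text:
            page.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(page)
    return changed
```

A call such as `replace_in_pages(Path("pages"), "sum(amount)", "sum(amount - refunds)")` would update a metric definition everywhere it appears, and the resulting diff can go through the same review and CI checks as any other code change.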
Building Data Apps with the Tools You Already Know: SQL and Markdown
Evidence is an open-source framework designed to abstract away the complexities of web development, allowing data practitioners to focus on their core competencies. Instead of requiring knowledge of JavaScript, HTML, and CSS, Evidence enables users to build sophisticated data apps using only SQL queries and Markdown files. The workflow is straightforward. An analyst first defines their data sources, such as databases like DuckDB, Postgres, or modern cloud data warehouses like MotherDuck and Snowflake. Next, they write SQL queries to fetch and shape the data, which can be stored in dedicated .sql files or embedded directly within pages. Finally, they compose pages using Markdown syntax, embedding pre-built Evidence components like charts and tables that reference the SQL query results. This approach dramatically lowers the barrier to entry for creating custom, narrative-driven reports. This simple, code-based workflow also enables a highly efficient local development experience, allowing analysts to iterate on their work at speed.
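To make the page structure concrete, the following Python sketch builds a small example page and checks that every component points at a query the page defines. It assumes a query is a fenced block opened with `sql` followed by a query name, and that components refer to the result as `data={query_name}`; the sample page, the query names, and the helper `undefined_references` are hypothetical. Queries kept in separate .sql files are outside what this check sees.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built here so this example nests cleanly

# Assumed page syntax: a query is a fenced block opened with "sql <query_name>",
# and a component refers to its result as data={query_name}.
QUERY_RE = re.compile(r"^" + FENCE + r"sql[ \t]+(\w+)[ \t]*$", re.MULTILINE)
REFERENCE_RE = re.compile(r"data=\{(\w+)\}")


def undefined_references(page_text: str) -> set[str]:
    """Return query names that components reference but the page never defines."""
    defined = set(QUERY_RE.findall(page_text))
    referenced = set(REFERENCE_RE.findall(page_text))
    return referenced - defined


# A hypothetical page: one inline query, one chart that uses it,
# and one table that points at a query missing from the page.
PAGE = "\n".join([
    "# Monthly sales",
    "",
    FENCE + "sql monthly_sales",
    "select order_month, sum(amount) as revenue",
    "from orders",
    "group by order_month",
    FENCE,
    "",
    "<LineChart data={monthly_sales} x=order_month y=revenue />",
    "<DataTable data={top_customers} />",
])

print(undefined_references(PAGE))  # {'top_customers'}
```

Because a page is plain text, a check like this can run in CI before a build, which is one way the code-based workflow catches broken reports early.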
Achieving Flow State: Fast, Local Development for Analytics
One of the most powerful aspects of the "BI as code" workflow is the local development experience. Traditional BI often requires a slow, server-dependent feedback loop. With Evidence, developers run a local server that provides instant updates as they write code. This tight feedback loop is a core feature of the development process. For instance, when a developer modifies a Markdown file to change the number format on a value component, the web page refreshes instantly to reflect the change. This immediate responsiveness allows developers to stay in a state of flow, making development faster, more enjoyable, and more productive. The project structure is organized for clarity, with a sources directory for data source configurations and a pages directory for the Markdown files that define the application's content. This clean separation of concerns makes projects easy to navigate and maintain.
Powering High-Performance Analytics with DuckDB
DuckDB plays two critical and distinct roles within the Evidence ecosystem, enabling both high-performance data processing and rich client-side interactivity.
First, Evidence supports DuckDB as a primary server-side data source. Users can connect their projects to local .duckdb database files or read directly from collections of Parquet and CSV files. During the build process, Evidence runs queries against this DuckDB source to fetch the data needed for the static site.
Second, and more innovatively, Evidence ships a DuckDB engine to the browser using WebAssembly (WASM). This in-browser OLAP engine unlocks powerful capabilities without requiring a round trip to a server. When a user interacts with a filter or a dropdown on a page, the query is executed by DuckDB WASM directly on the client's machine, providing lightning-fast responses. This architecture also makes it possible to perform joins and aggregations across data that originated from completely different sources, for instance, combining data from a Postgres database and a BigQuery table on the fly.
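The idea of joining extracts from different sources inside one local engine can be sketched in a few lines of Python. This is an illustration only: it uses `sqlite3` from the standard library as a stand-in for the in-browser DuckDB engine, and the table names, rows, and the function `revenue_vs_target` are hypothetical.

```python
import sqlite3

# Stand-in engine: sqlite3 from the standard library, used only so the sketch
# runs anywhere. In Evidence this role is played by DuckDB running in the browser.
engine = sqlite3.connect(":memory:")

# Hypothetical build-time extracts from two unrelated sources.
orders_from_postgres = [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 50.0)]
targets_from_bigquery = [("EU", 150.0), ("US", 100.0)]

engine.execute("create table orders (id integer, region text, amount real)")
engine.execute("create table targets (region text, target real)")
engine.executemany("insert into orders values (?, ?, ?)", orders_from_postgres)
engine.executemany("insert into targets values (?, ?)", targets_from_bigquery)


def revenue_vs_target(region: str):
    """Join both extracts locally, filtered by the value a dropdown would supply."""
    return engine.execute(
        """
        select o.region, sum(o.amount) as revenue, t.target
        from orders o
        join targets t on t.region = o.region
        where o.region = ?
        group by o.region, t.target
        """,
        (region,),
    ).fetchone()


print(revenue_vs_target("EU"))  # ('EU', 170.0, 150.0)
```

Once both extracts sit in the same local engine, changing the filter value only re-runs a local query, which is why no round trip to either original source is needed.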
Designing for Insight: Customization and Data Storytelling
A code-based approach unlocks a level of customization and narrative depth that is difficult to achieve in GUI-based tools. Because reports are just text files, it becomes simple to inline context, definitions, and commentary directly alongside data visualizations, solving the common problem of documentation living in a separate location. This flexibility allows for a wide range of applications. For example, a "North Star Report" can be enhanced by adding target lines, colored zones, and detailed explanations to charts, creating a clear narrative for stakeholders. At the other end of the spectrum, a highly customized football statistics site built with Evidence can look less like a traditional dashboard and more like a bespoke data web application. Moving to code does not mean sacrificing visual polish; instead, it provides the control needed to build truly tailored experiences.
From Localhost to Production: Deploying and Sharing Your Data App
The deployment model for Evidence is another key differentiator. When a project is built, Evidence generates a static website consisting of HTML, CSS, and JavaScript files, which can be hosted on any static hosting platform such as Vercel, Netlify, or AWS S3. Because no application server or live database connection is required to serve the site, this model has a small attack surface, scales easily, and is inexpensive to run. A CI/CD pipeline handles data refreshes by triggering scheduled rebuilds, which can run daily after upstream data pipelines complete or every few minutes for lower latency.
A powerful pattern is exemplified by the DuckDBstats.com project, which combines Evidence with MotherDuck. The site is a static Evidence application that visualizes PyPI download statistics stored in MotherDuck. Crucially, the site also includes a MotherDuck share link, allowing users to not only view the curated report but also gain direct, queryable access to the underlying raw dataset using any DuckDB client. This approach powerfully combines polished data presentation with open data access, enabling deeper exploration for interested users.
The Next Generation of Business Intelligence
The shift toward "BI as code" represents a maturation of the analytics engineering field. By adopting principles from software development, data teams can build more reliable, scalable, and insightful data products. Tools like Evidence, powered by the performance and flexibility of DuckDB, are lowering the barrier to this new paradigm. The future of business intelligence is not just about visualizing data, but about building interactive, maintainable, and trustworthy data applications that organizations can depend on.
Transcript
1:50hello everybody and welcome to another episode of uh Quack and Code where we do discussion and also code around uh and in this episode I have uh someone I was wondering why he has not been here yet on this live because I've been uh talking quite a lot about uh Evidence which is gonna be the the topic of today but just in
2:15general of BI as code tools and how it's changing but uh if that doesn't really you know sound familiar to you the BI as code tool don't worry we'll dive into that and so I have Archie from Evidence who is here so hello hey so excited to be here yeah how how are you it's been uh I
2:39think it's been a while uh that we met in person I think the last time was at the dbt Coalesce last year right yeah yeah
2:50I've been doing well um I I've actually had a had a small kid since then so um life has been uh pretty new uh sleeping a little less than than I might otherwise but otherwise it's been it's been really great but yeah I'm I'm really excited to join I feel like uh this has been a long time on the
3:07cards yeah cool um no so you you uh yeah that's handling everything uh parent uh parent life and uh dashboard life so be
3:19tell us a bit like I'm curious because I actually don't know the story myself uh on uh your evidence like really the starting story of evidence and how how did you come and and join uh join the company before diving into what it is exactly yeah so I um was actually um
3:41looking at starting my own business um I moved to Canada um which is where I live now maybe um about three years ago and um I was trying to start my own business uh in in kind of the data space at the time um but that didn't really go anywhere and when I was trying to work out what I wanted to do next um I was
3:58sort of investigating a few different ideas around particularly around you know data visualization and how you could do better data storytelling um and then I came across these two guys um Adam and Sean who um I when I saw what they had done I was like ah this is part like this is what I should have been
4:17doing you know um so um I talked to them and you know uh one thing led to another and I uh I ended up working there but um I was just really impressed with their whole approach to you know know um they were taking to data visualization and and and bringing metrics and data to companies and I felt at least from my
4:38experience before so I used to run a a bi team at a e-commerce company um a lot of the things that I felt were missing um in the experience there you know the lack of customization the ability to of add commentary to um to the reports as you're making them to explain what's going on all that kind of stuff this is
5:00stuff that they were thinking about really really carefully with Evidence um so I was really excited to join yeah and I'm curious what's what's exactly your background you were talking that you were running a BI team uh are you like do you define more yourself on the business side because like you know Evidence is kind of like more on the
5:21left software like more technical people right how do you define like your background and your persona in the data space yeah um I I mean I feel like I have a slightly unusual background but actually I think data is one of those spaces where lots of people come at it from all kinds of different angles so it's I'm
5:41kind of typical in the sense that I'm atypical um so I started my career as a management consultant so I was making lots of PowerPoint slides basically um you know and that's a quite a different um experience but you're still making a lot of charts and so from that experience I bring like quite strong opinions about
6:01like what makes a really clean understandable chart for like a business person um but then kind of the thing that I did after that was I moved into an e-commerce company called Patch Plants uh based in London UK and um I
6:17started there as as the chief of staff so not explicitly focusing on data but um was always having to like talk to the data team because we were always trying to make decisions you know based on the data we had and often we didn't have the data we needed to make the decisions and so part of what I was doing was like trying
6:34to lead the processes where we we'd bring that data into our into our stack into sort of Snowflake Looker that kind of thing um and we were big users of dbt
6:44there as well and I kind of in the end kind of was just getting closer and closer to that team um and then eventually we um ended up having a bit of like a reshuffle of our of our management team and I started running that team as well um because it was something I was really interested in I
7:00felt like I could add a lot with my experience um so it's a lot long I learned SQL while I was there and you know I got increasingly into you know using BI tools like Looker um but I am not you know a data engineer or anything by training um any SQL and and Python I picked up has has basically been on the
7:21side um and so I guess I'd consider
7:25myself sort of a a bit of like a data generalist um without without better word um yeah and you know since I've been at evidence you know I've kind of been continuing that Journey yeah it totally resonate with me
7:41saying that we in data people come from unusual background and not to flattering you but that's usually the best person I've met in the data space I don't know I feel like uh as a data person we are tied
7:59with different stakeholder different case would it be technical or business and I think this versatile background right not from typically computer science uh helps you I think understanding different perspective from uh different people and their different needs uh but let's di uh because already people are in the chat and they ask like what is evidence uh we have someone that
8:23say I ask oh if they heard of it and someone say yes in your previous BI as code video uh so that we did on MotherDuck channel so that's great uh we had a couple of people that haven't haven't heard of it and they are um and uh somebody else it is safe don't
8:42worry evidence.dev is a real company all right uh so let's let let's a bit dive in and what is uh Evidence exactly maybe Archie you can can tell us yeah so
8:56evidence is um it's a it's a tool to make data apps uh using markdown and SQL so um you hook up your database uh and then you write uh SQL queries um and then you uh ingest those SQL queries in in sort of in markdown components and then the output you get is kind of a website or web app um and that's in
9:20essence kind of all it is and you know under the hood we have sort of some pretty interesting stuff going on but um it's perfect if you are um trying to build uh data reporting um but you're looking for something that's um you're looking for something that's pretty lightweight you know you can spin this up in in five minutes um evidence is
9:42open source um so you can you can just go to um the evidence GitHub repo so
9:50evidence evidence um you can you know start from there run four commands in your in your terminal and you have Evidence running on your computer yes so someone asked it it's a good question is it like Mermaid I don't know if you know I think it's a charting yeah library so I think what important to mention here is that the website like
10:11the framework is based on on JavaScript right yeah so um and that's not something that is super important if you're a user um you know you don't have to write any JavaScript to use Evidence um you're just writing Markdown um sort of similar to you know syntax you'd use to you know on Slack or you know if you're
10:30writing in GitHub you do like a hyphen and a space and it turns into a bullet point that kind of things um so Mermaid as as
10:38I understand it um Mermaid JS is the sort of tool that you use use to um uh build kind of flowcharts um in the browser um so in it shares some similarities to Mermaid so it is like a JavaScript implementation there's a lot more flexibility in terms of you know the types of charts you can use um
10:59Evidence so um there's uh you know a whole bunch I think there's a on the docs site a bit further down there's a big gallery um where you can see all of the different chart types and we can maybe take a look at that a little later but you know bar charts line charts maps yeah I can I can I can actually go and
11:18if uh here in the in the components uh
11:23you have a page I think where you have all components yeah so exactly tables line charts area charts bar charts there's a whole there's a whole bunch of of stuff here so and we've really focused on uh the charts that we think are kind of most useful for for businesses um yeah and on top of that kind of you know giving you powerful
11:43tools to annotate and and like interpret that data um yeah I think what's important to to uh to mention is that there is like a tons of like um JavaScript charting libraries
11:59right um and if you want to build like an app a data app with you know dedicated for the web so you probably into the J JavaScript world right you need to pick up uh you know the the JavaScript um chart library and then you need to bind into your data so you need to find either the connector or the
12:21library to connect to your source and uh and then after you know writing uh some query and fetching to that where Evidence abstract this for you so going from zero to one is is much faster right there is existing component and I think we have um sorry um we have uh also this is what I wanted to show for example an existing component
12:51here is that you define your data we're going to see it uh later in the in the demo um and that's it that's that's a SQL query result basically right yeah so I I think that's an important point so JavaScript um plotting libraries so there's you know Highcharts ECharts um you know uh VJs
13:15there's there's just like so many of them um but they are T typically uh typically pretty pretty easy to get started with but pretty difficult to make a chart that looks really nice you know that like the one that comes out of the box if you're using any of them like plotly whatever you it's sort of it is a
13:33chart but then you're going to spend you know 10 15 minutes writing different configuration to get that chart look looking like good enough to use um which is really boring and then every time you have to make a chart then you're rewriting all of that configuration so evidence is using actually echarts under the hood but we've done a lot of the
13:51configuration so that they look good by default but then you can always change the configuration you know if you want to change the colors or you know how the labels look on a or you know pretty much anything about it within the echart stack um you you can uh you can you can you can modify from there and so see if we if we zoom out a
14:12bit and now that we have a you know high level understanding of what uh Evidence is you define yourself as you know BI as code um and I think that's a pretty new uh recent movement and when I mean recent what is like three three years is
14:32like this maybe I'm wrong but this like at least uh as working for the past 10 years that's the first time I kind of like heard that concept uh what does that concept brings versus like standard dashboarding tools that people use commonly in bi yeah three years is interesting I know evidence been three years I wouldn't say everyone knew about tools
14:56like Evidence um three years ago I mean obviously there's a lot of people who still discovering the idea of of BI as code um but but the idea here is you are rather than you know a typical BI workflow for tools that people are the the most the most popular and most used tools at the moment you know Power BI
15:16Tableau um Looker those kind of things is you are once you have your data in your data warehouse and you know done all of the transformation to get it you know really clean then the final step is someone building reports um in in the tool and it'll be some kind of GUI tool and they'll be dragging and dropping you
15:35know kind of like a pivot table interface you know drag the this Dimension onto the x axis and this Dimension onto the Y this uh um measure
15:45onto the y- axis and I'm going to you know split by this other dimension um which you know is isn't a terrible workflow for creating dashboards um but where it gets really really rough is that um maintaining dashboards that have been built like that is is really really difficult um because you know every time you want to
16:06change something across like you know you end up with an accumulation of 50 dashboards you end up having to like go through each of your 50 dashboards and manually like do a drag and drop workflow to update it and you know it can be quite error prone and that kind of thing so and and and you know if
16:22you're running a business 90% of your time as a someone working on you know bi and data presentation is actually in the maintenance side of it it's not in the creation creation is is fast but then your stakeholders are just saying oh well this you know there's something wrong with this metric or we need to update this to account for this new
16:39thing that was at least my experience was you're spending a lot of time like tinkering and it's so frustrating when you have to do like loads of manual work to do all that tinkering and you can't test it before you know you release it so yeah that's kind of where BI as code comes in is like you know you're writing
16:56when when you're writing code to Define what your dashboards look like it's much easier um to um to to to update them you know you can just use sort of Ctrl+F to find all instances of the thing you want to update you make the edit you can then run a local copy of it on your machine
17:13to check it you know looks correct um and then you run a deployment process you can run that on a QA Branch so you can deploy it onto your QA maybe you have some test data that you're deploying it with um and you can check
17:28correctly before you finally then you know merge that change into your main code branch which then like deploys a new version of your site so this sort of ideas of having version control and having CI/CD and testing um really allow you to have a much more robust but also sort of enjoyable way of maintaining your dashboards and that's kind of what I
17:52think the magic of evidence is that it's like it's it is there's something really fun and we'll do some local development in a little bit but like local Dev is amazing it's so fast it's Snappy you get like instant responsiveness and you just have feel like so much more in control yeah no but that's a pretty good
18:09summary and I think to bounce back I think in general in data like the local the development workflow have been broken in the sense that we rely on server proprietary tools where it's really difficult to kind of like mimic what you want to do locally right if I compare to building a website you write some HTML and you get the result right after right
18:34in data you have a dependency with data which makes hard but a lot of like uh historical existing bi tool know has been relying on server and so it's just like kind of difficult to kind of mimic things uh local and then push it to different environment and I think for a different environment um I'm curious your thoughts about that we can ask the
18:58audience um who is like you know dreaming about testing their BI environment in you know in different environments so mainly development staging and production so meaning like if you uh change something on a critical dashboard for example you can roll it out into a different environment that people can try before running into production because in my opinion this is U kind of a new
19:25education that's bring to those people where where it's it's it's mostly like software engineering best practice right and those hasn't been applied in my sense on the BI world but guess what like every metric every chart that you are seeing on the on your dashboard is a software asset so you need to you know treat it as a software
19:49development process so uh you know put some testing using versioning system like git uh to to have peer review being
19:59able to deploy it on multiple Branch but I me yeah I'm curious based on your users like who is kind of saying oh you
20:08know uh I wish I had that I I always dreamed to have that or other people are more oh I didn't know we could do that and we should do that right so what's what's the balance of kind of user you're seeing there well I am obviously we don't talk to to everyone that uses a which is one
20:29of the fun things about running like an open source company is just anyone can download it and try it um but I talk to kind of two two types of people um there's people who are um familiar with using tools like git um and uh Version Control and GitHub and you know have written like writing SQL is like a major
20:52part of that job and often they you know they'll have been using some other bi tool or Vis tool and when they discover evidence they are just generally so excited to try it because you know it's the first tool that has really taken you know the SQL writing experience and added like a a Version Control layer on
21:12top of that and you know a a way to then you know on top of that like also visualize the data in a way that's really presentable you know you can version control you know scripts and stored procedures in your database but but that you can't actually then use them to like deliver data to people so
21:30that that that I think is what really kind of unique um and uh and then the other thing that um the other kind of person that we often encounter is someone who's um you know been using maybe Tableau or something they're not not as um familiar with um with you know git or working in the command line or working locally and
21:51things like that but they're often pretty excited about just how How Deeply customizable evidence is you you can get you start to feel quite constrained in most bi tools um because you can't change a lot you know you're always going to have like a grid of charts or like you know you're going to have to use sort of specific you know you can
22:09always tell that a tableau chart is a tableau chart you know they have that specific look and feel um and you know you have a lot more flexibility with with something like evidence um but for those users it's typically a bit more of a learning curve because they have to you know download git for the first time
22:25and and uh learn what it means to like you know source control your code um yeah so yeah those I guess is is is my experiences those two kinds of people so there is kind of like uh tradeoff in the learning curve for people a bit less technical that is kind of like worth it and I think a good analogy is dbt uh dbt
22:46was you know is a framework to run a production SQL pipeline and usually people from business used to do you know copy paste into their UI or have a schedule button and they they didn't know about git or terminal and dbt can bring you know that back of those software engineering uh foundation to them um and I can uh I can
23:12totally relate to what you're saying Archie that uh they are excited they they see that it was actually painful that's not the norm of copy pasting a SQL query under under a Google doc with like you know final V2 uh production query um but they used to do that and I think that's that's the fun part is that
23:32they don't know what they don't know right and then when they discover that there is this principle um which actually are just software engineering principle but apply to to BI that's that's really nice uh I want to cut off on one point maybe you can show off that you were talking about flexibility can you do you have a couple of example that
23:53we maybe can can see uh I think you shared with me before for but I want to basically see how much of customization you you can go uh today and maybe maybe you can talk about uh you know what it could be in the future let me actually or you want to go ahead and share but I
24:15have the example page yeah okay that's where I was gonna go as well so you it so yeah um so we
24:24just have a gallery of different uh examples of things you can do on evidence but um maybe let's go into um I think um maybe let's pick like a
24:34pretty vanilla evidence project to begin with so I think there's one called um uh Northstar report um I think you
24:43can see that one this is a like a reasonably um standard you know
24:58oh I just lost you for a second maybe it was me yeah and it's okay
25:05yeah so you have um here like a just like a chart up front with some metrics but you know you can also see you can add like target lines and um configure zones that you want to like draw people's attention to um you can also add stuff like a lot of commentary around you know why we're doing stuff
25:24and uh I think someone mentioned Mermaid before there so this is a Mermaid chart you can uh you can put Mermaid charts in Evidence as well uh so you just like invoke a Mermaid component and tell it what you want to look like but but this is really powerful for like explaining to people you know what
25:43why um you know why this report what um
25:47what's it showing me which is often a question people have when they're trying to use your like internal data tools typically yeah like the the if you if you're running like a a data tool in in a business like there's always a slack Channel associated with it which is like constantly people asking questions about your data you know it's like what does
26:05this mean like where does this metric come from uh is this correct like why is this broken it's just like a constant stream of of requests and you can you know as a company as a data team you can write documentation but no one will ever read your documentation is what I find no so to have them actually read it you
26:23have to inline the information into the app um that's a really that's a really great point I think like storytelling I think just dashboarding in general or creating uh you know things uh that is readable in a chart is is already an art right like which kind of colors you're going to go um and you know which kind
26:46of axis dimension you're going to use but the other thing is like that people are completely underestimated even if they have that it's exactly as you said it's like how do you inline definition and information uh because indeed like having separate definition of concept is uh it's pretty hard to maintain and nobody's going to read it uh but having
27:09kind of a story over there um and actually the the best dashboard I ever seen in presentation in my career and I've worked with a lot of BI tools is was notebooks like Jupyter notebooks like ad hoc reporting from a data scientist that say yeah I'm going to work on a specific like campaign or whatsoever and it has a clear story
27:31because there is those inline information right a notebook kind of like word truth is uh vertical right and I feel like yeah dashboard used to be more like horizontal or you know but as much as you can on the orang of do fix um so yeah that's that's a really um great great point do you have another
27:55example you want to to show yeah why don't you um if you hop back to the examples page um there's uh there's one that's a Community member did about um the the like Titanic um survivors and and and people who were on the the Titanic which I think is a pretty interesting one um and this one highlights kind of like some of the
28:18customization that you can end up doing um so this one I think is um loading
28:24some data from um like a data set and
28:29it's just analyzing how um you know who the passengers were and you know whether they survived and you know what what they look like based on that um but you know you can see some interesting customizations this person's done they they've changed the color scheme um they've also removed the sidebar uh they've changed the logo on the top of
28:47the the header so you know like I think I can still pretty much tell this is an Evidence app by looking at it I recognize enough of like the chart styles and things but it's definitely getting a little bit more custom um and then I've seen some people do some pretty wild stuff as
29:07well I've seen people do um the the fc24
29:11one which is like some kind of football stats website um it's just next to the Titanic one this one's kind of in my mind pretty mad uh like so this one is uh an Evidence project as well but like at this point I would say you know it's kind of just like a web app like data web app and
29:33things so you you know they got some data in it but then they've got you know all these like player stats and this heat map it's pretty pretty amazing um inlining all this information and I think I think you can drill into some of these I don't know exactly but um yeah
29:49pretty cool um like this is not something you could achieve in like your average bi tool so this is kind of like the top end in terms of like customization you end up doing yeah no that's that's um that's great um to to see I think that's um but
30:08aside from the customization I think it's nice that you have um kind of like
30:15opinionated ready components right because I think like as I said um it is pretty hard so to um to just find like what's
30:26the right color to have something you know valuable from from the first from the first go right so I think it's it's also a balance maybe you've you've been thinking internally it's like how much customization we we want to offer it's like no no you don't want customization otherwise it's gonna be a huge mess right finding this right on how much
30:49template you want to give it it's an interesting point I think um the good thing about um having your um defining your like reports in code is that you don't have to show the user all of the options like all of the time um if they don't want to enable them so what what we strive for is like it looks
31:11great out of the box with no customization and then if you want to you can just add customization properties to it so you know you start with a basic you know bar chart and then you're like okay actually I want to change the colors so I'm going to do fillColor equals blue and then I want to change this other thing so unlike if
31:28you're if you're doing like a more of a like GUI driven experience like if you the user wants to find like a specific thing to edit they're going to have like a long menu with like a hundred different things potentially and so at that point it makes sense to cut down the number of things you can you can
31:44customize because otherwise you could never present them all to the user um whereas you know in a code-based experience it's it's just additional properties that the user can add themselves so so we basically allow you to change pretty much anything about your charts um because it doesn't really add burden to the person looking at the code or the person writing the
32:04code so we haven't spoken about how DuckDB you know is fitting into the Evidence ecosystem maybe you can tell us about that because we have obviously it's on the MotherDuck channel a lot of people familiar and loving DuckDB so how how are you using DuckDB within Evidence yeah there's um let me share my
32:28uh screen for a second
32:33so so uh let me go to
32:41here so the um Evidence uses DuckDB in kind of two different ways um so the first is Evidence um supports DuckDB as a data source so you know Evidence has a lot of different connectors you can connect pretty much any major um data warehouse to Evidence you know BigQuery Postgres um Amazon uh s etc and DuckDB is also um
33:08an option there so you know if you have a DuckDB file or CSV files or Parquet files
33:14which also can be used by DuckDB you can include them in your project as as data sources um so and that that's nice you can use you know DuckDB's really permissive SQL syntax to write queries there and you know it's pretty compressed and you can publish it you know alongside your app so you don't need to you know
33:32if you don't want to you don't want to add your credentials you can do that um but that's kind of not the most interesting way that um Evidence uses DuckDB um the most interesting way is actually inside the the markdown pages itself so because um you know if you're
33:50writing SQL normally you have the option you know you could write SQL against your BigQuery database and you could also write SQL against your Postgres database but then you wouldn't be able to like join those two data sets together using SQL because they use different SQL syntaxes and because underneath there's you know two different databases so Evidence actually ships
34:12with a DuckDB engine um in the browser
34:17so we're using DuckDB's WebAssembly uh package um which is a really cool project um and what that does is it means you have a database engine um running inside uh inside your browsers you know in Chrome or ar or whatever you're using um and you can then query run execute new queries on the fly um inside the browser and what
34:41that means is if you've got you know five different data sources you can be joining those data sources in the browser you can be creating interactivity in the browser so you can make little inputs you know drop downs that user select yeah and when they when they do a filter action the DB in the browser is going to run that filter
34:59action because it's in the browser it's just lightning fast you know you're not waiting to communicate with the database to do that and that that's really quite different to to how most BI tools would work in that situation yeah I think that there is like to be fair BI tools implementing cache mechanism um but is it is like uh you
35:20know proprietary kind of like technology which has some limitation so I think the opportunity there is that you know the caching can be done from any kind of source right um and DuckDB has
35:37DuckDB has just this this open file format that you can also leverage when when you want to work uh to to work locally do you have what's your um what's your thoughts about DuckDB Wasm where where do you see uh it could fit in Evidence if you have any work worked on on that sorry question asking me yeah yeah
36:05so where where do you see DuckDB Wasm um and maybe like explaining quickly how what what's the difference between um you know DuckDB Wasm because you're using the node module right as I believe so we're doing we're doing we're using two different parts so in when DuckDB is a data source then we're using um just normal uh DuckDB the node
36:29the node package um yeah and that and that's going to run it you know in uh on the server when um when someone is like trying to run queries against their DuckDB and it's going to create them like basically a cache that they can query but the um when we're combining sources in the browser we're running the the Wasm
36:48version of DuckDB okay I see so we're actually using using both yeah that I wasn't sure so that's why I was asking okay um so yeah so DuckDB Wasm is so that's the power maybe we can I just to to explain to the audience that so DuckDB can run in process in you know Node.js
37:09or uh Python Golang but it can also run in the browser through WebAssembly so WebAssembly is a technology where you can run you know low-level code for high performance like C++ or Rust um and it's a it's kind of a container and it's run in your browser so there is no uh external dependency and so that's that
37:32that's also the beauty because then low latency uh and everything is on the client and now I think the next step is uh if you work with MotherDuck um so MotherDuck
37:44is also working on a Wasm SDK right
37:48um and so you can have your data on the server and still on the and cache on the client so that's what a lot of companies are starting to do is basically be able to easily hydrate new data on the client and then avoid any compute on the server because here we are talking about you know development
38:08workflow when you run locally but what happens when you you know um need to basically share your uh uh share your uh
38:18your dashboard have you seen people what kind of part did you see some patterns uh using DuckDB and sharing like do they put it on S3 or did you see other patterns than using MotherDuck to see working on the server side um so I've seen um are you talking
38:38about where people are deploying the websites or where people are put hosting their DuckDB I think it's it's an interesting point maybe we should talk first uh on like where where we haven't mentioned really I think quickly so we can mention maybe you can explain like what's the next step after you you know you have your your Evidence dashboard working
38:59locally to deploy it and then where do you host the database yeah yeah so um evidence um is a web app
39:09that builds at the end of the process it builds a static website so what that means is you you know you finished developing and you've um written all your code and then you can run it on your computer you can run a build so you run you know npm run build or if you have you know a deployment service
39:25that's going to be like running this it's going to be running this for you when you merge your code and what that does is it compiles all the markdown that you've written uh and turns it into HTML and JavaScript and CSS and you you know you don't really need to know about that but that's kind of what it's what
39:40it's building it's building this this like vanilla website um and it's a static website which means that it doesn't have a server at all um the you
39:52know you you host it on some kind of server platform you know somewhere where you can host those files um and that's all you need so there's no live connection to the database that you're you know or the databases that you're running with when the app is running it's just um connected to uh the file server where you're hosting these
40:10like HTML JavaScript CSS files yeah so that can run on where where could it run pra to make like you know Vercel Netlify
40:21or evidence that's what you that's what you're offering right yeah so um so
40:28Evidence has like a a paid product which is Evidence Cloud which is probably the easiest way to host your Evidence website but because it's open source you can self host Vercel and Netlify are good options for your more sort of like hobby projects but you know if you're deploying inside your business you might host it on AWS or GCP or um you know
40:48some other platform like that um and that allows allows you to kind of mean that the data if if you're running a business you often have like requirements in your data sovereignty you don't want it to ever leave your servers um and at that point uh you know you might you might run it on um you know Azure or something like that um so
41:07it's really flexible you can you can deploy it anywhere um and you know having having that s someone's just asked a question um maybe you can and pop it up which I is a really good question of course um from yes around so you need to rebuild and deploy anytime you want to refresh your dashboard so you um you kind of have two options
41:32here um you whenever you whenever you merge a change to um your code uh then
41:39it will automatically rebuild and redeploy your dashboard so you if you merge it to your main branch of your of your code then it will um but then the the other thing you can do is you can set up a schedule to just refresh your data for you so you know you can set that up to be as regular as you want um
41:56you know many businesses or companies will run it like once a day you know once their you know uh data pipelines are finished running in the morning then they'll update then um but I've seen people doing it you know once an hour um once every five minutes so you can you can get pretty low latency uh like that
42:13and and those those are just um you know running rebuilds and because it's a you know website those builds can be re reasonably fast and which means you can get to pretty low latency like that but yeah you you need to you need to rebuild to update the data in short yeah so that's a that's a good
42:33point we are running um already uh on time so let's let's go dive a bit into the code so um maybe you could go on a live server to show a bit uh the interactivity and what we have been discussed of this you know JavaScript web server running uh head SQL queries
42:54uh and having those components and having the view so I'll add to to yeah
43:01stage here so okay so um I've got this
43:06um Evidence project running on my local computer um and let me kind of you know this is um this project is actually a a dive into San Francisco's um 311 tickets that are filed so this you know you've got the emergency number 911 and then if you just have like a service request um you know maybe like there's some graffiti
43:27or something like that um then you can file with 311 instead um so we're just exploring some of that data um and I've
43:36got here um this is running on my local computer so it's on localhost and on the this side I've got my you know my markdown file which is turning into Evidence so if I make changes to my my markdown file then it will change the um the app as I'm developing it um which is cool um
43:57and you know we talking a little bit about um you know how evidence works so the first thing you have is is SQL queries so um you know I've got this SQL query here maybe let me format it to make it a bit easier to read it's just um selecting uh the first date in the range the last date in the range and the
44:16total number of cases um and that is you
44:20know running SQL against my my data source that I have which is you know a DuckDB in this case um and then once I've got that data I'm then able to use it so um
44:33I've got my query this is called description so I can see it here um and then I can see the data that it's returning from the database and you know if I wanted to change this um you know I wanted to do like uh some other operation here it would then automatically rerun here I'll do one of
44:50those in a little bit and then below that I'm using components so uh here here I have a component um which is this value component and then uh I'm passing it the query called description uh and I'm asking it for the count of the description and then I'm also determining you know what the um what the number format is here so all of
45:14these things are then being used to look and say oh this data includes 100,000 rows and that's the that's the number that I'm pulling here and if I want to make a change maybe I wanted to show explicitly rather than with 100K I've changed the number format there um and yeah because this is um you know markdown it's not not really
45:38whitespace sensitive um for these components so I can that's why I was just rearranging that just to make it easier to read so you got and just one thing I want to point out is that you do have the SQL query the original SQL query and I think that's that's also something that's missing in standard BI tool
45:57because it's a bit less like drag and drop so there is SQL query behind the scene which is happening but you don't see them and so if you want to track kind of a lineage things or saying how this this thing was calculated it is super hard with dedicated uh you know uh BI tools where it's an interface and then there is an
46:18abstraction layer where here it's it's just those SQL queries so you can directly inspect them yeah that's that's a good point so um every query every bit of data that's on the page has to have a query associated with it so you know I'm looking at this trend data obviously I can like download the data if I want or
46:37have a look at data but more interesting is like okay where did this data come from like which which data which table was it from you know which columns what query did I run to get it I'm curious for the for the download download data do you did you did you develop that because people were asking if they can have it
46:57in Excel yeah 100% I mean uh you know you people are always
47:04you know you you should bring bring the data to where people are at and uh you know in businesses it's Google Sheets it's Excel so you've always got to be able to have that button which is like oh I just what if I want to get this into Excel so uh you know you can uh you get it in a CSV open Excel they can uh they
47:22can get going um yeah but let me let me a couple
47:27of other features um of how an Evidence project works so I've just had this main index page open here so this this sort of 60 lines of code or so 80 lines of code um define this page um I've got a few different things here I've got this dimension grid I've got which is this thing here um and I've got this calendar
47:46heat map which is displaying the uh the data just like on a daily basis so I can see how many cases there were per day um but uh I've got some other pages in the app as well um and I've got some other things that I'd use to you know to create this so let me just show you a
48:01little bit more about how an Evidence project looks so an Evidence project it's it's um it's a node app which means you know a Javascript app that has um uh that can run JavaScript on the server um and you know these these two files are basically just um uh logging the dependencies for the Javascript app and then the
48:24interesting stuff is basically happening in um pages and sources so sources is
48:30where you define any data sources that you want to connect to Evidence so I said you could create like a you know DuckDB or BigQuery or Snowflake each one of these you add as a new source in this sources folder and here I'm using DuckDB so this is a DuckDB file uh and then I have a configuration file which
48:50is telling telling me um you know uh what which file to point at and then I have a SQL query and this is just a DuckDB query and it's saying okay I want to go get all of the data from this SF 311 table um and the fact that it's called
49:08cases um and the source is called sf311
49:12that means that's why when I have my SQL queries I'm running it from sf311.cases so it's it's turning all of these queries that I have in my sources into tables that I can access in my markdown um but I don't just have this index page you know I've got some other pages um I've got a page for each of the
49:33neighborhoods um in San Francisco as well um and that's defined by this neighborhoods folder and then the index page the neighborhoods um this is pretty interesting so I can you know go look at uh Mission and uh see see what things look like in there or you know Outer Sunset and I can just look at the cases
49:54there and you know what um you know what category of things are happening there um yeah so that's uh go
50:03question yeah we have a we have an interesting question here for many uh BI tools uh embedding
50:12reports into web application can be costly in a SaaS-style application and I think we have kind of a related uh question which is uh this one uh are we
50:24able to start for and then build to in additional features like a standard I mean dashboard so I think what's happening and I've used that in my experience is that you have a SaaS application whatever you have offering let's say you're LinkedIn and you have an analytics page there right and usually they contact the data teams and
50:45the data teams said yeah we have our BI tools you know which are proprietary and so on but and so you have basically the product engineering team that need to build custom shop charts right to kind of like push back what is used for decision for BI right internally to the to the internal product let's say uh if you're
51:07if you uh if you are LinkedIn and you want to provide analytics um to other users so how do you see maybe you can first ask the the the the the easy question right this one and then I'm curious to to to hear your take on how do you see basically people pushing back you know operational analytics that I
51:30like to to say yeah so the easy question you're saying on embedding how you embed evidence yeah I think the easy question is like on regarding the fees like the one I I I ped like here right now sorry
51:44I need to have a look at streamyard again yeah yeah um so okay so uh the so
51:51if there is if if the cost you know is per VI Okay so okay so there are a couple of different ways to embed Evidence um you can have uh Evidence you can just embed Evidence as an iframe um in your app um so you have um some kind of like authentication layer that you know is
52:15getting people into your app you can um build evidence websites and embed inside that app um and you know you can remove like the sidebar and the the nav bar and the various like different bits and pieces so you just have like the core like chart layout um and that embeds really well in in apps so you know and
52:36in terms of like hosting Arrangements like you can host you can self host that and deploy that on your own infrastructure um you can uh deploy it on evidence cloud like th those kind of things are like you kind of have like your choice there um so um if you want
52:53to embed Evidence dashboards and self-host you're not going to pay pay for anything um on Evidence Cloud there there's like an embedded option as well um and I think that's like quite dependent on like how many customers or like businesses or end users you you want to like have look at stuff um but it it's certainly less expensive than
53:13the like Tableaus and the Power BIs of the world like you can you can be spending like half a million or a million dollars on on embedded solutions those products it's it's uh they're really expensive for sure yeah um and so what's what's your so you actually answered my other question was like how do you do people embed it uh
53:36evidences of like mostly through iframe right that's what what what you see yeah um and then you know you can also pass if you need to like pass information from your authentication system into the iframe to determine like which report you're selling that that kind of thing so you know I um if I have like you know 10 different customers um
54:02I can read in from my authentication system and then pass that token into the iframe and then it will fetch the right report for that customer and so that that's kind of the like the setup if you're trying to do like a multi multi-tenant deployment of evidence
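The multi-tenant iframe setup described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host `reports.example.com`, the per-customer path layout, and the `token` query parameter are hypothetical names, not something Evidence or the speakers specify.

```python
import html
from urllib.parse import quote, urlencode

# Hypothetical host; stands in for wherever the built reports are hosted.
REPORTS_BASE_URL = "https://reports.example.com"


def embed_iframe(customer_slug, token):
    """Return an <iframe> tag pointing at one customer's report.

    The token is assumed to come from the host app's own authentication
    system; how the report site validates it is outside this sketch.
    """
    path = quote(customer_slug, safe="")      # keep the slug a single path segment
    query = urlencode({"token": token})       # hypothetical parameter name
    src = f"{REPORTS_BASE_URL}/{path}/?{query}"
    return (
        f'<iframe src="{html.escape(src, quote=True)}" '
        'width="100%" height="600"></iframe>'
    )


tag = embed_iframe("acme", "abc 123")
```

Putting a token in a URL is the simplest way to show the idea; a real deployment would want to check how long such a token lives and where URLs end up being logged.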
54:28I think I've uh lost your audio M maybe maybe you're on mute I'm not sure yeah sorry uh sorry I was top for for a sec um so first off the product looks really great we have another uh question here uh for many BI tools embedding reports can be oh actually it's uh kind of a duplicate sorry it's LinkedIn
54:48being uh LinkedIn so we already answer that um what I want to di for uh closing is another example uh that um I've also
55:01uh work with Evidence and DuckDB and MotherDuck uh is duckdbstats.com um and so you have uh actually let me share you
55:13directly in the repo um so it is an end to end data engineering project and you can actually uh actually if you go to the stats you have the the source code link uh here uh that's actually a duplicate uh but what is great with that is that um basically it gets PyPI stats data for DuckDB and
55:40you can actually if you have a Python library can actually clone the project and just change one variable and start uh you know feeding data on another Python library if you want um it's using Evidence and it's using MotherDuck basically to hydrate the data so whenever there is new data you know happening uh to MotherDuck there is a
56:03job as Archie was saying on the Evidence Cloud which is running once a day and uh getting refresh and what is actually nice I found here we were talking about you know the story etc is that in uh in
56:17MotherDuck you have this uh mechanism of sharing you know data set and whenever you have a DuckDB client and so the DCT the the share that I said is here I actually uh just uh changed it recently so it might
56:33be a failed live demo I didn't plan to do live demo um but so here is just to show you uh I have you know a local DuckDB uh CLI and I'm going to connect uh to
56:47MotherDuck with just doing attach I have a token to authenticate to me to uh to the cloud and now if I show all the database I have all the database actually uh here and uh for example I think I have already uh share it it's already here but I can attach it secondly so no that's the share link right which is on the DuckDB
57:14stats website and now I can query the the data I can start I have all the data raw data which is feeding uh the the report so I think you have um PyPI daily stats and so this is the final table and so you can do you can also run describe uh here um and you see you can
57:41have a look directly at the raw data that someone has shared and I think that's that's also the beauty of it I I wish like if like everything related to
57:52open data sets right that you have a dashboard and you can that much quickly you know query the raw data set be amazing yeah that would be amazing instead of having a Google somewhere and yeah the cool thing about that Med is that anyone could have done that right because obviously you're logged in as you and you made the data set but I
58:15could log in and and and do the same exact this data set is shared yeah this that's a really good point this data set is shared and it's public uh so I invite you to play around uh also to to look at the website we're closing in um I think we had a lot a couple of quite uh nice
58:34question is there anything uh you want to add on the future of Evidence like what's what are you hyped for what's the next next thing that you are looking into in term of features I think um I'm really excited about kind of quite a lot of things but um the uh thing that we're really interested in is like how you can build
58:57like really amazing interactive experiences so one of the things that kind of can be really cool um when you're building like data apps is having like because you can explore the data the end user can explore the data really on their own building more capability around you know drilling into different things you click this part of a chart
59:17and then you pull up this data that expands into it all that kind of stuff is is stuff we're working on um right now and is is kind of really interesting to me I think um it just makes it so much more of a um you know so much more of an intuitive experience to try and understand what's going on in your
59:34organization or business you know it's what it's how people are used to experiencing um you know apps now it's just like you know with instant client interactivity so so I think that that'll be that'll be really great and we've got a lot of releases coming out with that I think in the next couple of months cool uh on that you are right in
59:53time for a stand up for people I guess in the west coast maybe uh but uh yeah that that was amazing thank you so much Archie for uh coming in and I think we'll see you yeah we'll see you around um probably maybe uh later let's let's let's see each other in in six months like where where do you stand and
60:16where MotherDuck stand because I think it's also nice right DuckDB is evolving a lot MotherDuck is evolving a lot so I think it's also give opportunity to different product in the ecosystem like like you uh folks uh to see how how things are being evolved um Quack & Code is every month uh the live will be on YouTube
60:38afterwards uh you can check MotherDuck uh TV channel uh on YouTube and yeah have a
60:45great day a great stand up wherever you are or a great evening all right
FAQS
What is Evidence and how does it use SQL and Markdown for data visualization?
Evidence is an open-source tool for building data apps using Markdown and SQL. You connect your database, write SQL queries, and embed results into Markdown pages using built-in chart components (bar charts, line charts, maps, tables, etc.). The output is a website or web app that can be deployed as a static site. Evidence provides opinionated, good-looking defaults out of the box while allowing deep customization when needed. Learn more about the Evidence integration.
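The "good defaults, optional overrides" idea can be illustrated with a small Python sketch. It is not Evidence code: the property names and default values below are made up for the example, and Evidence itself expresses this through component properties in Markdown rather than Python.

```python
# Hypothetical defaults for an imaginary bar chart; none of these names
# are taken from Evidence.
DEFAULT_BAR_CHART_PROPS = {
    "fill_color": "steelblue",
    "show_legend": True,
    "number_format": "auto",
}


def bar_chart_props(**overrides):
    """Return chart properties: defaults first, then explicit overrides."""
    unknown = set(overrides) - set(DEFAULT_BAR_CHART_PROPS)
    if unknown:
        raise ValueError(f"unknown chart properties: {sorted(unknown)}")
    return {**DEFAULT_BAR_CHART_PROPS, **overrides}


basic = bar_chart_props()                     # looks fine with no options at all
custom = bar_chart_props(fill_color="blue")   # one property changed, rest kept
```

The point of the design is that the author only writes the properties they care about, so a long list of available options costs nothing to someone who never uses them.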
How does Evidence use DuckDB in the browser for interactive dashboards?
Evidence uses DuckDB in two ways: as a data source connector (reading DuckDB files, CSVs, or Parquet files) and as an in-browser engine via DuckDB's WebAssembly package. The in-browser DuckDB engine lets users run filter actions, dropdowns, and interactive queries with very fast response times since no server round-trip is required. This also allows joining data from multiple different data sources directly in the browser. For more on this technology, see our guide on DuckDB WASM for web data exploration.
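A rough sketch of the pattern, with loud caveats: Python's built-in `sqlite3` stands in here for the in-browser DuckDB WebAssembly engine, and the two "sources" are made-up rows rather than real warehouse extracts. It only shows the shape of the idea, which is to load results from separate sources into one local engine and then join and filter there without another round-trip.

```python
import sqlite3

# Made-up rows, as they might arrive from two different upstream sources.
orders_from_warehouse = [(1, "acme", 120.0), (2, "globex", 80.0), (3, "acme", 50.0)]
customers_from_app_db = [("acme", "EU"), ("globex", "US")]

# sqlite3 is a stand-in for the local (in-browser) engine in this sketch.
engine = sqlite3.connect(":memory:")
engine.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
engine.execute("CREATE TABLE customers (name TEXT, region TEXT)")
engine.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders_from_warehouse)
engine.executemany("INSERT INTO customers VALUES (?, ?)", customers_from_app_db)


def revenue_for_region(region):
    """Join both sources and filter locally, as a dropdown selection would."""
    row = engine.execute(
        """
        SELECT COALESCE(SUM(o.amount), 0)
        FROM orders AS o
        JOIN customers AS c ON c.name = o.customer
        WHERE c.region = ?
        """,
        (region,),
    ).fetchone()
    return row[0]
```

Calling `revenue_for_region("EU")` re-runs the query against data that is already local, which is why a filter change feels instant in this model.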
What are the advantages of BI-as-code tools like Evidence over traditional dashboarding?
BI-as-code tools let you define dashboards in code (SQL + Markdown), which enables version control with Git, CI/CD workflows, testing on QA branches before deploying to production, and easy bulk updates across dashboards using find-and-replace. Traditional drag-and-drop BI tools make creation fast but maintenance painful. Updating 50 dashboards one by one is error-prone and tedious. Code-based dashboards also support inline commentary and documentation, which helps stakeholders understand the data without consulting a separate wiki.
How do you deploy and refresh an Evidence dashboard?
Evidence builds a static website from your Markdown and SQL code. You can host it on platforms like Vercel, Netlify, AWS, GCP, Azure, or Evidence Cloud. Since there is no live database connection in the deployed app, you refresh data by rebuilding the site, either on a schedule (hourly, daily, etc.) or whenever you merge code changes to your main branch. The rebuild process re-runs all SQL queries, caches results, and produces updated HTML, JavaScript, and CSS files.
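Why a rebuild is needed to see new data can be shown with a toy build step. This is a minimal sketch, not how Evidence is implemented: `build_page` is a hypothetical helper, and `sqlite3` stands in for whatever data source the project queries at build time.

```python
import html
import sqlite3


def build_page(db, title, sql):
    """Run the query once at build time and bake the result into static HTML."""
    cursor = db.execute(sql)
    headers = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<h1>{html.escape(title)}</h1><table><tr>{head}</tr>{body}</table>"


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cases (opened TEXT)")
db.executemany("INSERT INTO cases VALUES (?)", [("2024-01-01",), ("2024-01-02",)])

page_v1 = build_page(db, "Cases", "SELECT COUNT(*) AS total FROM cases")

# New data arrives after the build; the page already built does not change.
db.execute("INSERT INTO cases VALUES ('2024-01-03')")

# Only a rebuild (on a schedule or after a merge) picks up the new row.
page_v2 = build_page(db, "Cases", "SELECT COUNT(*) AS total FROM cases")
```

The built page is just a string of HTML with the numbers already in it, so serving it needs no database connection, and its freshness is bounded by how often the build runs.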