A new paradigm for data visualization with just SQL + Markdown

2024/09/24

Traditional, GUI-based Business Intelligence (BI) tools are excellent for creating initial dashboards, but their "drag-and-drop" workflows often create significant long-term challenges. As data assets grow, maintenance becomes a bottleneck, version control is nearly impossible, and customization is limited, leading to a world where data professionals can spend up to 90% of their time on maintenance rather than new analysis. A modern approach, known as "BI as code," addresses these challenges by applying software engineering principles to analytics. This article explores how to use Evidence.dev, an open-source framework, and DuckDB to build maintainable, high-performance data apps using just SQL and Markdown. This combination frees data teams from tedious maintenance and unlocks deeper customization, transforming brittle dashboards into robust data applications.

Why Traditional BI Workflows Break Down at Scale

The core issue with many BI tools is that the final output, a dashboard, is a software asset that is not treated like one. The creation process involves manual clicks and configurations within a proprietary user interface, which is not only tedious but also error-prone. When a metric definition changes or a new data source is added, an analyst must manually update every affected report. A code-based workflow fundamentally changes this dynamic. By defining reports and visualizations in code, teams can track every change in Git, providing a complete history and the ability to revert to previous versions. New logic can be submitted through pull requests for peer review, and teams can integrate automated testing into a CI/CD pipeline to ensure updates do not break existing reports. This approach makes it simple to find and replace logic across an entire project, drastically reducing maintenance time and bringing the rigor of production software to the world of analytics.
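One way to picture the automated testing mentioned above is a SQL check that a CI job runs before a report is published. The sketch below is an assumption, not something Evidence prescribes: the `daily_revenue` table and its sample rows are hypothetical, and the convention that a non-empty result means failure is borrowed from common data-testing practice.

```sql
-- Hypothetical sample data standing in for a real reporting table.
with daily_revenue as (
    select *
    from (values
        (date '2024-03-01', 1500.00),
        (date '2024-03-02',  -20.00),  -- suspicious: negative revenue
        (date '2024-03-03',    null)   -- suspicious: missing value
    ) as t(day, revenue)
)
-- The check passes when this query returns zero rows.
select day, revenue
from daily_revenue
where revenue is null
   or revenue < 0;
```

Because the check is plain text, it can be reviewed in the same pull request as the metric change it protects.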

Building Data Apps with the Tools You Already Know: SQL and Markdown

Evidence is an open-source framework designed to abstract away the complexities of web development, allowing data practitioners to focus on their core competencies. Instead of requiring knowledge of JavaScript, HTML, and CSS, Evidence enables users to build sophisticated data apps using only SQL queries and Markdown files. The workflow is straightforward. An analyst first defines their data sources, such as databases like DuckDB, Postgres, or modern cloud data warehouses like MotherDuck and Snowflake. Next, they write SQL queries to fetch and shape the data, which can be stored in dedicated .sql files or embedded directly within pages. Finally, they compose pages using Markdown syntax, embedding pre-built Evidence components like charts and tables that reference the SQL query results. This approach dramatically lowers the barrier to entry for creating custom, narrative-driven reports. This simple, code-based workflow also enables a highly efficient local development experience, allowing analysts to iterate on their work at speed.
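As a minimal sketch of the "write SQL, then reference the result" step, the query below shapes order data into one row per month. The `orders` data, the column names, and the query name used in the note afterwards are all hypothetical; the inline values only exist so the query runs on its own in DuckDB.

```sql
-- Hypothetical sample data standing in for a real `orders` table.
with orders as (
    select *
    from (values
        (date '2024-01-05', 120.00),
        (date '2024-01-19',  80.00),
        (date '2024-02-02', 200.00)
    ) as t(order_date, amount)
)
-- Shape the data for a chart: one row per month.
select
    date_trunc('month', order_date) as month,
    sum(amount)                     as revenue
from orders
group by 1
order by 1;
```

In an Evidence page, a query like this would sit in a named SQL block (say `monthly_revenue`), and a chart component such as `<LineChart data={monthly_revenue} x=month y=revenue />` could then reference the result by that name.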

Achieving Flow State: Fast, Local Development for Analytics

One of the most powerful aspects of the "BI as code" workflow is the local development experience. Traditional BI often requires a slow, server-dependent feedback loop. With Evidence, developers run a local server that provides instant updates as they write code. This tight feedback loop is a core feature of the development process. For instance, when a developer modifies a Markdown file to change the number format on a value component, the web page refreshes instantly to reflect the change. This immediate responsiveness allows developers to stay in a state of flow, making development faster, more enjoyable, and more productive. The project structure is organized for clarity, with a sources directory for data source configurations and a pages directory for the Markdown files that define the application's content. This clean separation of concerns makes projects easy to navigate and maintain.

Powering High-Performance Analytics with DuckDB

DuckDB plays two critical and distinct roles within the Evidence ecosystem, enabling both high-performance data processing and rich client-side interactivity.

First, Evidence supports DuckDB as a primary server-side data source. Users can connect their projects to local .duckdb database files or read directly from collections of Parquet and CSV files. During the build process, Evidence runs queries against this DuckDB source to fetch the data needed for the static site.
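To illustrate reading directly from files, here is a sketch of a DuckDB query that joins a folder of Parquet files with a CSV file. The file paths, table aliases, and column names are hypothetical; `read_parquet` and `read_csv_auto` are standard DuckDB table functions.

```sql
-- Hypothetical paths: adjust to wherever the files actually live.
select
    o.order_id,
    o.order_date,
    o.amount,
    c.region
from read_parquet('data/orders/*.parquet') as o   -- glob over many Parquet files
join read_csv_auto('data/customers.csv') as c     -- column types are inferred
    on o.customer_id = c.customer_id;
```

A query of this kind would run at build time, so the files only need to be reachable from the machine doing the build, not from the visitor's browser.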

Second, and more innovatively, Evidence ships a DuckDB engine to the browser using WebAssembly (WASM). This in-browser OLAP engine unlocks powerful capabilities without requiring a round trip to a server. When a user interacts with a filter or a dropdown on a page, the query is executed by DuckDB WASM directly on the client's machine, providing lightning-fast responses. This architecture also makes it possible to perform joins and aggregations across data that originated from completely different sources, for instance, combining data from a Postgres database and a BigQuery table on the fly.
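The cross-source join described above can be sketched as an ordinary DuckDB query. Everything named here is hypothetical: `pg_customers` stands in for rows extracted from Postgres, `bq_events` for rows extracted from BigQuery, and the literal `'EMEA'` for the value a dropdown would supply.

```sql
-- Hypothetical extracts, inlined so the query runs on its own.
with pg_customers as (
    select *
    from (values
        (1, 'EMEA'),
        (2, 'AMER')
    ) as t(customer_id, region)
),
bq_events as (
    select *
    from (values
        (1, 'page_view'),
        (1, 'purchase'),
        (2, 'page_view')
    ) as t(customer_id, event_type)
)
select
    c.region,
    count(*) as events
from bq_events as e
join pg_customers as c using (customer_id)
-- 'EMEA' is a stand-in for the value chosen in a filter or dropdown.
where c.region = 'EMEA'
group by c.region;
```

In an Evidence page the literal would typically be replaced by an input reference of the form `${inputs.<name>.value}`, so that changing the dropdown re-runs the query in the browser.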

Designing for Insight: Customization and Data Storytelling

A code-based approach unlocks a level of customization and narrative depth that is difficult to achieve in GUI-based tools. Because reports are just text files, it becomes simple to inline context, definitions, and commentary directly alongside data visualizations, solving the common problem of documentation living in a separate location. This flexibility allows for a wide range of applications. For example, a "North Star Report" can be enhanced by adding target lines, colored zones, and detailed explanations to charts, creating a clear narrative for stakeholders. At the other end of the spectrum, a highly customized football statistics site built with Evidence can look less like a traditional dashboard and more like a bespoke data web application. Moving to code does not mean sacrificing visual polish; instead, it provides the control needed to build truly tailored experiences.

From Localhost to Production: Deploying and Sharing Your Data App

The deployment model for Evidence is another key differentiator. When a project is built, Evidence generates a static website consisting of HTML, CSS, and JavaScript files, which can be hosted on any modern serverless platform like Vercel, Netlify, or AWS S3. Because there is no active server or live database connection required to serve the application, this model is inherently secure, scalable, and cost-effective. A CI/CD pipeline handles data refreshes by triggering scheduled rebuilds, which can run daily after upstream data pipelines complete or every few minutes for lower latency.

A powerful pattern is exemplified by the DuckDBstats.com project, which combines Evidence with MotherDuck. The site is a static Evidence application that visualizes PyPI download statistics stored in MotherDuck. Crucially, the site also includes a MotherDuck share link, allowing users to not only view the curated report but also gain direct, queryable access to the underlying raw dataset using any DuckDB client. This approach powerfully combines polished data presentation with open data access, enabling deeper exploration for interested users.

The Next Generation of Business Intelligence

The shift toward "BI as code" represents a maturation of the analytics engineering field. By adopting principles from software development, data teams can build more reliable, scalable, and insightful data products. Tools like Evidence, powered by the performance and flexibility of DuckDB, are lowering the barrier to this new paradigm. The future of business intelligence is not just about visualizing data, but about building interactive, maintainable, and trustworthy data applications that organizations can depend on.

1:50hello everybody and welcome to another episode of uh Quack and Code where we do discussion and also code around uh and in this episode I have uh someone I was wondering why he has not been there yet on this live because I've been uh talking quite a lot about uh evidence which is going to be the the topic of today but just in

2:15general of BI as code tools and how it's changing but uh if that doesn't really you know sound familiar to you the BI as code tools don't worry we'll dive into that and so I have Archie from evidence which is there so hello hey so excited to be here yeah how how are you it's been uh I

2:39think it's been a while uh that we met in person I think the last time was at the dbt Coalesce last year right yeah yeah

2:50I've been doing well um I I've actually had a had a small kid since then so um life has been uh pretty new uh sleeping a little less than than I might otherwise but otherwise it's been it's been really great but yeah I'm I'm really excited to join I feel like uh this has been a long time on the

3:07cards yeah cool um no so you you uh yeah that's handling everything uh parent uh parent's life and uh dashboard life so maybe

3:19tell us a bit like I'm curious because I actually don't know the story myself uh on uh your evidence like really the starting story of evidence and how how did you come and and join uh join the company before diving into what it is exactly yeah so I um was actually um

3:41looking at starting my own business um I moved to Canada um which is where I live now maybe um about three years ago and um I was trying to start my own business uh in in kind of the data space at the time um but that didn't really go anywhere and when I was trying to work out what I wanted to do next um I was

3:58sort of investigating a few different ideas around particularly around you know data visualization and how you could do better data storytelling um and then I came across these two guys um Adam and Sean who um I when I saw what they had done I was like ah this is part like this is what I should have been

4:17doing you know um so um I talked to them and you know uh one thing led to another and I uh I ended up working there but um I was just really impressed with their whole approach to you know um they were taking to data visualization and and and bringing metrics and data to companies and I felt at least from my

4:38experience before so I used to run a a BI team at an e-commerce company um a lot of the things that I felt were missing um in the experience there you know the lack of customization the ability to add commentary to um to the reports as you're making them to explain what's going on all that kind of stuff this is

5:00stuff that they were thinking about really really carefully with evidence um so I was really excited to join yeah and I'm curious what's what's exactly your background you were talking that you were running a BI team uh are you like do you define more yourself on the business side because like you know evidence is kind of like more on the

5:21left software like more technical people right how do you define like your background and your persona in the data space yeah um I I mean I feel like I have a slightly unusual background but actually I think data is one of those spaces where lots of people come at it from all kinds of different angles so it's I'm

5:41kind of typical in the sense that I'm atypical um so I started my career as a Management Consultant so I was making lots of PowerPoint slides basically um you know and that's a quite a different um experience but you're still making a lot of charts and so from that experience I bring like quite strong opinions about

6:01like what makes a really clean understandable chart for like a business person um but then kind of the thing that I did after that was I moved into an e-commerce company called Patch Plants uh based in London UK and um I

6:17started there as as the chief of staff so not explicitly focusing on data but um was always having to like talk to the data team because we were always trying to make decisions you know based on the data we had and often we didn't have the data we needed to make the decisions and so part of what I was doing was like trying

6:34to lead the processes where we we'd bring that data into our into our stack into sort of Snowflake Looker that kind of thing um and we were big users of dbt

6:44there as well and I kind of in the end kind of was just getting closer and closer to that team um and then eventually we um ended up having a bit of like a reshuffle of our of our management team and I started running that team as well um because it was something I was really interested in I

7:00felt like I could add a lot with my experience um so it's a lot long I learned SQL while I was there and you know I got increasingly into you know using BI tools like Looker um but I am not you know a data engineer or anything by training um any SQL and and Python I picked up has has basically been on the

7:21side um and so I guess I'd consider

7:25myself sort of a a bit of like a data generalist um without without better word um yeah and you know since I've been at evidence you know I've kind of been continuing that Journey yeah it totally resonates with me

7:41saying that we in data people come from unusual backgrounds and not to flatter you but that's usually the best person I've met in the data space I don't know I feel like uh as a data person we are tied

7:59with different stakeholder different case would it be technical or business and I think this versatile background right not from typically computer science uh helps you I think understanding different perspective from uh different people and their different needs uh but let's dive in uh because already people are in the chat and they ask like what is evidence uh we have someone that

8:23say I ask oh if they heard of it and someone say yes in your previous BI as code video uh so that we did on the MotherDuck channel so that's great uh we had a couple of people that haven't haven't heard of it and they are um and uh somebody else it is safe don't

8:42worry evidence.dev is a real company all right uh so let's let let's a bit dive in and what is uh evidence exactly maybe Archie you can can tell us yeah so

8:56evidence is um it's a it's a tool to make data apps uh using markdown and SQL so um you hook up your database uh and then you write uh SQL queries um and then you uh ingest those SQL queries in in sort of in markdown components and then the output you get is kind of a website or web app um and that's in

9:20essence kind of all it is and you know under the hood we have sort of some pretty interesting stuff going on but um it's perfect if you are um trying to build uh data reporting um but you're looking for something that's um you're looking for something that's pretty lightweight you know you can spin this up in in five minutes um evidence is

9:42open source um so you can you can just go to um the evidence GitHub repo so

9:50evidence-dev/evidence um you can you know start from there run four commands in your in your terminal and you have evidence running on your computer yes so someone asked it it's a good question it's like Mermaid I don't know if you know I think it's a charting yeah library so I think what's important to mention here is that the website like

10:11the framework is based on JavaScript right yeah so um and that's not something that is super important if you're a user um you know you don't have to write any JavaScript to use evidence um you're just writing Markdown um sort of similar to you know syntax you'd use to you know on Slack or you know if you're

10:30writing in GitHub you do like a hyphen and a space and it turns into a bullet point that kind of thing um so Mermaid as as

10:38I understand it um Mermaid JS is the sort of tool that you use to um uh build kind of flowcharts um in the browser um so it shares some similarities to Mermaid so it is like a JavaScript implementation there's a lot more flexibility in terms of you know the types of charts you can use um

10:59evidence so um there's uh you know a whole bunch I think there's a on the docs site a bit further down there's a big gallery um where you can see all of the different chart types and we can maybe take a look at that a little later but you know bar charts line charts Maps yeah I can I can I can actually go and

11:18if uh here in the in the components uh

11:23you have a page I think where you have all components yeah so exactly tables line charts area charts bar charts there's a whole there's a whole bunch of of stuff here so and we've really focused on uh the charts that we think are kind of most useful for for businesses um yeah and on top of that kind of you know giving you powerful

11:43tools to annotate and and like interpret that data um yeah I think what's important to to uh to mention is that there is like a tons of like um JavaScript charting libraries

11:59right um and if you want to build like an app a data app with you know dedicated for the web so you're probably into the JavaScript world right you need to pick up uh you know the the JavaScript um chart library and then you need to bind into your data so you need to find either the connector or the

12:21library to connect to your source and uh and then after you know writing uh some query and fetching to that where evidence abstracts this for you so going from zero to one is is much faster right there is existing component and I think we have um sorry um we have uh also this is what I wanted to show for example an existing component

12:51here is that you define your data we're going to see it uh later in the in the demo um and that's it that's that's a SQL query result basically right yeah so I I think that's an important point so JavaScript um plotting libraries so there's you know Highcharts ECharts um you know uh VJs

13:15there's there's just like so many of them um but they are typically uh typically pretty pretty easy to get started with but pretty difficult to make a chart that looks really nice you know that like the one that comes out of the box if you're using any of them like Plotly whatever you it's sort of it is a

13:33chart but then you're going to spend you know 10 15 minutes writing different configuration to get that chart looking like good enough to use um which is really boring and then every time you have to make a chart then you're rewriting all of that configuration so evidence is using actually ECharts under the hood but we've done a lot of the

13:51configuration so that they look good by default but then you can always change the configuration you know if you want to change the colors or you know how the labels look on a or you know pretty much anything about it within the ECharts stack um you you can uh you can you can you can modify from there and so see if we if we zoom out a

14:12bit and now that we have a you know high level understanding of what uh evidence is you define yourself as you know BI as code um and I think that's a pretty new uh recent movement and when I mean recent what is like three three years is

14:32like this maybe I'm wrong but this like at least uh as working for the past 10 years that's the first time I kind of like heard that concept uh what does that concept brings versus like standard dashboarding tools that people use commonly in bi yeah three years is interesting I know evidence been three years I wouldn't say everyone knew about tools

14:56like evidence um three years ago I mean obviously there's a lot of people who are still discovering the idea of of BI as code um but but the idea here is you are rather than you know a typical BI workflow for tools that people are the the most the most popular and most used tools at the moment you know Power BI

15:16Tableau um Looker those kind of things is you are once you have your data in your data warehouse and you know done all of the transformation to get it you know really clean then the final step is someone building reports um in in the tool and it'll be some kind of GUI tool and they'll be dragging and dropping you

15:35know kind of like a pivot table interface you know drag the this Dimension onto the x axis and this Dimension onto the Y this uh um measure

15:45onto the y- axis and I'm going to you know split by this other dimension um which you know is isn't a terrible workflow for creating dashboards um but where it gets really really rough is that um maintaining dashboards that have been built like that is is really really difficult um because you know every time you want to

16:06change something across like you know you end up with an accumulation of 50 dashboards you end up having to like go through each of your 50 dashboards and manually like do a drag and drop workflow to update it and you know it can be quite error prone and that kind of thing so and and and you know if

16:22you're running a business 90% of your time as a someone working on you know bi and data presentation is actually in the maintenance side of it it's not in the creation creation is is fast but then your stakeholders are just saying oh well this you know there's something wrong with this metric or we need to update this to account for this new

16:39thing that was at least my experience was you're spending a lot of time like tinkering and it's so frustrating when you have to do like loads of manual work to do all that tinkering and you can't test it before you know you release it so yeah that's kind of where BI as code comes in is like you know you're writing

16:56when when you're writing code to Define what your dashboards look like it's much easier um to um to to to update them you know you can just use sort of Ctrl+F to find all instances of the thing you want to update you make the edit you can then run a local copy of it on your machine

17:13to check it you know looks correct um and then you run a deployment process you can run that on a QA Branch so you can deploy it onto your QA maybe you have some test data that you're deploying it with um and you can check

17:28correctly before you finally then you know merge that change into your main code branch which then like deploys a new version of your site so this sort of ideas of having Version Control and having CI/CD and testing um really allow you to have a much more robust but also sort of enjoyable way of maintaining your dashboards and that's kind of what I

17:52think the magic of evidence is that it's like it's it is there's something really fun and we'll do some local development in a little bit but like local Dev is amazing it's so fast it's Snappy you get like instant responsiveness and you just have feel like so much more in control yeah no but that's a pretty good

18:09summary and I think to bounce back I think in general in data like the local the development workflow have been broken in the sense that we rely on server proprietary tools where it's really difficult to kind of like mimic what you want to do locally right if I compare to building a website you write some HTML and you get the result right after right

18:34in data you have a dependency with data which makes hard but a lot of like uh historical existing bi tool know has been relying on server and so it's just like kind of difficult to kind of mimic things uh local and then push it to different environment and I think for a different environment um I'm curious your thoughts about that we can ask the

18:58audience um who is like you know dreaming about testing their BI environment in you know in different environments so mainly development staging and production so meaning like if you uh change something on a critical dashboard for example you can roll it out into a different environment that people can try before running into production because in my opinion this is uh kind of a new

19:25education that's bring to those people where where it's it's it's mostly like software engineering best practice right and those hasn't been applied in my sense on the BI world but guess what like every metric every chart that you are seeing on the on your dashboard is a software asset so you need to you know treat it as a software

19:49development process so uh you know put some testing using versioning system like Git uh to to have peer review being

19:59able to deploy it on multiple Branch but I me yeah I'm curious based on your users like who is kind of saying oh you

20:08know uh I wish I had that I I always dreamed to have that or other people are more oh I didn't know we could do that and we should do that right so what's what's the balance of kind of user you're seeing there well I mean obviously we don't talk to to everyone that uses it which is one

20:29of the fun things about running like an open source company is just anyone can download it and try it um but I talk to kind of two two types of people um there's people who are um familiar with using tools like Git um and uh Version Control and GitHub and you know have written like writing SQL is like a major

20:52part of their job and often they you know they'll have been using some other BI tool or viz tool and when they discover evidence they are just generally so excited to try it because you know it's the first tool that has really taken you know the SQL writing experience and added like a a Version Control layer on

21:12top of that and you know a a way to then you know on top of that like also visualize the data in a way that's really presentable you know you can Version Control you know scripts and stored procedures in your database but but that you can't actually then use them to like deliver data to people so

21:30that that that I think is what really kind of unique um and uh and then the other thing that um the other kind of person that we often encounter is someone who's um you know been using maybe Tableau or something they're not not as um familiar with um with you know Git or working in the command line or working locally and

21:51things like that but they're often pretty excited about just how How Deeply customizable evidence is you you can get you start to feel quite constrained in most bi tools um because you can't change a lot you know you're always going to have like a grid of charts or like you know you're going to have to use sort of specific you know you can

22:09always tell that a Tableau chart is a Tableau chart you know they have that specific look and feel um and you know you have a lot more flexibility with with something like evidence um but for those users it's typically a bit more of a learning curve because they have to you know download Git for the first time

22:25and and uh learn what it means to like you know source control your code um yeah so yeah those I guess is is is my experiences those two kinds of people so there is kind of like uh tradeoff in the learning curve for people a bit less technical that is kind of like worth it and I think a good analogy is dbt uh dbt

22:46was you know is a framework to run a production SQL Pipeline and usually people from business used to do you know copy paste into their UI or have a schedule button and they they didn't know about Git or terminal and dbt can bring you know that back of those software engineering uh Foundation to them um and I can uh I can

23:12totally relate to what you're saying Archie that uh they are excited they they see that it was actually painful that's not the norm of copy pasting a SQL query under under a Google doc with like you know final V2 uh production query um but they used to do that and I think that's that's the fun part is that

23:32they don't know what they don't know right and then when they discover that there is this principle um which actually are just software engineering principle but apply to to BI that's that's really nice uh I want to cut off on one point maybe you can show off that you were talking about flexibility can you do you have a couple of example that

23:53we maybe can can see uh I think you shared with me before for but I want to basically see how much of customization you you can go uh today and maybe maybe you can talk about uh you know what it could be in the future let me actually or you want to go ahead and share but I

24:15have the example page yeah okay that's where I was gonna go as well so you it so yeah um so we

24:24just have a gallery of different uh examples of things you can do on evidence but um maybe let's go into um I think um maybe let's pick like a

24:34pretty vanilla evidence project to begin with so I think there's one called um uh North Star report um I think you

24:43can see that one this is a like a reasonably um standard you know

24:58oh I just lost you for a second maybe it was me yeah and it's okay

25:05yeah so you have um here like a just like a chart up front with some metrics but you know you can also see you can add like Target lines and um configure zones that you want to like draw people's attention to um you can also add stuff like a lot of commentary around you know why we're doing stuff

25:24and uh I think someone mentioned Mermaid before there so this is a Mermaid chart you can uh you can put Mermaid charts in evidence as well uh so you just like invoke a Mermaid component and tell it what you want to look like but but this is really powerful for like explaining to people you know what

25:43why um you know why this report what um

25:47what's it showing me which is often a question people have when they're trying to use your like internal data tools typically yeah like the the if you if you're running like a a data tool in in a business like there's always a slack Channel associated with it which is like constantly people asking questions about your data you know it's like what does

26:05this mean like where does this metric come from uh is this correct like why is this broken it's just like a constant stream of of requests and you can you know as a company as a data team you can write documentation but no one will ever read your documentation is what I find no so to have them actually read it you

26:23have to inline the information into the app um that's a really that's a really great point I think like storytelling I think just dashboarding in general or creating uh you know things uh that is readable in a chart is is already an art right like which kind of colors you're going to go um and you know which kind

26:46of axis dimension you're going to use but the other thing is like that people are completely underestimated even if they have that it's exactly as you said it's like how do you in line definition and information uh because indeed like having separate definition of concept is uh it's pretty hard to maintain and nobody's going to read it uh but having

27:09kind of a story over there um and actually the the best dashboard I ever seen in presentation in my career and I've worked with a lot of BI tools is was notebooks like Jupyter notebooks like ad hoc reporting from a data scientist that say yeah I'm going to work on a specific like campaign or whatsoever and it has a clear story

27:31because there is those inline information right a notebook kind of like word truth is uh vertical right and I feel like yeah dashboard used to be more like horizontal or you know but as much as you can on the orang of do fix um so yeah that's that's a really um great great point do you have another

27:55example you want to to show yeah why don't you um if you hop back to the examples page um there's uh there's one that's a Community member did about um the the like Titanic um survivors and and and people who were on the the Titanic which I think is a pretty interesting one um and this one highlights kind of like some of the

28:18customization that you can end up doing um so this one I think is um loading

28:24some data from um like a data set and

28:29it's just analyzing how um you know who the passengers were and you know whether they survived and you know what what they look like based on that um but you know you can see some interesting customizations this person's done they they've changed the color scheme um they've also removed the sidebar uh they've changed the logo on the top of

28:47the the header so you know like I think I can still pretty much tell this is an Evidence app by looking at it I recognize enough of like the charts Styles and things but it's definitely getting a little bit more custom um and then I've seen some people do some pretty wild stuff as

29:07well I've seen people do um the the fc24

29:11one which is like some kind of football stats website um it's just next to the Titanic one this one's kind of in my mind pretty mad uh like so this one is uh an Evidence project as well but like at this point I would say you know it's kind of just like a web app like data web app and

29:33things so you you know they got some data in it but then they've got you know all these like player stats and this heat map it's pretty pretty amazing um inlining all this information and I think I think you can drill into some of these I don't know exactly but um yeah

29:49pretty cool um like this is not something you could achieve in like your average bi tool so this is kind of like the top end in terms of like customization you end up doing yeah no that's that's um that's great um to to see I think that's um but

30:08aside from the customization I think it's nice that you have um kind of like

30:15opinionated ready components right because I think like as I said um it is pretty hard so to um to just find like what's

30:26the right color to have something you know valuable from from the first from the first go right so I think it's it's also a balance maybe you've you've been thinking internally it's like how much customization we we want to offer it's like no no you don't want customization otherwise it's gonna be a huge mess right finding this right on how much

30:49template you want to give it it's an interesting point I think um the good thing about um having your um defining your like reports in code is that you don't have to show the user all of the options like all of the time um if they don't want to enable them so what what we strive for is like it looks

31:11great out of the box with no customization and then if you want to you can just add customization properties to it so you know you start with a basic you know bar chart and then you're like okay actually I want to change the colors so I'm going to do fill color equals blue and then I want to change this other thing so unlike if

31:28you're if you're doing like a more of a like GUI driven experience like if you the user wants to find like a specific thing to edit they're going to have like a long menu with like a hundred different things potentially and so at that point it makes sense to cut down the number of things you can you can

31:44customize because otherwise you could never present them all to the user um whereas you know in a code-based experience it's it's just additional properties that the user can add themselves so so we basically allow you to change pretty much anything about your charts um because it doesn't really add burden to the person looking at the code or the person writing the

32:04code so we haven't spoken about how DuckDB you know is fitting into the Evidence ecosystem maybe you can tell us about that because obviously it's on the MotherDuck channel a lot of people are familiar with and loving DuckDB so how how are you using DuckDB within Evidence yeah there's um let me share my

32:28uh screen for a second

32:33so so uh let me go to

32:41here so the um Evidence uses DuckDB in kind of two different ways um so the first is Evidence um supports DuckDB as a data source so you know Evidence has a lot of different connectors you can connect pretty much any major um data warehouse to Evidence you know BigQuery Postgres um Amazon uh s etc and DuckDB is also um

33:08an option there so you know if you have a DuckDB file or CSV files or Parquet files

33:14which also can be used by DuckDB you can include them in your project as data sources um so and that that's nice you can use you know DuckDB's really permissive SQL syntax to write queries there and you know it's pretty compressed and you can publish it you know alongside your app so you don't need to you know

33:32if you don't want to you don't want to add your credentials you can do that um but that's kind of not the most interesting way that um Evidence uses DuckDB um the most interesting way is actually inside the the Markdown pages themselves so because um you know if you're

33:50writing SQL normally you have the option you know you could write SQL against your BigQuery database and you could also write SQL against your Postgres database but then you wouldn't be able to like join those two data sets together using SQL because they use different SQL syntaxes and because underneath there's you know two different databases so Evidence actually ships

34:12with a DuckDB engine um in the browser

34:17so we're using DuckDB's WebAssembly uh package um which is a really cool project um and what that does is it means you have a database engine um running inside uh inside your browser you know in Chrome or Arc or whatever you're using um and you can then query run execute new queries on the fly um inside the browser and what

34:41that means is if you've got you know five different data sources you can be joining those data sources in the browser you can be creating interactivity in the browser so you can make little inputs you know dropdowns that users select yeah and when they when they do a filter action the DuckDB in the browser is going to run that filter

34:59action because it's in the browser it's just lightning fast you know you're not waiting to communicate with the database to do that and that that's really quite different to to how most BI tools would work in that situation yeah I think that there is like to be fair BI tools implementing cache mechanisms um but it is like uh you

35:20know proprietary kind of like technology which has some limitations so I think the opportunity there is that you know the caching can be done from any kind of source right um and DuckDB has

35:37DuckDB has just this this open file format that you can also leverage when when you want to work uh to to work locally do you have what's your um what are your thoughts about DuckDB Wasm and where where do you see uh it could fit in Evidence if you have any work worked on on that sorry question asking me yeah yeah

36:05so where where do you see DuckDB Wasm um and maybe like explaining quickly how what what's the difference between um you know D DB1 because you're using the Node module right as I believe so we're doing we're doing we're using two different parts so in when DuckDB is a data source then we're using um just normal uh DuckDB the Node

36:29the Node package um yeah and that and that's going to run it you know in uh on the server when um when someone is like trying to run queries against their DuckDB and it's going to create them like basically a cache that they can query but the um when we're combining sources in the browser we're running the the Wasm

36:48version of DuckDB okay I see so we're actually using using both yeah that I wasn't sure so that's why I was asking okay um so yeah so DuckDB Wasm is so that's the power maybe we can I just to to explain to the audience that so DuckDB can run in process in you know Node.js

37:09or uh Python Golang but it can also run in the browser through WebAssembly so WebAssembly is a technology where you can run you know a low-level code for high performance like C++ or Rust um and it's a it's kind of a container and it's run in your browser so there is no uh external dependency and so that's that

37:32that's also the beauty because then low latency uh and everything is on the client and now I think the next step is uh if you work with MotherDuck um so MotherDuck

37:44is also working on a Wasm SDK right

37:48um and so you can have your data on the server and still on the and cache on the client so that's what a lot of companies are starting to do is basically be able to easily hydrate new data on the client and then avoid any compute on the server because here we are talking about you know development

38:08workflow when you run locally but what happens when you you know um need to basically share your uh uh share your uh

38:18your dashboard have you seen people what kind of part did you see some patterns uh using DuckDB and sharing like do they put it on S3 or did you see other patterns than using MotherDuck to see working on the server side um so I've seen um are you talking

38:38about where people are deploying the websites or where people are hosting their DuckDB I think it's it's an interesting point maybe we should talk first uh on like where where we haven't mentioned really I think quickly so we can mention maybe you can explain like what's the next step after you you know you have your your Evidence dashboard working

38:59locally to deploy it and then where do you host the database yeah yeah so um evidence um is a web app

39:09that builds at the end of the process it builds a static website so what that means is you you know you finished developing and you've um written all your code and then you can run it on your computer you can run a build so you run you know npm run build or if you have you know a deployment service

39:25that's going to be like running this it's going to be running this for you when you merge your code and what that does is it compiles all the markdown that you've written uh and turns it into HTML and JavaScript and CSS and you you know you don't really need to know about that but that's kind of what it's what

39:40it's building it's building this this like vanilla website um and it's a static website which means that it doesn't have a server at all um the you

39:52know you you host it on some kind of server platform you know somewhere where you can host those files um and that's all you need so there's no live connection to the database that you're you know or the databases that you're running with when the app is running it's just um connected to uh the file server where you're hosting these

40:10like HTML JavaScript CSS files yeah so that can run on where where could inrun pra to make like you know Vercel Netlify Fly

40:21or evidence that's what you that's what you're offering right yeah so um so

40:28Evidence has like a a paid product which is Evidence Cloud which is probably the easiest way to host your Evidence website but because it's open source you can self-host Vercel and Netlify are good options for your more sort of like hobby projects but you know if you're deploying inside your business you might host it on AWS or GCP or um you know

40:48some other platform like that um and that allows allows you to kind of mean that the data if if you're running a business you often have like requirements in your data sovereignty you don't want it to ever leave your servers um and at that point uh you know you might you might run it on um you know Azure or something like that um so

41:07it's really flexible you can you can deploy it anywhere um and you know having having that s someone's just asked a question um maybe you can and pop it up which I is a really good question of course um from yes around so you need to rebuild and deploy anytime you want to refresh your dashboard so you um you kind of have two options

41:32here um you whenever you whenever you merge a change to um your code uh then

41:39it will automatically rebuild and redeploy your dashboard so you if you merge it to your main branch of your of your code then it will um but then the the other thing you can do is you can set up a schedule to just refresh your data for you so you know you can set that up to be as regular as you want um

41:56you know many businesses or companies will run it like once a day you know once their you know uh data pipelines are finished running in the morning then they'll update then um but I've seen people doing it you know once an hour um once every five minutes so you can you can get pretty low latency uh like that

42:13and and those those are just um you know running rebuilds and because it's a you know website those builds can be re reasonably fast and which means you can get to pretty low latency like that but yeah you you need to you need to rebuild to update the data in short yeah so that's a that's a good

42:33point we are running um already uh on time so let's let's go dive a bit into the code so um maybe you could go on a live server to show a bit uh the interactivity and what we have been discussed of this you know JavaScript web server running uh head SQL queries

42:54uh and having those components and having the view so I'll add to to yeah

43:01stage here so okay so um I've got this

43:06um Evidence project running on my local computer um and let me kind of you know this is um this project is actually a a dive into San Francisco's um 311 tickets that are filed so this you know you've got the emergency number 911 and then if you just have like a service request um you know maybe like there's some graffiti

43:27or something like that um then you can file with 311 instead um so we're just exploring some of that data um and I've

43:36got here um this is running on my local computer so it's on Local Host and on the this side I've got my you know my markdown file which is turning into evidence so if I make changes to my my markdown file then it will change the um the app as I'm developing it um which is cool um

43:57and you know we talking a little bit about um you know how evidence works so the first thing you have is is SQL queries so um you know I've got this SQL query here maybe let me format it to make it a bit easier to read it's just um selecting uh the first date in the range the last date in the range and the

44:16total number of cases um and that is you

44:20know running SQL against my my data source that I have which is you know a DuckDB in this case um and then once I've got that data I'm then able to use it so um

44:33I've got my query this is called description so I can see it here um and then I can see the data that it's returning from the database and you know if I wanted to change this um you know I wanted to do like uh some other operation here it would then automatically rerun here I'll do one of

44:50those in a little bit and then below that I'm using components so uh here here I have a component um which is this value component and then uh I'm passing it the query called description uh and I'm asking it for the count of the description and then I'm also determining you know what the um what the number format is here so all of

45:14these things are then being used to look and say oh this data includes 100,000 rows and that's the that's the number that I'm pulling here and if I want to make a change maybe I wanted to show explicitly rather than with 100K I've changed the number format there um and yeah because this is um you know Markdown it's not not really whitespace

45:38sensitive um for these components so I can that's why I was just rearranging that just to make it easier to read so you got and just one thing I want to point out is that you do have the SQL query the original SQL query and I think that's that's also something that's missing in standard BI tools

45:57because it's a bit less like drag and drop so there are SQL queries behind the scenes which are happening but you don't see them and so if you want to track kind of a lineage things or saying how this this thing was calculated it is super hard with dedicated uh you know uh BI tools where it's an interface and then there is an

46:18abstraction layer where here it's it's just those SQL queries so you can directly inspect them yeah that's that's a good point so um every query every bit of data that's on the page has to have a query associated with it so you know I'm looking at this trend data obviously I can like download the data if I want or

46:37have a look at data but more interesting is like okay where did this data come from like which which data which table was it from you know which columns what query did I run to get it I'm curious for the for the download download data do you did you did you develop that because people were asking if they can have it

46:57in Excel yeah 100% I mean uh you know you people are always

47:04you know you you should bring bring the data to where people are at and uh you know in businesses it's Google Sheets it's Excel so you've always got to be able to have that button which is like oh I just what if I want to get this into Excel so uh you know you can uh you get it in a CSV open Excel they can uh they

47:22can get going um yeah but let me let me a couple

47:27of other features um of how an Evidence project works so I've just had this main index page open here so this this sort of 60 lines of code or so 80 lines of code um defines this page um I've got a few different things here I've got this Dimension grid I've got which is this thing here um and I've got this calendar

47:46heat map which is displaying the uh the data just like on a daily basis so I can see how many cases there were per day um but uh I've got some other pages in The as well um and I've got some other things that I'd use to you know to create this so let me just show you a

48:01little bit more about how an Evidence project looks so an Evidence project it's it's um it's a node app which means you know a Javascript app that has um uh that can run JavaScript on the server um and you know these these two files are basically just um uh logging the dependencies for the Javascript app and then the

48:24interesting stuff is basically happening in um pages and sources so sources is

48:30where you define any data sources that you want to connect to Evidence so I said you could create like a you know DuckDB or BigQuery or Snowflake each one of these you add as a new source in this sources folder and here I'm using DuckDB so this is a DuckDB file uh and then I have a configuration file which

48:50is telling telling me um you know uh what which file to point at and then I have a SQL query and this is just a DuckDB query and it's saying okay I want to go get all of the data from this SF 311 table um and the fact that it's called

49:08cases um and the source is called sf311

49:12that means that's why when I have my SQL queries I'm running it from sf311.cases so it's it's turning all of these queries that I have in my sources into tables that I can access in my Markdown um but I don't just have this index page you know I've got some other pages um I've got a page for each of the

49:33neighborhoods um in San Francisco as well um and that's defined by this neighborhoods folder and then the index page the neighborhoods um this is pretty interesting so I can you know go look at uh Mission and uh see see what things look like in there or you know Outer Sunset and I can just look at the cases

49:54there and you know what um you know what category of things are happening there um yeah so that's uh go

50:03question yeah we have a we have an interesting question here for many uh BI tools uh embedding

50:12reports into web application can be costly into a SaaS-style application and I think we have kind of a related uh question which is uh this one uh are we

50:24able to start for and then build to in additional features like a standard I mean dashboard so I think what's happening and I've used that in my experience is that you have a SaaS application whatever you have offering let's say you're LinkedIn and you have an analytics page there right and usually they contact the data teams and

50:45the data teams said yeah we have our BI tools you know which are proprietary and so on but and so you have basically the product engineering team that needs to build custom shop charts right to kind of like push back what is used for decision for BI right internally to the to the internal product let's say uh if you're

51:07if you uh if you are LinkedIn and you want to provide analytics um to other users so how do you see maybe you can first ask the the the the the easy question right this one and then I'm curious to to to hear your take on how do you see basically people pushing back you know operational analytics that I

51:30like to to say yeah so the easy question you're saying on embedding how you embed evidence yeah I think the easy question is like on regarding the fees like the one I I I ped like here right now sorry

51:44I need to have a look at streamyard again yeah yeah um so okay so uh the so

51:51if there is if if the cost you know is per VI Okay so okay so there are a couple of different ways to embed Evidence um you can have uh Evidence you can just embed Evidence as an iframe um in your app um so you have um some kind of like authentication layer that you know is

52:15getting people into your app you can um build evidence websites and embed inside that app um and you know you can remove like the sidebar and the the nav bar and the various like different bits and pieces so you just have like the core like chart layout um and that embeds really well in in apps so you know and

52:36in terms of like hosting Arrangements like you can host you can self host that and deploy that on your own infrastructure um you can uh deploy it on evidence cloud like th those kind of things are like you kind of have like your choice there um so um if you want

52:53to embed Evidence dashboards and self-host you're not going to pay pay for anything um on Evidence Cloud there there's like an embedded option as well um and I think that's like quite dependent on like how many customers or like businesses or end users you you want to like have look at stuff um but it it's certainly less expensive than

53:13the like Tableaus and the Power BIs of the world like you can you can be spending like half a million or a million dollars on on embedded solutions with those products it's it's uh they're really expensive for sure yeah um and so what's what's your so you actually answered my other question which was like how do you do people embed it uh

53:36Evidence it's like mostly through iframe right that's what what what you see yeah um and then you know you can also pass if you need to like pass information from your authentication system into the iframe to determine like which report you're selling that that kind of thing so you know I um if I have like you know 10 different customers um

54:02I can read in from my authentication system and then pass that token into the iframe and then it will fetch the right report for that customer and so that that's kind of the like the setup if you're trying to do like a multi multi-tenant deployment of evidence
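The multi-tenant embedding described above can be sketched in a few lines. This is only an illustration, not Evidence's actual API: the `REPORT_PATH_BY_CUSTOMER` mapping, the `iframe_for` helper, the `token` query parameter and the `reports.example.com` host are all hypothetical names, and how the token is validated on the hosting side is not covered in the talk.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical mapping from a customer in your auth system to the report
# page that was built for them. Names and paths are made up for illustration.
REPORT_PATH_BY_CUSTOMER = {
    "acme": "/customers/acme",
    "globex": "/customers/globex",
}


def iframe_for(customer_id: str, token: str,
               base_url: str = "https://reports.example.com") -> str:
    """Return iframe markup pointing at the report for one customer.

    The token comes from the host app's authentication layer and is passed
    along as a query parameter (an assumption, not a documented contract).
    """
    path = REPORT_PATH_BY_CUSTOMER[customer_id]  # KeyError for unknown customers
    src = f"{base_url}{path}?{urlencode({'token': token})}"
    return f'<iframe src="{escape(src, quote=True)}" title="Analytics"></iframe>'
```

The host application would call `iframe_for(current_customer, session_token)` when rendering its analytics page, so each customer only ever receives the URL of their own report.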

54:28I think I've uh lost your audio M maybe maybe you're on mute I'm not sure yeah sorry uh sorry I was top for for a sec um so first off the product looks really great we have another uh question here uh knowing many BI tools embedded reporting can be oh actually it's uh kind of a duplicate sorry it's LinkedIn

54:48being uh LinkedIn so we already answer that um what I want to di for uh closing is another example uh that um I've also

55:01uh worked with Evidence and DuckDB and MotherDuck uh is duckdbstats.com um and so you have uh actually let me share you

55:13directly in the repo um so it is an end to end data engineering project and you can actually uh actually if you go to the stats you have the the source code link uh here uh that's actually a duplicate uh but what is great with that is that um basically it gets PyPI stats data from DuckDB and

55:40you can actually if you have a Python library you can actually clone the project and just change one variable and start uh you know feeding data on another Python library if you want um it's using Evidence and it's using MotherDuck basically to hydrate the data so whenever there is new data you know happening uh to MotherDuck there is a

56:03job as Archie was saying on Evidence Cloud which is running once a day and uh getting refreshed and what is actually nice I found here we were talking about you know the story etc is that in uh in

56:17MotherDuck you have this uh mechanism of sharing you know data sets and whenever you have a DuckDB client and so the DCT the the share that I said is here I actually uh just uh changed it recently so it might

56:33be a failed live demo I didn't plan to do a live demo um but so here is just to show you uh I have you know a local DuckDB uh CLI and I'm going to connect uh to MotherDuck

56:47with just doing attach I have a token to authenticate me to uh to the cloud and now if I show all the databases I have all the databases actually uh here and uh for example I think I have already uh shared it it's already here but I can attach it secondly so no that's the share link right which is on the DuckDB

57:14stats website and now I can query the the data I can start I have all the data raw data which is feeding uh the the report so I think you have um PyPI daily stats and so this is the final table and so you can do you can also run describe uh here um and you see you can

57:41have a look directly at the raw data that someone has shared and I think that's that's also the beauty of it I I wish like if like everything related to

57:52open data sets right that you have a dashboard and you can that much quickly you know query the raw data set be amazing yeah that would be amazing instead of having a Google somewhere and yeah the cool thing about that Med is that anyone could have done that right because obviously you're logged in as you and you made the data set but I

58:15could log in and and and do the same exact this data set is shared yeah this that's a really good point this data set is shared and it's public uh so I invite you to play around uh also to to look at the website we're closing in um I think we had a lot a couple of quite uh nice

58:34question is there anything uh you want to add on the future of Evidence like what's what are you hyped for what's the next next thing that you are looking into in terms of features I think um I'm really excited about kind of quite a lot of things but um the uh thing that we're really interested in is like how you can build

58:57like really amazing interactive experiences so one of the things that kind of can be really cool um when you're building like data apps is having like because you can explore the data the end user can explore the data really on their own building more capability around you know drilling into different things you click this part of a chart

59:17and then you pull up this data that expands into it all that kind of stuff is is stuff we're working on um right now and is is kind of really interesting to me I think um it just makes it so much more of a um you know so much more of an intuitive experience to try and understand what's going on in your

59:34organization or business you know it's what it's how people are used to experiencing um you know apps now it's just like you know with instant client interactivity so so I think that that'll be that'll be really great and we've got a lot of releases coming out with that I think in the next couple of months cool uh on that you are right in

59:53time for a stand up for people I guess in the west coast maybe uh but uh yeah that that was amazing thank you so much Archie for uh coming in and I think we'll see you yeah we'll see you around um probably maybe uh later let's let's let's see each other in in six months like where where do you stand and

60:16where MotherDuck stands because I think it's also nice right DuckDB is evolving a lot MotherDuck is evolving a lot so I think it also gives opportunity to different products in the ecosystem like like you uh folks uh to see how how things are being evolved um Quack & Code is every month uh the live will be on YouTube

60:38afterwards uh you can check the MotherDuck uh TV channel uh on YouTube and yeah have a

60:45great day a great stand up wherever you are or a great evening all right

FAQs

What is Evidence and how does it use SQL and Markdown for data visualization?

Evidence is an open-source tool for building data apps using Markdown and SQL. You connect your database, write SQL queries, and embed results into Markdown pages using built-in chart components (bar charts, line charts, maps, tables, etc.). The output is a website or web app that can be deployed as a static site. Evidence provides opinionated, good-looking defaults out of the box while allowing deep customization when needed. Learn more about the Evidence integration.

How does Evidence use DuckDB in the browser for interactive dashboards?

Evidence uses DuckDB in two ways: as a data source connector (reading DuckDB files, CSVs, or Parquet files) and as an in-browser engine via DuckDB's WebAssembly package. The in-browser DuckDB engine lets users run filter actions, dropdowns, and interactive queries with very fast response times since no server round-trip is required. This also allows joining data from multiple different data sources directly in the browser. For more on this technology, see our guide on DuckDB WASM for web data exploration.
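As a rough illustration of the idea, the sketch below loads extracts from two different sources into one local in-memory SQL engine, joins them, and re-runs the query with a filter the way a dropdown would. It is a stand-in only: it uses Python's built-in `sqlite3` rather than DuckDB's WebAssembly build, and the tables, rows and the `revenue_by_region` helper are invented for the example.

```python
import sqlite3

# Made-up extracts standing in for cached results from two different sources
# (for example a warehouse and an operational database).
warehouse_orders = [(1, "acme", 120.0), (2, "acme", 80.0), (3, "globex", 200.0)]
app_db_customers = [("acme", "EMEA"), ("globex", "AMER")]


def build_local_engine() -> sqlite3.Connection:
    """Load both extracts into one in-memory engine so they can be joined."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    conn.execute("CREATE TABLE customers (customer TEXT, region TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", warehouse_orders)
    conn.executemany("INSERT INTO customers VALUES (?, ?)", app_db_customers)
    return conn


def revenue_by_region(conn: sqlite3.Connection, region=None):
    """Join the two sources; `region` plays the role of a dropdown selection."""
    sql = """
        SELECT c.region, SUM(o.amount) AS revenue
        FROM orders AS o
        JOIN customers AS c ON c.customer = o.customer
        WHERE (? IS NULL OR c.region = ?)
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(sql, (region, region)).fetchall()


conn = build_local_engine()
all_regions = revenue_by_region(conn)        # no filter selected
emea_only = revenue_by_region(conn, "EMEA")  # user picked "EMEA"
```

Because both tables live in the same local engine, changing the filter only re-runs a local query; nothing is sent back to either original database.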

What are the advantages of BI-as-code tools like Evidence over traditional dashboarding?

BI-as-code tools let you define dashboards in code (SQL + Markdown), which enables version control with Git, CI/CD workflows, testing on QA branches before deploying to production, and easy bulk updates across dashboards using find-and-replace. Traditional drag-and-drop BI tools make creation fast but maintenance painful. Updating 50 dashboards one by one is error-prone and tedious. Code-based dashboards also support inline commentary and documentation, which helps stakeholders understand the data without consulting a separate wiki.

How do you deploy and refresh an Evidence dashboard?

Evidence builds a static website from your Markdown and SQL code. You can host it on platforms like Vercel, Netlify, AWS, GCP, Azure, or Evidence Cloud. Since there is no live database connection in the deployed app, you refresh data by rebuilding the site, either on a schedule (hourly, daily, etc.) or whenever you merge code changes to your main branch. The rebuild process re-runs all SQL queries, caches results, and produces updated HTML, JavaScript, and CSS files.
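A minimal sketch of why a rebuild is needed, under stated assumptions: the query runs once at build time and its rows are baked into the generated HTML, so later changes in the source do not appear until the next build. This is not Evidence's build code; `build_static_page` is a hypothetical helper, `sqlite3` stands in for the real data source, and the rows are made up.

```python
import sqlite3
from html import escape


def build_static_page(conn: sqlite3.Connection, title: str, sql: str) -> str:
    """Run `sql` once at build time and bake the rows into an HTML string."""
    cursor = conn.execute(sql)
    headers = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    head = "".join(f"<th>{escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<h1>{escape(title)}</h1><table><tr>{head}</tr>{body}</table>"


# Stand-in data source with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (day TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [("2024-09-01", 12), ("2024-09-02", 15)],
)
query = "SELECT day, n FROM cases ORDER BY day"

page_v1 = build_static_page(conn, "Daily cases", query)   # first build

# New data lands in the source after the build...
conn.execute("INSERT INTO cases VALUES ('2024-09-03', 9)")

# ...and only shows up once the site is rebuilt.
page_v2 = build_static_page(conn, "Daily cases", query)   # scheduled rebuild
```

Here `page_v1` never changes after it is produced, which is the property that lets the deployed site run from a plain file host; the scheduled or merge-triggered rebuild is what produces `page_v2`.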
