Small Data SF Returns November 4-5, 2025: First Speakers Announced

2025/07/17 - 7 min read

The Small Data movement is gaining momentum, and we're thrilled to announce that Small Data SF is returning to San Francisco on November 4-5, 2025! After an incredible inaugural event that brought together over 260 attendees, we're back with another two days of workshops and talks that challenge the "bigger is always better" mentality in data and AI.

What Makes Small Data Different?

Small Data isn't just about data that fits on a single machine—it's a philosophy that embraces:

Efficiency in making big data feel small: Using smart techniques to process massive datasets as if they were manageable
Processing data in smaller pieces before it gets too big: Preventing data sprawl through intelligent preprocessing and aggregation
Local-first development: Building on your laptop and shipping to production with the same tools
Simplicity over unnecessary scale: Choosing the right tool for the actual problem, not the hypothetical one
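
To make "processing data in smaller pieces before it gets too big" concrete, here is a minimal sketch in plain Python, not taken from any talk or speaker. It reads rows a few at a time and keeps only a running total per group, so memory use depends on the number of groups rather than the number of rows. The column names (region, amount), the sample data, and the helper names are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict


def fold_chunk(chunk, totals):
    """Add one chunk of rows into the running per-region totals."""
    for row in chunk:
        totals[row["region"]] += float(row["amount"])


def aggregate_in_chunks(rows, chunk_size=2):
    """Sum `amount` per `region`, holding at most `chunk_size` rows in memory."""
    totals = defaultdict(float)
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            fold_chunk(chunk, totals)
            chunk = []
    if chunk:  # fold the final, partially filled chunk
        fold_chunk(chunk, totals)
    return dict(totals)


# Hypothetical sample data standing in for a large file on disk.
SAMPLE_CSV = """region,amount
west,10.0
east,5.5
west,2.5
east,4.5
north,1.0
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
totals = aggregate_in_chunks(reader, chunk_size=2)
print(totals)  # {'west': 12.5, 'east': 10.0, 'north': 1.0}
```

In practice an embedded analytical engine would do this kind of aggregation for you; the sketch only shows the idea of reducing data as it streams past instead of collecting it all first.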

As attendees from last year told us, this approach resonates deeply with practitioners who are tired of over-engineered solutions:

"There is tremendous power and value in working with smaller datasets. I wish more people attended this conference and realized this!"

What about Small AI?

This same philosophy around Small Data also applies to Small AI, as Jeffrey Morgan (co-creator of Ollama) shared with us last year:

Speed and Performance - Small models run significantly faster than large models because they have fewer parameters (compute per generated token grows roughly in proportion to parameter count), and when deployed locally they also avoid network latency entirely
Deployment Options - Small models offer flexibility in deployment, whether local, cloud, or hybrid, without being locked into specific cloud providers or infrastructure requirements
Practical Applications - Small models excel when combined with existing data through techniques like Retrieval Augmented Generation (RAG) and tool calling, making them ideal for internal tooling, help desk automation, and developer productivity rather than general knowledge tasks

Small models aren't just "worse versions" of large models - they're optimized for different use cases where speed, efficiency, and deployment flexibility matter more than having vast amounts of factual knowledge.

Meet Our First Flock of 2025 Speakers

"I attend a lot of conferences, but Small Data SF was on another level. The lineup was unbeatable, the content was razor-sharp, and the people were next-level inspiring."

We're excited to announce our initial lineup of speakers who are shaping the future of efficient data processing and AI. Interested in joining the lineup? Reach out with your idea to speakers@smalldatasf.com.

Adi Polak

Author of O'Reilly's "Scaling Machine Learning with Spark"

Adi brings deep expertise in distributed systems and machine learning, having literally written the book on scaling ML workloads. Her insights on when to scale, and when not to, make her a perfect voice for the Small Data movement. She is currently the Director of Advocacy and Developer Experience Engineering at Confluent.

Benn Stancil

Former Co-founder and Chief Analytics Officer at Mode

A thought leader in modern data practices, Benn has been vocal about rethinking how we approach data infrastructure and analytics. His perspective on building practical data solutions over theoretical ones aligns perfectly with Small Data principles.

Benn is returning this year after giving an energetic and enlightening talk at Small Data SF 2024.

George Fraser

CEO, Fivetran

Leading one of the most successful data integration companies, George has unique insights into how data actually moves through organizations. His analysis showing that 99.5% of queries could run on a laptop was a highlight of last year's conference.

Holden Karau

Apache Spark PMC; author of multiple O'Reilly books on Apache Spark + ML

As someone deeply involved in one of the most popular big data frameworks, Holden brings a nuanced perspective on when distributed computing makes sense and when it's overkill. She is co-founder of Fight Health Insurance, which aims to make it easier to appeal health insurance denials.

Joe Reis

Author of O'Reilly's "Fundamentals of Data Engineering"

Joe's pragmatic approach to data engineering has influenced thousands of practitioners. His book emphasizes building sustainable, appropriate solutions rather than chasing the latest trends.

Jordan Tigani

Co-founder & CEO, MotherDuck

As one of the original creators of BigQuery at Google, Jordan has a unique perspective on the evolution from "big data at all costs" to "right-sized data for the problem." His keynote last year on why "Big Data is Dead" set the tone for the entire movement.

Ravin Kumar

Senior Researcher, Google DeepMind

Ravin brings a cutting-edge AI research perspective, with work that demonstrates how smaller, more efficient models can often outperform their larger counterparts, a key theme in the Small Data philosophy.

Sam Alexander

Principal AI Engineer & Product Developer, Extended Play

Sam's experience building practical AI applications highlights the importance of choosing the right scale for real-world problems, not theoretical benchmarks. Sam has deep vertical expertise in music, media & mental health and serves as a fractional CTO focused on tangible solutions for real customer problems.

Why Small Data SF Matters Now More Than Ever

Last year's conference validated what many practitioners have been feeling: the reflexive reach for distributed systems and massive scale often creates more problems than it solves. As one attendee noted:

"Just got back from Small Data SF. It's fascinating how we're seeing this shift from 'big' to 'small' — not in terms of scale but in terms of focus and efficiency."

The feedback was overwhelming:

"Small Data SF was such an incredible experience. I enjoyed meeting and learning from folks who are so excited to build something new and different for this new era of data analytics and warehousing." - Koosha Totonchi, MetricForge Analytics Inc.

"I came out of Small Data SF buzzing with ideas on how the data stack may evolve into the future. Thank you for organizing a flawless event and gathering this fantastic community together. Hope to see this continue in 2025." - Gilad Lotan, Buzzfeed

Join Us for Two Days of Practical Innovation

Day 1 (November 4): Hands-on workshops where you'll learn practical techniques for efficient data processing, local-first development, and building AI applications that don't require a cluster to run.

Day 2 (November 5): A full day of talks from industry leaders who are redefining what's possible when you optimize for simplicity, speed, and developer experience rather than theoretical scale.

Thanks to our Sponsors

Our friends at bem, Estuary and Omni have signed on as Gold sponsors to help support the event. If you want to join them and support the Small Data and AI movement, please reach out to sponsors@smalldatasf.com.

Last year's sponsors included Turso, Ollama, Evidence, Omni, dltHub, Cloudflare, Tigris, Outerbase, Posit and Essence. We're grateful for their support of the inaugural event, and we expect many will return this year!

Register Now for Early Bird Pricing

Early bird tickets are just $295 for both days—a fraction of the cost of typical data conferences, in keeping with our efficiency-first philosophy. This special pricing is only available until August 4th, so register now to secure your spot.

Join us in San Francisco this November to be part of a movement that's making data work smaller, faster, and smarter. Because in 2025, the question isn't "How big is your data?" but "How efficiently can you process it?"

As one attendee perfectly summarized:

"Small Data SF sets a new standard for data conferences!" - Celina Wong, Data Culture

We can't wait to see you there.

Reserve your spot today →
