We Built Our Employees a Wrapped—Using SQL and MotherDuck
2025/12/29 - 6 min read
BYSpotify Wrapped gets 200 million people to voluntarily share their listening data every December. It's basically an end of year performance deck, except people actually want to look at it. One stat at a time, with a ranking at the end.
It's also, technically, not that complicated:
To ring in the new year, I decided to build a Wrapped for our team at MotherDuck.
I'm Hannah, a Customer Engineer at MotherDuck. We had the data, so I built it. Once the draft circulated, people had opinions—mostly about why their metric should rank higher.
Below: the metrics we tracked, the SQL behind them, and the duck personas we assigned. About an hour of work total.
The leaderboard
I ran 1.15 million queries this year—3x more than second place. Before you're too impressed: a lot of those were probably me re-running the same broken query until it worked. I also shared 54 databases with the team, more than anyone else. Did I include that metric because I knew I'd just shared 20 snapshots of the same database? Absolutely.
Elena (Ecosystems Engineering) had the longest streak: 182 consecutive days of query activity. Six months, no gaps. That's either impressive dedication or a sign we should check on her.
Alex (DevRel) created 2,176 databases. On average, that’s six databases a day.
Gaby (Support Engineer) processed 217 terabytes—more than double anyone else. Probably debugging someone else's problems.
Leonardo (Customer Engineer) logged 211 active days while running 681K queries. 👀
How I built it
I pointed our MotherDuck MCP at the internal data warehouse and quickly oriented to the data we had available in our MotherDuck account.
The data was already there—tables tracking queries, databases, and shares. The work was:
- Union two data sources (one covered Jan–Apr, one covered May–Dec)
- Calculate percentiles, streaks, and totals per user
- Filter out test accounts and service accounts
- Assign archetypes based on thresholds
- Materialize it into a table
Filtering out the noise
The trickiest part was filtering. First pass had an internal service account at #1 with 115 million queries. Impressive streak, but not an employee success story. With the MCP, I could iterate quickly—run a query, see who surfaced, add another exclusion, repeat until the list was actually employees.
Copy code
internal_users AS (
SELECT id::VARCHAR as user_id
FROM current_users
WHERE is_motherduck = true
AND is_motherduck_test = false
AND (is_service_account = false OR is_service_account IS NULL)
AND email NOT LIKE '%+%' -- Exclude test aliases
AND email NOT LIKE '%\_sa@%' ESCAPE '\' -- Exclude service accounts
)
NOTE: Service accounts, test aliases, and automated processes can easily dominate your metrics. Always filter these out before running any "top users" analysis.
Total queries
We pull this from our aggregated daily stats tables. One row per user with everything we need: total queries, terabytes processed, hours of execution time, and active days. The MIN dates give us their first activity—useful for identifying new members of the team.
Copy code
user_totals AS (
SELECT
q.user_id,
SUM(q.queries) as total_queries,
ROUND(SUM(q.total_io_bytes_gb) / 1000, 2) as tb_processed,
ROUND(SUM(q.execution_time_seconds) / 3600, 1) as total_hours,
COUNT(DISTINCT q.dt) as active_days,
MIN(q.dt) as first_active_date,
MAX(q.dt) as last_active_date
FROM all_query_stats q
INNER JOIN internal_users r ON q.user_id = r.user_id
GROUP BY ALL
)
The percentile
This is what gets screenshotted:
Copy code
percentiles AS (
SELECT
user_id,
total_queries,
ROUND(100 - (PERCENT_RANK() OVER (ORDER BY total_queries) * 100), 1) as top_percentile
FROM user_totals
)
PERCENT_RANK() returns a value between 0 and 1. We flip it with 100 - so that lower numbers mean higher rank. Top 1% means you're ahead of 99% of people.
The streak
Longest consecutive days of activity.
TIP: Consecutive dates, when you subtract a row number, map to the same value. Group by that, count the rows, and you've got streak lengths. This pattern works for any "longest consecutive X" problem.
Copy code
streaks AS (
WITH daily_activity AS (
SELECT DISTINCT
q.user_id,
q.dt as activity_date
FROM all_query_stats q
INNER JOIN internal_users r ON q.user_id = r.user_id
WHERE q.queries > 0
),
streak_groups AS (
SELECT
user_id,
activity_date,
activity_date - (ROW_NUMBER() OVER (
PARTITION BY user_id ORDER BY activity_date
))::INT as streak_group
FROM daily_activity
),
streak_lengths AS (
SELECT
user_id,
streak_group,
COUNT(*) as streak_length
FROM streak_groups
GROUP BY user_id, streak_group
)
SELECT
user_id,
MAX(streak_length) as longest_streak
FROM streak_lengths
GROUP BY ALL
)
The builder stats
databases_created shows who's spinning up new projects. databases_shared shows who's collaborating. We also split out DuckLake databases specifically. I was curious to learn which employees have already adopted DuckLake into their workflows.
Copy code
databases_created AS (
SELECT
d.owner_id as user_id,
COUNT(*) as databases_created,
COUNT(*) FILTER (WHERE d.md_database_type = 'ducklake') as ducklake_dbs_created
FROM current_databases d
INNER JOIN internal_users r ON d.owner_id = r.user_id
WHERE d.created_ts >= '2025-01-01' AND d.created_ts < '2026-01-01'
AND d.owner_type = 'user'
GROUP BY ALL
),
databases_shared AS (
SELECT
s.owner_id::VARCHAR as user_id,
COUNT(*) as databases_shared
FROM current_shares s
INNER JOIN internal_users r ON s.owner_id::VARCHAR = r.user_id
WHERE s.created_ts >= '2025-01-01' AND s.created_ts < '2026-01-01'
GROUP BY ALL
)
The archetypes
Spotify has "Audio Auras." We have duck personas. The order matters. CASE statements evaluate top to bottom, so Elite Duck (top 1%) takes priority over Streak Master. Someone could qualify for multiple archetypes— ordering helps assign the most impressive one first.
Copy code
CASE
WHEN p.top_percentile <= 1 THEN 'Elite Duck 🏆'
WHEN p.top_percentile <= 5 THEN 'Power User ⚡'
WHEN p.top_percentile <= 10 THEN 'Super Quacker 🦆'
WHEN COALESCE(st.longest_streak, 0) > 30 THEN 'Streak Master 🔥'
WHEN t.active_days > 200 THEN 'Steady Builder 🏗️'
WHEN COALESCE(ds.databases_shared, 0) >= 3 THEN 'Sharing Champion 🤝'
WHEN COALESCE(dc.databases_created, 0) > 50 THEN 'Database Architect 🏛️'
WHEN t.tb_processed > 100 THEN 'Data Cruncher 📊'
ELSE 'Rising Duck 🌟'
END as archetype
Elena is a Streak Master. Alex is a Database Architect. And yes, I gave myself Elite Duck. My family was thrilled to hear about the new title.
The final query
The full query joins the CTEs and writes to a table. Materializing means we can share it directly without giving people access to the raw data—and without re-running the aggregation every time someone wants to see their stats. For an internal project, this doesn't matter much. If we were serving a Wrapped to customers, it would.
Copy code
CREATE OR REPLACE TABLE wrapped_2025_internal AS
WITH internal_users AS (...),
all_query_stats AS (...),
user_totals AS (...),
percentiles AS (...),
databases_created AS (...),
databases_shared AS (...),
streaks AS (...)
SELECT
t.user_id,
t.total_queries,
t.tb_processed,
t.total_hours,
t.active_days,
t.first_active_date,
t.last_active_date,
p.top_percentile,
COALESCE(dc.databases_created, 0) as databases_created,
COALESCE(dc.ducklake_dbs_created, 0) as ducklake_dbs_created,
COALESCE(ds.databases_shared, 0) as databases_shared,
COALESCE(st.longest_streak, 0) as longest_streak,
-- archetype CASE statement (shown above)
FROM user_totals t
JOIN percentiles p ON t.user_id = p.user_id
LEFT JOIN databases_created dc ON t.user_id = dc.user_id
LEFT JOIN databases_shared ds ON t.user_id = ds.user_id
LEFT JOIN streaks st ON t.user_id = st.user_id
Building your own "Wrapped"
A few things that helped:
- Start with what you can actually query. We had daily stats tables. Work with what exists before building new instrumentation.
- Filter aggressively. Our first pass had a service account at #1 with 115 million queries. Impressive, but not the user type we were targeting.
- Make it personal. People care about their own numbers. A company-wide total is interesting; "you're in the top 5%" is shareable.
End of year is a good time to count things. Might as well make it fun. Happy holidays from the MotherDuck team. May your queries be fast and your streaks unbroken.
Start using MotherDuck now!



