New: The AI Analytics Eval Field GuideGet the Free Playbook

Skip to main content

Hacker News

Explore the data

Interactive dashboard built on the Hacker News sample dataset. Use it as a starting point for your own Dives.

Loading live Dive preview...

About the dataset

Hacker News is a social news website focusing on computer science and entrepreneurship. It is run by Y Combinator, a startup accelerator, and it's known for its minimalist interface. Users can post stories (such as links to articles), comment on them, and vote them up or down, affecting their visibility.

There are two ways to access the dataset:

  • Through the sample_data database, which contains a sample of the data (from January 2022 to November 2022). This database is automatically attached to every MotherDuck account.
  • Through the hacker_news database, which contains the full dataset (from 2016 to 2025).

To attach the full hacker_news database, you can use the following command:

aws-us-east-1 region only

The hacker_news database is only available for accounts in the aws-us-east-1 region.

ATTACH 'md:_share/hacker_news/de11a0e3-9d68-48d2-ac44-40e07a1d496b' AS hacker_news;

To read from the sample_data database, please refer to attach the sample datasets database

Example queries

Most shared websites

This query returns the top domains being shared on Hacker News.

Example query
SELECT
  regexp_extract(url, 'http[s]?://([^/]+)/', 1) AS domain,
  count(*) AS count
FROM
  sample_data.hn.hacker_news
WHERE
  url IS NOT NULL
  AND regexp_extract(url, 'http[s]?://([^/]+)/', 1) != ''
GROUP BY
  domain
ORDER BY
  count DESC
LIMIT
  20;
SQL Editor loading...
Login to connect

Most commented stories each month

This query calculates the total number of comments for each story and identifies the most commented story of each month.

Example query
WITH
  ranked_stories AS (
    SELECT
      title,
      'https://news.ycombinator.com/item?id=' || id AS hn_url,
      descendants AS nb_comments,
      YEAR(timestamp) AS year,
      MONTH(timestamp) AS month,
      ROW_NUMBER() OVER (
        PARTITION BY
          YEAR(timestamp),
          MONTH(timestamp)
        ORDER BY
          descendants DESC
      ) AS rn
    FROM
      sample_data.hn.hacker_news
    WHERE
      type = 'story'
  )
SELECT
  year,
  month,
  title,
  hn_url,
  nb_comments
FROM
  ranked_stories
WHERE
  rn = 1
ORDER BY
  year,
  month;
SQL Editor loading...
Login to connect

Most monthly voted stories

This query determines the most voted story for each month.

Example query
WITH
  ranked_stories AS (
    SELECT
      title,
      'https://news.ycombinator.com/item?id=' || id AS hn_url,
      score,
      YEAR(timestamp) AS year,
      MONTH(timestamp) AS month,
      ROW_NUMBER() OVER (
        PARTITION BY
          YEAR(timestamp),
          MONTH(timestamp)
        ORDER BY
          score DESC
      ) AS rn
    FROM
      sample_data.hn.hacker_news
    WHERE
      type = 'story'
  )
SELECT
  year,
  month,
  title,
  hn_url,
  score
FROM
  ranked_stories
WHERE
  rn = 1
ORDER BY
  year,
  month;
SQL Editor loading...
Login to connect

Keyword analysis

This query counts the monthly mentions a the keyword (here duckdb) in the title or text of Hacker News posts, organized by year and month.

Example query
SELECT
  YEAR(timestamp) AS year,
  MONTH(timestamp) AS month,
  COUNT(*) AS keyword_mentions
FROM
  sample_data.hn.hacker_news
WHERE
  (
    title LIKE '%duckdb%'
    OR text LIKE '%duckdb%'
  )
GROUP BY
  year,
  month
ORDER BY
  year ASC,
  month ASC;
SQL Editor loading...
Login to connect

Schema

column_namecolumn_typenullkeydefaultextra
titleVARCHARYES
urlVARCHARYES
textVARCHARYES
deadBOOLEANYES
byVARCHARYES
scoreBIGINTYES
timeBIGINTYES
timestampTIMESTAMPYES
typeVARCHARYES
idBIGINTYES
parentBIGINTYES
descendantsBIGINTYES
rankingBIGINTYES
deletedBOOLEANYES