StackOverflow Survey Data
Explore the data
Interactive dashboard built on the survey data. Use it as a starting point for your own Dives.
About the dataset
Each year, Stack Overflow conducts a survey of developers to understand the trends in the developer community. The survey covers a wide range of topics, including programming languages, frameworks, databases, and platforms, as well as developer demographics, education, and career satisfaction.
Since 2017, Stack Overflow has provided a consistent schema and data format for the survey data, which makes it a good dataset for analyzing trends in the developer community over the years.
The source data are a series of CSV files that have been merged into a single schema with two tables for easy querying.
How to query the dataset
This dataset is available as part of the sample_data database, which is automatically attached to every MotherDuck account.
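Before writing queries, it can help to inspect the two tables. The following is a minimal sketch that assumes an active MotherDuck connection (for example from the DuckDB CLI or the MotherDuck UI). It uses only the database, schema, and table names given on this page.

```sql
-- Show the column names and types of each table in the dataset.
DESCRIBE sample_data.stackoverflow_survey.survey_results;
DESCRIBE sample_data.stackoverflow_survey.survey_schema;
```

Because survey_results has one column per survey question, the first statement returns a long list. The survey_schema table can be used to look up what each column means.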
Example queries
List the most popular programming languages in 2024
SELECT
LANGUAGE,
COUNT(*) AS count
FROM
(
SELECT
UNNEST (STRING_SPLIT (LanguageHaveWorkedWith, ';')) AS LANGUAGE
FROM
sample_data.stackoverflow_survey.survey_results
WHERE
YEAR = '2024'
) AS languages
GROUP BY
LANGUAGE
ORDER BY
count DESC;
Top 10 countries with the most respondents in 2024
SELECT Country, COUNT(*) AS Respondents FROM sample_data.stackoverflow_survey.survey_results WHERE YEAR = '2024' GROUP BY Country ORDER BY Respondents DESC LIMIT 10;
Average job satisfaction by remote work arrangement in 2024
SELECT
RemoteWork,
AVG(CAST(JobSat AS DOUBLE)) AS AvgJobSatisfaction,
COUNT(*) AS RespondentCount
FROM
sample_data.stackoverflow_survey.survey_results
WHERE
-- Exclude 'NA' and the text labels so the remaining JobSat values can be cast to DOUBLE
JobSat NOT IN (
'NA',
'Slightly satisfied',
'Neither satisfied nor dissatisfied',
'Very dissatisfied',
'Very satisfied',
'Slightly dissatisfied'
)
AND RemoteWork NOT IN ('NA')
AND YEAR = '2024'
GROUP BY ALL
Schema
stackoverflow_survey.survey_results
This table contains all the survey results from 2017 to 2024. Each column represents a question from the survey. As questions change from year to year, the columns may vary a bit and the table is quite large.
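To get a feel for how the rows are spread across survey years, a simple aggregate is a reasonable starting point. This sketch assumes that each row corresponds to one respondent, as the example queries above do, and it reuses the YEAR column from those queries.

```sql
-- Count rows per survey year.
SELECT
    year,
    COUNT(*) AS respondents
FROM
    sample_data.stackoverflow_survey.survey_results
GROUP BY
    year
ORDER BY
    year;
```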
stackoverflow_survey.survey_schema
This table contains the schema of the survey results. qname is the name of the question, which is also the column name in the survey_results table. question is the full question text.
| Column Name | Column Type |
|---|---|
| qname | VARCHAR |
| question | VARCHAR |
| qid | VARCHAR |
| force_resp | VARCHAR |
| type | VARCHAR |
| selector | VARCHAR |
| year | VARCHAR |
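As an illustration of how the two tables relate, the sketch below looks up the full question text for two columns used in the example queries. It assumes that qname values such as 'RemoteWork' and 'JobSat' are present for 2024, which follows from qname matching the column names in survey_results but is not verified here. Note that year is compared as a string because the column is a VARCHAR.

```sql
-- Look up the question text behind two survey_results columns for 2024.
SELECT
    qname,
    question
FROM
    sample_data.stackoverflow_survey.survey_schema
WHERE
    year = '2024'
    AND qname IN ('RemoteWork', 'JobSat');
```

Dropping the qname filter returns every question recorded for that year, which is a convenient way to find columns worth querying.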