SQL query
Back to DuckDB Data Engineering Glossary
Definition
A SQL query is a structured request written in Structured Query Language (SQL) that allows you to retrieve, analyze, or manipulate data stored in a database. The most common type of query uses the SELECT
statement to read data, though queries can also modify data using statements like INSERT
, UPDATE
, and DELETE
.
Basic Structure
A typical SQL query starts with SELECT
followed by the columns you want to retrieve, then FROM
to specify the source table(s), and optionally WHERE
to filter results. Additional clauses like GROUP BY
, HAVING
, and ORDER BY
help shape and organize the output.
DuckDB Examples
Here's a simple query that selects all columns from a table:
Copy code
SELECT * FROM employees;
A more complex query showing filtering, grouping and ordering:
Copy code
SELECT department,
COUNT(*) as employee_count,
AVG(salary) as avg_salary
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY avg_salary DESC;
DuckDB extends standard SQL syntax by allowing you to omit the SELECT
keyword when using *
, making this valid:
Copy code
FROM employees;
You can also use DuckDB's unique column selection syntax to dynamically select columns:
Copy code
SELECT COLUMNS('salary_.*')
FROM employees;
Common Use Cases
Data analysts and engineers use SQL queries to perform tasks like generating reports, analyzing trends, cleaning data, and integrating different data sources. Queries can range from simple one-line statements to complex operations involving multiple joins, subqueries, and window functions.
DuckDB-Specific Features
DuckDB optimizes queries for analytical workloads and supports querying data directly from files like CSV and Parquet without first loading them into tables. It also includes extensions to standard SQL that make common analytical tasks more convenient, such as simplified grouping syntax and enhanced column selection capabilities.