column
Back to DuckDB Data Engineering Glossary
Definition
A column represents a single field or attribute in a database table or DataFrame that contains values of the same data type. Think of it like a vertical slice in a spreadsheet - every value in that slice represents the same kind of information, like names, dates, or numbers. In a customer database, for example, columns might include first_name
, email
, and signup_date
.
Working with Columns in DuckDB
DuckDB provides powerful ways to work with columns through SQL. You can select specific columns:
SELECT first_name, email FROM customers
Or use wildcards with exclusions:
SELECT * EXCLUDE (password, api_key) FROM users
DuckDB also offers unique column operations not found in most databases, like selecting columns by pattern:
SELECT COLUMNS('order_*') FROM sales
Or applying functions across multiple columns:
SELECT MIN(COLUMNS(*)) FROM metrics
Data Types
Each column must have a consistent data type - for example, a date column can't contain text strings. DuckDB supports standard SQL types like INTEGER
, VARCHAR
, TIMESTAMP
but also modern types like JSON
, MAP
, and STRUCT
for more complex data structures.
Best Practices
Column names should be descriptive and follow a consistent naming convention. While DuckDB is case-insensitive for column names, it's good practice to use lowercase with underscores (snake_case) for readability. Column names should avoid spaces or special characters, though DuckDB allows you to use them if you wrap the name in double quotes.