column
Back to DuckDB Data Engineering Glossary
Definition
A column represents a single field or attribute in a database table or DataFrame that contains values of the same data type. Think of it like a vertical slice in a spreadsheet - every value in that slice represents the same kind of information, like names, dates, or numbers. In a customer database, for example, columns might include first_name, email, and signup_date.
Working with Columns in DuckDB
DuckDB provides powerful ways to work with columns through SQL. You can select specific columns:
SELECT first_name, email FROM customers
Or use wildcards with exclusions:
SELECT * EXCLUDE (password, api_key) FROM users
DuckDB also offers unique column operations not found in most databases, like selecting columns by pattern:
SELECT COLUMNS('order_*') FROM sales
Or applying functions across multiple columns:
SELECT MIN(COLUMNS(*)) FROM metrics
Data Types
Each column must have a consistent data type - for example, a date column can't contain text strings. DuckDB supports standard SQL types like INTEGER, VARCHAR, TIMESTAMP but also modern types like JSON, MAP, and STRUCT for more complex data structures.
Best Practices
Column names should be descriptive and follow a consistent naming convention. While DuckDB is case-insensitive for column names, it's good practice to use lowercase with underscores (snake_case) for readability. Column names should avoid spaces or special characters, though DuckDB allows you to use them if you wrap the name in double quotes.

