*New* The MotherDuck Native Integration is Live on Vercel Marketplace for Embedded Analytics and Data AppsLearn more

schema

Back to DuckDB Data Engineering Glossary

Definition

A schema is a logical container or blueprint that defines how data is organized within a database. It specifies the structure of tables, the data types of columns, and the relationships between different tables. Think of a schema like the architectural plans for a house - it shows where everything goes and how the pieces fit together, but doesn't contain the actual data (furniture) itself.

Usage in DuckDB

In DuckDB, schemas provide a way to organize database objects like tables and views into namespaces. The default schema is called main. Unlike some other databases that require explicit schema creation, DuckDB automatically creates the main schema and allows you to reference objects within it directly.

You can create additional schemas using:

Copy code

CREATE SCHEMA marketing;

And reference objects within a schema using dot notation:

Copy code

CREATE TABLE marketing.campaigns ( id INTEGER, name VARCHAR );

Key Differences

While enterprise databases like PostgreSQL use schemas extensively for access control and multi-tenant applications, DuckDB's implementation is primarily focused on logical organization. DuckDB doesn't implement schema-level security since it's an embedded database without user management. Additionally, DuckDB allows you to query across schemas without explicit cross-schema permissions that might be required in other databases.

Best Practices

For smaller projects, using the default main schema is perfectly acceptable. However, as your data model grows, using multiple schemas can help organize related tables and views together. For example, you might have separate schemas for raw data ingestion, transformed data models, and reporting views. This organization makes it easier to understand and maintain your database structure over time.