DuckDB
DuckDB is an in-process analytic database — like SQLite, but tuned for analytical queries on columnar data. Orchid opens any .duckdb file, and you can query Parquet, CSV, and JSON files directly without loading them first.
This connector is on the v1.1 roadmap. The setup steps below are the planned flow.
Requirements
- A .duckdb file on your machine, or just the files (Parquet/CSV/JSON) you want to query. DuckDB doesn't need a database file at all to read external files.
- Read permission on the file(s).
Connect
- Open Integrations → + Add connection → DuckDB.
- Either:
  - Click Browse and pick a .duckdb file, or
  - Leave the path empty for an in-memory database (queries on Parquet/CSV files work directly).
- Click Save. The schema appears immediately — no test step required.
DuckDB connections start read-only. Flip Allow writes to enable mutations (CREATE TABLE, INSERT, etc.). Reading Parquet/CSV files doesn't require write access — only writes to the database file do.
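As a rough sketch of what the Allow writes toggle gates: the first statement below only reads an external file, so it should run on a read-only connection, while the second creates a table in the database file and would need writes enabled. The table name signups_2026_05 is a hypothetical name chosen for illustration; the path and columns are borrowed from the other examples on this page.

```sql
-- Read-only: scans an external Parquet file; nothing is written to the database
SELECT count(*) AS events
FROM '/Users/me/data/events-2026-05.parquet';

-- Needs Allow writes: materializes the result as a table in the .duckdb file
-- (signups_2026_05 is a hypothetical table name)
CREATE TABLE signups_2026_05 AS
SELECT user_id, event_ts
FROM '/Users/me/data/events-2026-05.parquet'
WHERE event = 'signup';
```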
Querying files directly
DuckDB's killer feature for an analyst's notebook: read external files with no load step. Use the file path right in the FROM clause.
-- Query a single Parquet file
SELECT user_id, count(*) AS events
FROM '/Users/me/data/events-2026-05.parquet'
GROUP BY user_id
ORDER BY events DESC
LIMIT 100;
-- Glob across an entire directory of Parquet files
SELECT date_trunc('day', event_ts) AS day,
       count(*) AS events
FROM '/Users/me/data/events-*.parquet'
WHERE event_ts > now() - interval 30 day
GROUP BY 1
ORDER BY 1;
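-- CSV with options: treat the first row as a header and
-- scan the whole file (sample_size = -1) when detecting column types
SELECT *
FROM read_csv_auto('/Users/me/data/customers.csv',
                   header = true,
                   sample_size = -1);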
-- JSON
SELECT name, address.city AS city
FROM read_json_auto('/Users/me/data/customers.ndjson');
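Before querying an unfamiliar file, it can help to check what DuckDB infers about it. The sketch below uses DESCRIBE to list column names and types, and the filename option of read_parquet to show which file each row came from. The paths are the illustrative ones used above; whether Orchid displays DESCRIBE output like a normal result set is an assumption.

```sql
-- List column names and inferred types without running the full query
DESCRIBE SELECT * FROM '/Users/me/data/events-2026-05.parquet';

-- Count rows per file in a glob; filename = true adds a "filename" column
SELECT filename, count(*) AS row_count
FROM read_parquet('/Users/me/data/events-*.parquet', filename = true)
GROUP BY filename
ORDER BY filename;
```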
Optional settings
S3 / cloud storage
DuckDB can query Parquet directly from S3, GCS, or any S3-compatible store. Configure credentials inside a cell once per session.
-- Install the httpfs extension (a no-op if it is already installed), then load it for this session
INSTALL httpfs;
LOAD httpfs;
-- Configure S3 credentials
SET s3_region = 'us-east-1';
SET s3_access_key_id = '...';
SET s3_secret_access_key = '...';
-- Now query S3 directly
SELECT count(*) FROM 's3://my-bucket/events/2026/05/*.parquet';
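Newer DuckDB releases also offer a secrets manager as an alternative to the individual SET statements. The following is a minimal sketch, assuming the DuckDB version bundled with Orchid is 0.10 or later (this page does not state the version) and that the httpfs extension is loaded. The secret name orchid_s3 is hypothetical, and the elided key values are left as placeholders. A secret created this way is temporary and lasts only for the current session.

```sql
-- Assumes DuckDB 0.10+; orchid_s3 is a hypothetical secret name
CREATE SECRET orchid_s3 (
    TYPE S3,
    KEY_ID '...',
    SECRET '...',
    REGION 'us-east-1'
);

-- Queries against s3:// paths pick up the secret automatically
SELECT count(*) FROM 's3://my-bucket/events/2026/05/*.parquet';
```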
Persistent vs in-memory
Pointing the connection at a .duckdb file gives you a persistent database — any tables you create stay. Leaving the path empty starts an in-memory session that resets every time you restart Orchid. In-memory is perfect for ad-hoc work on external Parquet files where the database itself is just a query engine.
Common gotchas
- "file does not exist": paths in SQL strings are case-sensitive on Linux and on case-sensitive macOS volumes (the default macOS file system is case-insensitive). Check the literal path on disk.
- "HTTP 403 / SignatureDoesNotMatch" when reading from S3: credentials wrong or region mismatch. Set s3_region to match the bucket's region.
- Mismatched Parquet schemas in a glob: if your Parquet files have evolving schemas, DuckDB throws on schema mismatch. Use union_by_name = true in the parquet_scan options to align by column name.
- Slow queries on a single large CSV: convert to Parquet once for big speedups:
  COPY (SELECT * FROM read_csv_auto('big.csv')) TO 'big.parquet' (FORMAT PARQUET);
- Database file locked: DuckDB allows a single writer process. If your local app has the file open in write mode, Orchid's connection will fail. Close the other process.
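For the schema-mismatch case, a minimal sketch of the union_by_name option looks like this. It uses read_parquet (parquet_scan is an alias for the same function) and the illustrative glob path from earlier on this page.

```sql
-- Align files by column name instead of position;
-- columns missing from a given file are filled with NULL
SELECT *
FROM read_parquet('/Users/me/data/events-*.parquet', union_by_name = true);
```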
Example queries
-- List tables in the current database
SELECT table_name FROM information_schema.tables WHERE table_schema = 'main';
-- Pivot Parquet data into a daily summary
SELECT
    date_trunc('day', event_ts) AS day,
    count(*) FILTER (WHERE event = 'signup') AS signups,
    count(*) FILTER (WHERE event = 'login') AS logins,
    count(DISTINCT user_id) AS daily_active_users
FROM '/Users/me/data/events-2026-*.parquet'
GROUP BY 1
ORDER BY 1;
Where to go next
For more on writing SQL cells, see SQL cells.