Athena

Amazon Athena runs SQL over data stored in S3 in formats such as Parquet, ORC, JSON, and CSV. Orchid connects to Athena through a workgroup, runs queries against your Glue Data Catalog, and streams results back into a notebook.

Coming soon

This connector is on the v1.1 roadmap. The setup steps below are the planned flow.

Requirements

An AWS account with Athena set up and at least one workgroup.
An S3 bucket configured as the workgroup's query result location.
An IAM principal (user or role) with Athena, Glue, and S3 permissions.
A Glue Data Catalog database containing the tables you want to query.

IAM setup

The IAM principal Orchid uses needs the following permissions. A managed policy that covers this is AmazonAthenaFullAccess, but a tighter custom policy is better. At minimum:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetQueryResultsStream",
        "athena:StopQueryExecution",
        "athena:ListWorkGroups",
        "athena:GetWorkGroup"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartitions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-data-bucket",
        "arn:aws:s3:::your-data-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::your-athena-results",
        "arn:aws:s3:::your-athena-results/*"
      ]
    }
  ]
}

The policy covers two distinct S3 buckets: one holding the data you're querying (read-only access is enough), and one for the query result location (read and write, because Athena writes result files there).
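If you manage several environments, it can help to generate the policy rather than hand-edit bucket names. The following is a minimal sketch, not an Orchid feature: the function name `build_orchid_athena_policy` and its parameters are hypothetical, and the statements simply mirror the policy shown above.

```python
import json


def build_orchid_athena_policy(data_bucket: str, results_bucket: str) -> dict:
    """Return an IAM policy document mirroring the example policy above.

    Hypothetical helper: both bucket names are supplied by the caller.
    """

    def bucket_arns(name: str) -> list:
        # One ARN for the bucket itself, one for the objects inside it.
        return [f"arn:aws:s3:::{name}", f"arn:aws:s3:::{name}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetQueryResultsStream",
                    "athena:StopQueryExecution",
                    "athena:ListWorkGroups",
                    "athena:GetWorkGroup",
                ],
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {
                # Data bucket: read-only.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": bucket_arns(data_bucket),
            },
            {
                # Result bucket: read and write, since Athena writes results here.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:PutObject",
                ],
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


if __name__ == "__main__":
    policy = build_orchid_athena_policy("your-data-bucket", "your-athena-results")
    print(json.dumps(policy, indent=2))
```

Keeping the data bucket and the result bucket as separate arguments makes it harder to grant write access to the data bucket by accident.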

Connect

Open Integrations → + Add connection → Athena.
Fill in:
- Region — e.g. us-east-1
- Workgroup — typically primary; pick the one whose result location your IAM principal can write to
- Database — the Glue catalog database (e.g. analytics)
- Output location — S3 URI (e.g. s3://your-athena-results/orchid/). Orchid auto-fills this from the workgroup if it's set there; otherwise paste explicitly.
- Auth — Access key + secret, AWS profile, or SSO
Click Test connection → save.

The Athena connection form. The output location S3 bucket must be writable by your IAM principal.
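To check the same settings outside Orchid, you can run a trivial query yourself. The sketch below assumes a boto3 Athena client (for example `boto3.client("athena", region_name="us-east-1")`) passed in as `athena`; the helper name `run_smoke_query` is hypothetical, and this is not necessarily what Orchid's Test connection button does internally.

```python
import time


def run_smoke_query(athena, workgroup, database, output_location=None,
                    poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Run SELECT 1 and return (final_state, reason).

    `athena` is expected to behave like a boto3 Athena client.
    """
    params = {
        "QueryString": "SELECT 1",
        "QueryExecutionContext": {"Database": database},
        "WorkGroup": workgroup,
    }
    # Only override the result location if the workgroup does not set one.
    if output_location:
        params["ResultConfiguration"] = {"OutputLocation": output_location}

    query_id = athena.start_query_execution(**params)["QueryExecutionId"]

    # Poll until the query reaches a terminal state or we give up.
    for _ in range(max_polls):
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return status["State"], status.get("StateChangeReason", "")
        sleep(poll_seconds)

    # Still running after max_polls: cancel it so it does not linger.
    athena.stop_query_execution(QueryExecutionId=query_id)
    return "TIMED_OUT", "gave up polling"
```

A `FAILED` state with a permissions-related reason usually points at the result bucket, since even `SELECT 1` has to write a result file there.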

Authentication methods

Access key + secret

Simplest. Paste the access key ID and secret access key; Orchid stores them in your OS keychain. Use a dedicated IAM user for Orchid, not your AWS account root user.

AWS profile (~/.aws/credentials)

If you have ~/.aws/credentials set up, pick Use AWS profile and provide the profile name. Orchid loads credentials from your local AWS config — same identity you use with the AWS CLI.

SSO / IAM Identity Center

For organizations on AWS SSO, run aws sso login --profile your-profile in your shell, then point Orchid at that profile.
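Both the profile and SSO options depend on a profile name that exists in your local AWS files. As a rough sketch of how to see which names are available, the snippet below parses the text of `~/.aws/credentials` (sections named `[name]`) and `~/.aws/config` (sections named `[profile name]`). The helper `list_aws_profiles` is hypothetical and only illustrates the file layout; how Orchid itself resolves profiles is not specified here.

```python
import configparser


def list_aws_profiles(credentials_text: str = "", config_text: str = "") -> list:
    """Return sorted profile names found in AWS credentials/config file text."""
    names = set()

    # ~/.aws/credentials: each section name is a profile name.
    creds = configparser.ConfigParser(interpolation=None)
    creds.read_string(credentials_text)
    names.update(creds.sections())

    # ~/.aws/config: profiles are "[default]" or "[profile NAME]".
    # Other sections such as "[sso-session NAME]" are not profiles.
    conf = configparser.ConfigParser(interpolation=None)
    conf.read_string(config_text)
    for section in conf.sections():
        if section == "default":
            names.add("default")
        elif section.startswith("profile "):
            names.add(section[len("profile "):].strip())

    return sorted(names)


if __name__ == "__main__":
    from pathlib import Path

    def read(path: Path) -> str:
        return path.read_text() if path.exists() else ""

    aws_dir = Path.home() / ".aws"
    print(list_aws_profiles(read(aws_dir / "credentials"), read(aws_dir / "config")))
```

SSO profiles normally live in `~/.aws/config`, so a name that appears only there still needs a fresh `aws sso login` before its credentials are usable.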

Optional settings

Engine version

Athena has offered engine version 2 (Presto-based) and engine version 3 (Trino-based). The workgroup determines which one runs your queries. Version 3 is recommended for new workgroups: it has wider function coverage and better performance.

Result encryption

If your workgroup enforces result encryption (SSE-S3, SSE-KMS, or CSE-KMS), Orchid respects it transparently. KMS-encrypted results require the kms:Decrypt permission on the principal.

Cost discipline

On-demand Athena pricing is $5 per TB scanned in most regions. Orchid shows the bytes scanned for each query in the result panel. Always filter on partition columns, and select only the columns you need (SELECT a, b) rather than SELECT *.
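To turn a bytes-scanned figure into an approximate dollar amount, a back-of-the-envelope estimate like the one below is usually enough. It is a sketch under stated assumptions: the $5 per TB rate quoted above, 1 TB treated as 1024^4 bytes, and Athena's billing rule of rounding up to the nearest megabyte with a 10 MB minimum per query. The function name is hypothetical; check the pricing page for your region before relying on the numbers.

```python
import math

PRICE_PER_TB_USD = 5.0       # on-demand rate quoted above; varies by region
BYTES_PER_MB = 1024 ** 2     # assumption: binary megabytes
BYTES_PER_TB = 1024 ** 4     # assumption: binary terabytes
MIN_BILLED_MB = 10           # per-query minimum charge


def estimate_query_cost_usd(bytes_scanned: int,
                            price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Estimate the on-demand cost of one query from its bytes scanned."""
    if bytes_scanned <= 0:
        # Nothing scanned (for example, a DDL statement): treat as free.
        return 0.0
    # Round up to whole megabytes, then apply the 10 MB minimum.
    billed_mb = max(MIN_BILLED_MB, math.ceil(bytes_scanned / BYTES_PER_MB))
    return billed_mb * BYTES_PER_MB / BYTES_PER_TB * price_per_tb
```

Under these assumptions, a query that scans exactly 1024^4 bytes comes out at $5.00, and any query that scans less than 10 MB is billed as if it scanned 10 MB.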

Common gotchas

"Insufficient permissions to execute the query": your IAM principal can't write to the result location, or can't read from the data S3 bucket. Check both bucket ARNs in the policy.
"HIVE_PARTITION_SCHEMA_MISMATCH": the schema recorded for a partition in the Glue catalog no longer matches the table's schema (a column was added or a type changed). MSCK REPAIR TABLE your_table; only registers missing partitions and does not update existing partition schemas, so drop and re-add the affected partitions, or recreate the table.
Query exceeds the Athena query timeout (30 minutes by default): refactor to scan less data (partition filters, column pruning), or split into multiple queries.
Results truncated: Athena's result stream is paginated. Orchid streams the full result back; if you see truncation it's usually a workgroup setting capping result size.
"ResourceNotFoundException: Database not found": the database name is wrong, or it lives in a different Glue catalog (not the default one). Athena uses AwsDataCatalog by default.
Region mismatch: Athena runs queries in the workgroup's region. Cross-region S3 reads work but cost more and are slower.
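For the truncation case, it helps to know what paginated results look like when read directly. The sketch below follows `NextToken` across pages of `get_query_results`, assuming a boto3 Athena client passed in as `athena` and a finished SELECT query whose first row on the first page is the column header. The generator name `iter_result_rows` is hypothetical, and this is not a description of Orchid's own streaming code.

```python
def iter_result_rows(athena, query_execution_id, page_size=1000):
    """Yield each data row of a finished query as a list of strings or None.

    `athena` is expected to behave like a boto3 Athena client.
    """
    token = None
    first_page = True
    while True:
        kwargs = {"QueryExecutionId": query_execution_id, "MaxResults": page_size}
        if token:
            kwargs["NextToken"] = token
        page = athena.get_query_results(**kwargs)

        rows = page["ResultSet"]["Rows"]
        if first_page:
            # Assumption: for SELECT queries the first row is the header.
            rows = rows[1:]
            first_page = False

        for row in rows:
            # NULL cells arrive without a VarCharValue key.
            yield [cell.get("VarCharValue") for cell in row["Data"]]

        # Keep going until Athena stops returning a continuation token.
        token = page.get("NextToken")
        if not token:
            break
```

Stopping after the first page is the most common cause of "missing" rows in hand-written clients, because each call returns at most `MaxResults` rows.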

Example queries

-- List databases in the default catalog
SHOW DATABASES;

-- Daily event count with partition filter
SELECT date_trunc('day', from_unixtime(event_time)) AS day,
       count(*) AS events
FROM events
WHERE year = 2026 AND month = 5
GROUP BY 1
ORDER BY 1;

-- Top requested paths today (CloudFront logs example)
SELECT request_uri, count(*) AS hits
FROM cloudfront_logs
WHERE date = current_date
GROUP BY request_uri
ORDER BY hits DESC
LIMIT 50;

Where to go next

For more on writing SQL cells, see SQL cells.