Lakehouse & catalog

Your tables are Apache Iceberg tables on your own bucket: open, ACID, and versioned. Polnor catalogues them through Lakekeeper, an Iceberg REST catalog, and addresses them as catalog.namespace.table.

The naming model

A fully-qualified table name has three parts: warehouse.namespace.table, for example acme.sales.orders. The catalog maps namespaces to prefixes on your S3 bucket; the actual Parquet and metadata files live there, under your control.
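
If you script around table names, it can help to split a fully-qualified name into its three parts before using them. The sketch below is plain POSIX shell; split_table_name is a hypothetical helper written for this guide, not a Polnor command, and it assumes exactly three dot-separated parts as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Polnor CLI): split a name of the
# form warehouse.namespace.table into its three parts.
split_table_name() {
    fq_name=$1
    case "$fq_name" in
        # Reject more than three parts, or any empty part.
        *.*.*.*|.*|*.|*..*)
            echo "expected warehouse.namespace.table, got: $fq_name" >&2
            return 1 ;;
        # Exactly three non-empty parts: fall through to the split.
        *.*.*) ;;
        # Fewer than three parts.
        *)
            echo "expected warehouse.namespace.table, got: $fq_name" >&2
            return 1 ;;
    esac
    warehouse=${fq_name%%.*}   # text before the first dot
    rest=${fq_name#*.}         # text after the first dot
    namespace=${rest%%.*}      # middle part
    table=${rest#*.}           # text after the second dot
    printf '%s %s %s\n' "$warehouse" "$namespace" "$table"
}

split_table_name acme.sales.orders
```

For acme.sales.orders this prints the three parts separated by spaces; a name with too few or too many parts returns a non-zero status and a message on stderr.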

Create a database (namespace)

polnor catalog create-namespace acme.sales

Create a table

Define a table with a schema. Writes are executed by the Spark sidecar.

polnor sql "CREATE TABLE acme.sales.orders (
    id BIGINT, customer STRING, amount DECIMAL(10,2), ts TIMESTAMP
 ) USING iceberg" --warehouse analytics

Read and write

Polnor routes SQL by statement type: reads go to DuckDB (fast, via iceberg_scan) and writes go to Spark.

# read, DuckDB
polnor sql "SELECT customer, sum(amount) FROM acme.sales.orders GROUP BY 1"

# write, Spark
polnor sql "INSERT INTO acme.sales.orders VALUES (1,'acme',42.00,now())"

DuckDB only reads Iceberg. Any INSERT/UPDATE/MERGE/CREATE is dispatched to the Spark sidecar automatically; you don't pick the engine.
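
To make the routing rule concrete, here is a minimal sketch of the idea in POSIX shell. It is an illustration only, not Polnor's actual dispatcher: route_engine is a hypothetical function, and it assumes the decision can be made from the first keyword of the statement, using just the four write statements listed above.

```shell
#!/bin/sh
# Hypothetical illustration of routing by statement type.
# Not Polnor's implementation; it only inspects the leading keyword.
route_engine() {
    # First whitespace-delimited word of the statement, upper-cased.
    keyword=$(printf '%s\n' "$1" | awk 'NF { print toupper($1); exit }')
    case "$keyword" in
        INSERT|UPDATE|MERGE|CREATE) echo spark ;;   # writes
        *)                          echo duckdb ;;  # everything else is treated as a read
    esac
}

route_engine "SELECT customer, sum(amount) FROM acme.sales.orders GROUP BY 1"
route_engine "INSERT INTO acme.sales.orders VALUES (1,'acme',42.00,now())"
```

The first call prints duckdb and the second prints spark. A real router would need to parse the statement properly (comments, CTEs, and statement types not listed here), so treat this as a mental model rather than a specification.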

Time travel & maintenance

Iceberg keeps snapshots, so you can query history and roll back. The Spark sidecar also handles maintenance (compaction and snapshot expiry) to keep tables fast and storage tidy.

SELECT * FROM acme.sales.orders FOR VERSION AS OF 3821..;
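
If you run time-travel queries from scripts, a small wrapper can build the statement from a table name and a snapshot ID. This is a sketch under assumptions: time_travel_sql is a hypothetical helper, SNAPSHOT_ID is a placeholder you set yourself, and the check that the ID is a whole number is a simple guard rather than anything Polnor requires.

```shell
#!/bin/sh
# Hypothetical helper: build a time-travel query for one snapshot.
time_travel_sql() {
    table=$1
    snapshot_id=$2
    case "$snapshot_id" in
        # Reject empty values and anything containing a non-digit.
        ''|*[!0-9]*)
            echo "snapshot id must be a whole number" >&2
            return 1 ;;
    esac
    printf 'SELECT * FROM %s FOR VERSION AS OF %s\n' "$table" "$snapshot_id"
}

# Usage, with SNAPSHOT_ID set to a real snapshot ID of the table:
#   polnor sql "$(time_travel_sql acme.sales.orders "$SNAPSHOT_ID")"
```

Keeping the statement construction in one place avoids pasting unchecked values into SQL strings.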

No lock-in

It's plain Iceberg on your bucket. Point any other Iceberg-aware engine at the same catalog, or take the files and leave; the format is yours.