Porter
A streaming-first Arrow server for DuckDB: Flight SQL and WebSocket, simple and built for motion.
Overview
Porter is a DuckDB-backed Arrow server with two transport protocols:
- Flight SQL: gRPC-based Arrow Flight SQL
- WebSocket: HTTP-based Arrow streaming
SQL goes in. Arrow streams out. Everything else is detail.
Both transports share the same execution engine, ensuring identical query semantics.
Summary Benchmark Results
| Metric | WebSocket | FlightSQL (gRPC) |
|---|---|---|
| Ops | 12 | 12 |
| Success | 12 | 12 |
| Errors | 0 | 0 |
| Rows/sec | 130,712,427 | 121,704,008 |
| Throughput | 1014.32 MB/s | 928.53 MB/s |
| Latency p50 | 26 ms | 17 ms |
| Latency p95 | 41 ms | 60 ms |
| Latency p99 | 41 ms | 60 ms |
See the Benchmark Report for details.
Key Characteristics
- Streaming-first execution model (Arrow RecordBatch streams)
- Dual transport support: Flight SQL + WebSocket
- Bulk Ingest: Arrow RecordBatch → DuckDB with transactional semantics
- Shared execution engine for semantic parity
- Native DuckDB execution via ADBC
- Full prepared statement lifecycle with parameter binding
- TTL-based handle management with background GC
- Live status surface with pipeline flow, pressure, and backpressure visibility
Architecture
    +---------------------+
    |    Flight Client    |  <-- ADBC / Flight SQL
    +---------------------+
              |
       gRPC / Flight
              |
    +---------------------+
    |    Porter Server    |
    |---------------------|
    |    Shared Engine    |  <-- BuildStream()
    +---------------------+
              |
    +---------------------+
    |       DuckDB        |
    |     (via ADBC)      |
    +---------------------+
              |
    +---------------------+
    | Arrow RecordBatches |
    +---------------------+
The server is intentionally thin: routing, lifecycle, and streaming glue only.
DuckDB does the heavy lifting.
Getting Started
You have three ways to run Porter:
- Docker (fastest path)
- go install (clean local toolchain)
- Build from source (full control)
Option 1: Run with Docker
docker build -t porter .
docker run -p 32010:32010 -p 8080:8080 porter --ws
Run with a persistent database:
docker run -p 32010:32010 -p 8080:8080 -v $(pwd)/data:/data porter --db /data/porter.duckdb --ws
Defaults:
- Flight SQL: 0.0.0.0:32010
- WebSocket: 0.0.0.0:8080 (when --ws enabled)
- Status: 0.0.0.0:9091 (enabled by default)
- Database: in-memory (:memory:)
Prerequisites
Install dbc and required ADBC drivers:
curl -LsSf https://dbc.columnar.tech/install.sh | sh
dbc install duckdb
dbc install flightsql
Option 2: Install via go install
1. Install Porter
go install github.com/TFMV/porter/cmd/porter@latest
This installs porter into your $GOBIN.
Option 3: Build from Source
1. Clone
git clone https://github.com/TFMV/porter.git
cd porter
2. Run
go run ./cmd/porter serve
CLI Usage
porter --help
Quick Start
porter # Start Flight SQL server on :32010
porter serve # Same as above
With WebSocket
porter --ws # Flight SQL + WebSocket
porter serve --ws # Same as above
porter serve --ws --ws-port 9090 # Custom WebSocket port
porter serve --status-port 9191 # Custom status surface
Full Flags
| Flag | Description | Default |
|---|---|---|
| --db | DuckDB file path | :memory: |
| --port | Flight SQL port | 32010 |
| --ws | Enable WebSocket | false |
| --ws-port | WebSocket port | 8080 |
| --status | Enable live status surface | true |
| --status-port | Status server port | 9091 |
Execute a query
porter query "SELECT 1 AS value"
REPL
porter repl
Load Parquet
porter load data.parquet
Inspect schema
porter schema table_name
Environment variables
- PORTER_DB
- PORTER_PORT
- PORTER_WS
- PORTER_WS_PORT
- PORTER_STATUS
- PORTER_STATUS_PORT
Live Status Surface
Porter now exposes a dedicated status server with a living cross-section of the pipeline:
- /status: live instrument panel UI
- /status/live: current JSON snapshot
- /status/stream: SSE stream of snapshots
- /status/history: rolling snapshot history
- /status/health: deterministic health status
The flow view tracks:
ingress -> transport -> execution -> egress
- rows/sec and MB/sec per stage
- queue depth and pressure buildup
- p50/p95/p99 latency divergence
- live structured activity feed
- WebSocket vs FlightSQL vs ingest path comparison
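For quick scripting against the status surface, here is a minimal sketch that fetches one snapshot from /status/live, assuming the default status port; the snapshot's exact fields are not listed here, so it just prints the raw JSON.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Assumes the default status surface on :9091.
	resp, err := http.Get("http://localhost:9091/status/live")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print the current snapshot as raw JSON; no struct is assumed,
	// since the snapshot's exact fields are not documented here.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```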

Wire Contract
Flight SQL
| Operation | Behavior |
|---|---|
| SQL Query | Raw SQL → FlightInfo → DoGet stream |
| Prepared Statements | Handle-based execution with binding |
| Schema Introspection | Lightweight probe execution |
| ExecuteUpdate | DDL/DML via DoPutCommandStatementUpdate |
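Porter speaks standard Flight SQL, so stock ADBC clients can connect directly. Below is a minimal sketch using the Go ADBC Flight SQL driver against the default port; the import paths (including the arrow-go module version) are assumptions and may differ in your environment.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/apache/arrow-adbc/go/adbc"
	"github.com/apache/arrow-adbc/go/adbc/driver/flightsql"
	"github.com/apache/arrow-go/v18/arrow/memory"
)

func main() {
	ctx := context.Background()

	// Point the stock ADBC Flight SQL driver at Porter's default port.
	drv := flightsql.NewDriver(memory.DefaultAllocator)
	db, err := drv.NewDatabase(map[string]string{
		adbc.OptionKeyURI: "grpc://localhost:32010",
	})
	if err != nil {
		log.Fatal(err)
	}

	cnxn, err := db.Open(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer cnxn.Close()

	stmt, err := cnxn.NewStatement()
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	// SQL goes in...
	if err := stmt.SetSqlQuery("SELECT 42 AS answer"); err != nil {
		log.Fatal(err)
	}

	// ...Arrow streams out.
	reader, _, err := stmt.ExecuteQuery(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer reader.Release()

	for reader.Next() {
		rec := reader.Record()
		fmt.Printf("batch: %d rows x %d cols\n", rec.NumRows(), rec.NumCols())
	}
}
```

The same SQL issued over WebSocket returns identical batches, since both transports share the engine's BuildStream.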
WebSocket
Send JSON query request:
{"query": "SELECT * FROM table"}
Receive:
- Schema message: {"type": "schema", "fields": ["col1", "col2"]}
- Binary IPC frames containing Arrow RecordBatches
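A minimal client-side sketch of this exchange in Go, using the gorilla/websocket package (an assumption, not something Porter prescribes); it sends the JSON request and reports each frame without decoding the Arrow IPC payloads. The WebSocket path used here is also an assumption; adjust it to the server's actual route.

```go
package main

import (
	"fmt"
	"log"

	"github.com/gorilla/websocket"
)

func main() {
	// Assumes the default WebSocket listener on :8080; the root path is
	// an assumption, check the server's routes for your build.
	conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:8080/", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Send the JSON query request described above.
	req := map[string]string{"query": "SELECT 42 AS answer"}
	if err := conn.WriteJSON(req); err != nil {
		log.Fatal(err)
	}

	// First a text frame with the schema, then binary IPC frames
	// carrying Arrow RecordBatches.
	for {
		msgType, payload, err := conn.ReadMessage()
		if err != nil {
			break // stream ended or connection closed
		}
		switch msgType {
		case websocket.TextMessage:
			fmt.Println("schema:", string(payload))
		case websocket.BinaryMessage:
			fmt.Printf("record batch frame: %d bytes\n", len(payload))
		}
	}
}
```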
Bulk Ingest
Porter supports high-throughput Arrow RecordBatch ingestion via Flight SQL's DoPut:
// Engine interface
IngestStream(ctx, table, reader, opts) (int64, error)
Features:
| Feature | Description |
|---|---|
| Transactional | One stream = one DB transaction |
| Schema validation | Incoming Arrow schema must match target table |
| Backpressure | Configurable MaxUncommittedBytes (default 64MB) |
| Table locking | Per-table mutex prevents concurrent writes to same table |
| Auto-commit | Automatically commits on successful ingest, rolls back on failure |
IngestOptions:
| Option | Description |
|---|---|
| Catalog | Target catalog name |
| DBSchema | Target schema name |
| Temporary | Create as temporary table |
| IngestMode | Append, replace, or create |
| MaxUncommittedBytes | Memory limit before fail-fast (default 64MB) |
Flow:
Client → DoPut (Arrow RecordBatch stream) → Engine.IngestStream → SegmentWriter → Commit → DuckDB
The SegmentWriter accumulates RecordBatches in memory, then atomically publishes them on commit. If MaxUncommittedBytes is exceeded, ingestion fails fast with rollback.
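From the client's side, this path is reachable through ADBC's standard bulk-ingest options. Below is a minimal sketch, assuming an already-open adbc.Connection as in the Flight SQL example above; the table name and append mode are illustrative, and the import paths are assumptions.

```go
package example

import (
	"context"

	"github.com/apache/arrow-adbc/go/adbc"
	"github.com/apache/arrow-go/v18/arrow"
)

// ingest appends one Arrow RecordBatch to an existing table over DoPut.
// cnxn is an open adbc.Connection to Porter (see the Flight SQL sketch
// above); rec's schema must match the target table. The table name
// "events" is illustrative only.
func ingest(ctx context.Context, cnxn adbc.Connection, rec arrow.Record) (int64, error) {
	stmt, err := cnxn.NewStatement()
	if err != nil {
		return 0, err
	}
	defer stmt.Close()

	// Standard ADBC bulk-ingest options; Porter maps the resulting
	// DoPut stream onto Engine.IngestStream.
	if err := stmt.SetOption(adbc.OptionKeyIngestTargetTable, "events"); err != nil {
		return 0, err
	}
	if err := stmt.SetOption(adbc.OptionKeyIngestMode, adbc.OptionValueIngestModeAppend); err != nil {
		return 0, err
	}

	// Bind the batch and execute: one stream, one transaction.
	if err := stmt.Bind(ctx, rec); err != nil {
		return 0, err
	}
	return stmt.ExecuteUpdate(ctx)
}
```

Each call maps to one DoPut stream, and therefore one transaction: exceeding MaxUncommittedBytes fails the stream and rolls it back.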
Streaming Core
Both transports use the same execution primitive:
BuildStream(ctx, sql, params) (*arrow.Schema, <-chan StreamChunk, error)
DuckDB → Arrow RecordReader → Channel → StreamChunk
Backpressure is enforced naturally via the channel boundary.
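A hypothetical sketch of how a transport might drain that channel; the StreamChunk fields shown here (Record, Err) and the drain helper are assumptions for illustration, not Porter's actual types.

```go
package example

import (
	"context"

	"github.com/apache/arrow-go/v18/arrow"
)

// StreamChunk is a hypothetical shape assumed for illustration only;
// Porter's actual definition may differ.
type StreamChunk struct {
	Record arrow.Record // one Arrow RecordBatch
	Err    error        // non-nil if execution failed mid-stream
}

// drain forwards chunks from the engine to a transport until the channel
// closes. The engine can only run ahead of the consumer by the channel's
// buffer, which is what makes the channel the backpressure boundary.
func drain(ctx context.Context, chunks <-chan StreamChunk, send func(arrow.Record) error) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case chunk, ok := <-chunks:
			if !ok {
				return nil // channel closed: query complete
			}
			if chunk.Err != nil {
				return chunk.Err
			}
			if err := send(chunk.Record); err != nil {
				return err
			}
		}
	}
}
```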
Roadmap
- Streaming Flight SQL execution
- WebSocket transport
- Shared execution engine
- Bulk Ingest (DoPut)
- Prepared statements
- TTL-based lifecycle
- Background GC
- Session context
- Improved schema probing
- Benchmark suite
Contributing
If you've ever looked at a data system and thought:
"Why is this so complicated?"
You're in the right place.
Build it smaller. Make it clearer. Keep it moving.