CLI Guide¶
DRLS provides a Click-based CLI for all common operations.
Installation¶
Commands¶
drls init¶
Initialize catalog configuration.
```shell
drls init --catalog hadoop --warehouse /tmp/warehouse
drls init --catalog rest --uri https://catalog.example.com --warehouse s3://bucket/warehouse
```
| Flag | Required | Description |
|---|---|---|
| `--catalog` | Yes | Catalog type: `hadoop`, `hive`, `rest`, `polaris`, `glue`, `nessie` |
| `--warehouse` | Yes | Warehouse path |
| `--uri` | No | Catalog URI (required for `hive`/`rest`/`polaris`/`nessie`) |
| `--name` | No | Catalog name (default: `drls`) |
drls tables¶
List all tables in the catalog.
drls health <table>¶
Show health report for a table.
drls compact <table>¶
Run compaction on a table.
```shell
drls compact drls.db.events --strategy binpack
drls compact drls.db.events --strategy sort
drls compact drls.db.events --strategy zorder
```
| Flag | Default | Description |
|---|---|---|
| `--strategy` | `binpack` | Compaction strategy: `binpack`, `sort`, `zorder` |
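Since an unknown strategy only fails once the command runs, a thin wrapper can reject it up front. This helper (`build_compact_args`) is an illustrative sketch, not part of DRLS; the valid strategy names are taken from the table above.

```python
# Valid values for --strategy, per the table above.
VALID_STRATEGIES = {"binpack", "sort", "zorder"}

def build_compact_args(table: str, strategy: str = "binpack") -> list[str]:
    """Assemble a `drls compact` argv, rejecting unknown strategies early."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(
            f"unknown strategy {strategy!r}; expected one of {sorted(VALID_STRATEGIES)}"
        )
    return ["drls", "compact", table, "--strategy", strategy]

print(build_compact_args("drls.db.events", "zorder"))
```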
drls expire <table>¶
Expire old snapshots.
| Flag | Default | Description |
|---|---|---|
| `--retain` | 5 | Number of snapshots to retain |
| `--older-than` | None | Expire snapshots older than this timestamp (ISO 8601) |
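An ISO 8601 value for `--older-than` is easy to compute with the standard library. The helper name (`older_than`) and the 7-day window below are illustrative choices, not DRLS defaults.

```python
from datetime import datetime, timedelta, timezone

def older_than(days: int) -> str:
    """Return an ISO 8601 UTC timestamp `days` days in the past."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return cutoff.isoformat(timespec="seconds")

# Expire everything older than a week (example retention window):
cmd = ["drls", "expire", "drls.db.events", "--older-than", older_than(7)]
print(" ".join(cmd))
```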
drls remove-orphans <table>¶
Remove orphan files.
| Flag | Default | Description |
|---|---|---|
| `--dry-run` | false | List orphan files without removing them |
drls history <table>¶
Show snapshot history.
drls stream <source_table>¶
Start a streaming bridge that reads from an Iceberg source table using Spark Structured Streaming. Optionally writes to a sink Iceberg table.
```shell
# Stream from source to sink with a checkpoint
drls stream drls.db.cdc_events \
    --sink-table drls.db.processed_events \
    --checkpoint s3://bucket/checkpoints/cdc \
    --trigger-interval "10 seconds"

# Stream with a batch limit (useful for testing or bounded jobs)
drls stream drls.db.cdc_events --max-batches 50

# Monitor-only mode (no sink; counts rows and discards)
drls stream drls.db.events --trigger-interval "5 seconds"
```
| Flag | Default | Description |
|---|---|---|
| `--sink-table` | None | Sink Iceberg table for writes |
| `--checkpoint` | None | Checkpoint location for streaming state |
| `--trigger-interval` | `10 seconds` | Processing-time trigger interval |
| `--max-batches` | 0 (unlimited) | Maximum number of batches to process |
Progress is reported as line-delimited JSON to stdout:
```json
{"status": "running", "batches_processed": 1, "rows_processed": 500}
{"status": "running", "batches_processed": 2, "rows_processed": 1050}
{"status": "completed", "batches_processed": 2, "rows_processed": 1050}
```
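A line-delimited stream like this is straightforward to consume from another process. The parser below is a sketch; the field names are taken from the sample output, and `summarize_progress` is a hypothetical helper, not a DRLS API.

```python
import json

def summarize_progress(lines):
    """Return the last progress record from a line-delimited JSON stream."""
    last = None
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            last = json.loads(line)
    return last

sample = [
    '{"status": "running", "batches_processed": 1, "rows_processed": 500}',
    '{"status": "running", "batches_processed": 2, "rows_processed": 1050}',
    '{"status": "completed", "batches_processed": 2, "rows_processed": 1050}',
]
print(summarize_progress(sample))
```

In practice `lines` would be the stdout of the `drls stream` process, read line by line.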
drls grpc-codegen¶
Regenerate Python protobuf/gRPC stubs from `lakehouse.proto`.
drls grpc-serve¶
Start the gRPC server (optional, for external consumers).
| Flag | Default | Description |
|---|---|---|
| `--host` | `127.0.0.1` | Bind host |
| `--port` | `50051` | Bind port |
drls mcp-server¶
Start the MCP server for agentic tool access.
| Flag | Default | Description |
|---|---|---|
| `--host` | `127.0.0.1` | Bind host |
| `--port` | `8100` | Bind port |
drls agent <prompt>¶
Run a natural language command via TRex.
```shell
drls agent "How healthy is the events table?" --provider ollama --model llama3:70b
drls agent "Compact all tables" --provider openai --model gpt-4o
```
| Flag | Default | Description |
|---|---|---|
| `--provider` | `ollama` | LLM provider (`ollama`, `openai`, `anthropic`, etc.) |
| `--model` | `llama3:70b` | Model name |
Common Flags¶
All table-operation commands (`tables`, `health`, `compact`, `expire`, `remove-orphans`, `history`) accept:

| Flag | Description |
|---|---|
| `--catalog` | Iceberg catalog type |
| `--warehouse` | Warehouse path |
| `--catalog-uri` | Catalog URI |
Output Format¶
All commands output JSON for machine consumption:
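One common consumption pattern is to invoke the CLI from a script and decode its stdout. The sketch below assumes `drls` is on `PATH` when actually invoked; the canned payload and its fields are hypothetical, not a documented schema.

```python
import json
import subprocess

def parse_output(stdout: str) -> dict:
    """Decode a command's JSON stdout."""
    return json.loads(stdout)

def run_drls(*args: str) -> dict:
    """Run a drls command and return its JSON output as a dict."""
    proc = subprocess.run(
        ["drls", *args], capture_output=True, text=True, check=True
    )
    return parse_output(proc.stdout)

# Example with canned output (field names are illustrative):
print(parse_output('{"table": "drls.db.events", "status": "ok"}'))
```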