# Architecture Overview
DRLS is a multi-runtime system combining Spark-on-Ray execution, Iceberg lakehouse operations, and an AI agentic layer, with a React management console backed by a Rust server.
## System Architecture

```mermaid
graph TB
    subgraph "User Interfaces"
        CLI["CLI (Click)"]
        React["React UI :5173"]
        MCP["MCP Server :8100"]
    end
    subgraph "Backend Services"
        Axum["Axum REST :3000"]
        PyO3["PyO3 (embedded Python)"]
        SQLite["SQLite (pipelines, config,\nversions, approvals)"]
    end
    subgraph "Python Runtime"
        Agent["TRex"]
        Core["Core Runtime"]
    end
    subgraph "Execution"
        Spark["Spark-on-Ray"]
        Ray["Ray Cluster"]
        Iceberg["Iceberg Tables"]
    end
    CLI --> Core
    React --> Axum
    Axum --> PyO3
    PyO3 --> Core
    PyO3 --> Agent
    Axum --> SQLite
    MCP --> Agent
    Agent --> Core
    Core --> Spark
    Spark --> Ray
    Spark --> Iceberg
```
## Tech Stack
| Component | Technology | Version |
|---|---|---|
| JVM Runtime | Java | 21 |
| Distributed SQL | Apache Spark | 4.1.1 |
| Scala | Scala | 2.13.17 |
| Distributed Compute | Ray | 2.53.0 (Python) / 2.47.1 (Java JARs) |
| Python | Python | 3.12+ |
| Table Format | Apache Iceberg | via Spark 4.1.1 |
| REST Backend | Rust (Axum) | 0.8 |
| Python Embedding | PyO3 | 0.23 |
| Database | SQLite (SQLx) | — |
| Frontend | React 18 + Vite 6 + TypeScript 5.6 | — |
| AI Integration | litellm + MCP | — |
## Data Flow

The primary data flow for the management console: React UI → Axum REST (:3000) → PyO3 → Python core runtime → Spark-on-Ray → Iceberg tables.

For CLI and Python API usage, the flow bypasses the backend services: CLI → core runtime → Spark-on-Ray → Iceberg tables.
## Module Overview

| Module | Location | Purpose |
|---|---|---|
| JVM Core | `core/` | Spark shim, app master, executor, object store |
| Python Package | `python/drls/` | Core runtime, streaming, Iceberg ops, agentic, CLI |
| Rust UI Server | `rust/crates/ui-server/` | Axum REST server, PyO3 bridge, SQLite, middleware, pipeline versioning, approval workflow |
| Shared CRDs | `rust/crates/shared/` | Pipeline and Notebook CRD types shared across operators |
| Pipeline Operator | `rust/crates/pipeline-operator/` | K8s operator — orchestrates DrlsPipeline and DrlsPipelineRun CRDs |
| Marimo Operator | `rust/crates/marimo-operator/` | K8s operator — manages MarimoNotebook pod lifecycle |
| React Frontend | `ui/frontend/` | Management console UI, pipeline builder, approval queue |
| Examples | `examples/` | Demo scripts |
> **Note:** `ui/backend/` is a symlink to `rust/crates/ui-server/` for dev workflow compatibility.
## Pipeline Lifecycle

Pipeline definitions are versioned, immutable artifacts that progress through a managed environment lifecycle:

```mermaid
graph LR
    dev["Development"] --> qa["QA"]
    qa -->|"approval"| uat["UAT"]
    uat -->|"approval"| prod["Production"]
    prod --> retired["Retired"]
    uat -.->|"demote"| qa
```
- **Versioning** — Each save creates an immutable version snapshot with a format version for compatibility
- **Environments** — Versions move through development → QA → UAT → production → retired
- **Approval gates** — Promotions to UAT and production require Lead Data Engineer approval
- **Read-only enforcement** — UAT, production, and retired versions cannot be modified
- **Import/Export** — Pipeline versions can be exported as JSON and imported into other environments
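The lifecycle rules above can be sketched as a small state machine. This is an illustrative model only, assuming nothing about the actual DRLS implementation; the names `Environment`, `promote`, and the dict/set constants are hypothetical:

```python
from enum import Enum


class Environment(str, Enum):
    DEV = "development"
    QA = "qa"
    UAT = "uat"
    PROD = "production"
    RETIRED = "retired"


# Forward promotions, matching the lifecycle diagram above.
PROMOTIONS = {
    Environment.DEV: Environment.QA,
    Environment.QA: Environment.UAT,
    Environment.UAT: Environment.PROD,
    Environment.PROD: Environment.RETIRED,
}
# Promotions into these environments require approval.
NEEDS_APPROVAL = {Environment.UAT, Environment.PROD}
# Versions in these environments cannot be modified.
READ_ONLY = {Environment.UAT, Environment.PROD, Environment.RETIRED}


def promote(current: Environment, approved: bool = False) -> Environment:
    """Return the next environment, enforcing the approval gates."""
    target = PROMOTIONS.get(current)
    if target is None:
        raise ValueError(f"{current.value} cannot be promoted further")
    if target in NEEDS_APPROVAL and not approved:
        raise PermissionError(f"promotion to {target.value} requires approval")
    return target
```

For example, `promote(Environment.DEV)` succeeds without approval, while `promote(Environment.QA)` raises unless the Lead Data Engineer approval flag is set.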
## Pipeline Execution

When a user deploys a pipeline from the builder, the full execution path is:

```mermaid
graph LR
    Builder["Pipeline Builder"] --> Axum["Axum (graph compiler)"]
    Axum --> CRD["DrlsPipeline CR\n+ DrlsPipelineRun CR"]
    CRD --> Op["Pipeline Operator\n(DAG evaluation)"]
    Op --> Child["SparkApplication /\nRayJob / Job CRs"]
    Child --> Runners["Spark Operator /\nKubeRay / K8s"]
    Runners --> Pods["Pods"]
    Pods -.-> Op
    Op -.->|"status update"| CRD
    CRD -.->|"CRD read"| Axum
    Axum -.->|"step-level status"| Builder
```
- **Graph compilation** — Axum translates React Flow nodes/edges into `StepSpec[]` entries in a `DrlsPipelineSpec`
- **CRD creation** — Axum creates `DrlsPipeline` and `DrlsPipelineRun` custom resources in the pipeline namespace
- **DAG evaluation** — The pipeline operator watches runs, evaluates step dependencies (Kahn's algorithm), and creates child CRs for ready steps
- **Child CR dispatch** — Each step type maps to a child resource: `sparkStream` → SparkApplication, `rayProcess` → RayJob, `dbt`/`custom` → Job
- **Status reconciliation** — The operator monitors child CR status (via `DynamicObject` for SparkApplication/RayJob), updates step phases, handles retries
- **Frontend display** — The pipeline status endpoint reads CRD status and returns step-level progress to the UI
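The DAG evaluation step uses Kahn's algorithm over step dependencies. The sketch below shows that algorithm in isolation; the dependency-dict shape and step names are illustrative, not the operator's actual CRD types:

```python
from collections import deque


def topo_order(deps: dict[str, set[str]]) -> list[str]:
    """Kahn's algorithm: return steps in dependency order, or raise on a cycle.

    `deps` maps each step name to the set of steps it depends on.
    """
    indegree = {step: len(d) for step, d in deps.items()}
    dependents: dict[str, list[str]] = {step: [] for step in deps}
    for step, d in deps.items():
        for dep in d:
            dependents[dep].append(step)

    # Steps with no unmet dependencies are ready to dispatch.
    ready = deque(step for step, n in indegree.items() if n == 0)
    order: list[str] = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in dependents[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(deps):
        raise ValueError("cycle detected in step dependencies")
    return order


# Hypothetical pipeline: ingest feeds transform and validate, both feed publish.
steps = {
    "ingest": set(),
    "transform": {"ingest"},
    "validate": {"ingest"},
    "publish": {"transform", "validate"},
}
```

In the operator, the analogous readiness check runs on each reconcile: any step whose dependencies have all reached a terminal success phase gets a child CR created for it.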
See the detailed architecture pages for each component: