flow-switch.io

flow-switch.io vs Databricks — Spark Benchmark Report

Parquet → Delta Lake · 80 000 784 rows · 5 SAP tables · West Europe
flow-switch.io · Spark 4.0 OSS | Databricks · DBR 17.3 | Standard_D4ds_v4 · 2 nodes | 2026-05-12
📊 Results Summary
| Metric | flow-switch.io | Databricks | Diff |
|---|---|---|---|
| Execution (job running time only; apples-to-apples comparison) | | | |
| Duration | 124.6 s | 149.4 s | flow-switch.io 17% faster |
| VM cost | $0.0133 | $0.0164 | −19% |
| DBU cost | n/a | $0.0157 | flow-switch.io has no DBU |
| Total execution cost | $0.0133 | $0.0321 | flow-switch.io 59% cheaper |
| Cold start (cluster provision; amortisable overhead) | | | |
| Duration | ~120–180 s (est.) | 412 s (measured) | flow-switch.io ~3× faster |
| Cold start cost | ~$0.013 (VM only) | $0.0862 (DBU + VM) | flow-switch.io ~7× cheaper |
| Total per run (execution + cold start) | | | |
| Total cost per run | ~$0.026 | $0.1183 | flow-switch.io 78% cheaper |
| Effective hourly rate (while job is running) | | | |
| VM cost/hr | $0.384 | $0.384 | Equal |
| DBU cost/hr | n/a | $0.368 | flow-switch.io has no DBU |
| Total effective rate | $0.384 / hr | $0.752 / hr | flow-switch.io 49% cheaper/hr |
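The percentages in the Diff column can be rechecked from the raw figures. The following is a minimal sketch, assuming each percentage is expressed relative to the Databricks value; the constant and function names are illustrative and not part of the benchmark code.

```python
# Figures copied from the results summary (seconds and USD).
FS_EXEC_S, DBX_EXEC_S = 124.6, 149.4
FS_EXEC_COST = 0.0133                # flow-switch.io: VM only
DBX_EXEC_COST = 0.0164 + 0.0157      # Databricks: VM + DBU
FS_COLD_COST, DBX_COLD_COST = 0.013, 0.0862


def pct_less(ours: float, theirs: float) -> float:
    """Percentage by which `ours` is below `theirs` (assumed baseline: theirs)."""
    return (1 - ours / theirs) * 100


duration_saving = pct_less(FS_EXEC_S, DBX_EXEC_S)
exec_cost_saving = pct_less(FS_EXEC_COST, DBX_EXEC_COST)
per_run_saving = pct_less(FS_EXEC_COST + FS_COLD_COST,
                          DBX_EXEC_COST + DBX_COLD_COST)

print(f"duration: {duration_saving:.1f}% shorter")
print(f"execution cost: {exec_cost_saving:.1f}% cheaper")
print(f"total per run: {per_run_saving:.1f}% cheaper")
```

Note that the flow-switch.io cold-start figure is an estimate, so the per-run saving inherits that uncertainty.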
⚙️ Cluster Configuration
flow-switch.io
| Setting | Value |
|---|---|
| Orchestrator | AKS 1.33.8 · Spark Operator v2 |
| Spark version | 4.0.0 OSS (apache/spark) |
| Delta Lake | delta-spark 4.0.0 (OSS) |
| VM (driver) | Standard_D4ds_v4 · 4 vCPU · 16 GiB |
| VM (executor × 1) | Standard_D4ds_v4 · 4 vCPU · 16 GiB |
| Driver request | 3 vCPU · 11 GiB (10g + 1g overhead) |
| Executor request | 3 vCPU · 11 GiB (10g + 1g overhead) |
| Node pool | sparkbench · min 0 · max 3 |
| Autoscaler | Cluster Autoscaler (scale from 0) |
| Network | Azure CNI Overlay · Cilium |
| ADLS auth | SharedKey via k8s Secret |
| Image | crprdsphrbotweudev.azurecr.io/spark-bench:4.0.0 |
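For readers who want to reproduce a similar setup, the driver and executor requests above map onto standard Spark configuration keys. The sketch below is an assumption about how such a job could be configured, not the actual flow-switch.io job definition (which the report does not show); the helper name `spark_conf` is hypothetical, and with Spark Operator these values would normally live in the SparkApplication spec rather than in `--conf` flags.

```python
def spark_conf(memory: str = "10g", overhead: str = "1g",
               cores: int = 3, executors: int = 1) -> dict[str, str]:
    """Build Spark settings matching the reported pod requests (assumed mapping)."""
    return {
        # Driver: 10g heap + 1g overhead = 11 GiB pod request, 3 vCPU.
        "spark.driver.memory": memory,
        "spark.driver.memoryOverhead": overhead,
        "spark.kubernetes.driver.request.cores": str(cores),
        # One executor with the same sizing.
        "spark.executor.instances": str(executors),
        "spark.executor.memory": memory,
        "spark.executor.memoryOverhead": overhead,
        "spark.kubernetes.executor.request.cores": str(cores),
        # Settings needed to use OSS Delta Lake with Spark SQL.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog":
            "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def as_submit_args(conf: dict[str, str]) -> list[str]:
    """Flatten the settings into spark-submit style '--conf key=value' pairs."""
    args: list[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(as_submit_args(spark_conf())))
```

Requesting 3 of the 4 vCPUs and 11 of the 16 GiB leaves headroom on each node for system pods, which is presumably why the requests sit below the VM size.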
Databricks
| Setting | Value |
|---|---|
| Runtime | DBR 17.3.x-scala2.13 |
| Spark version | 4.0 (Databricks) |
| Delta Lake | Built-in (Databricks Delta) |
| Cluster type | Job cluster (kind: CLASSIC_PREVIEW) |
| Security mode | DATA_SECURITY_MODE_DEDICATED |
| VM (driver) | Standard_D4ds_v4 · 4 vCPU · 16 GiB |
| VM (workers × 1) | Standard_D4ds_v4 · 4 vCPU · 16 GiB |
| DBU rate | 0.921 DBU/hr (derived from billing) |
| DBU price | $0.40 / DBU (Premium Jobs Compute) |
| Unity Catalog | bench_dev.dbx.* (managed tables) |
| ADLS auth | Workload Identity (UC managed) |
| Job type | Python Wheel Task (DAB) |
📈 Cumulative cost vs number of runs

Per run: Databricks $0.1183 total · flow-switch.io ~$0.026 total, assuming every run pays its own cold start. At 20 runs Databricks costs $2.37 vs $0.52 for flow-switch.io, a $1.85 gap. The gap grows linearly at about $0.09 per run, so it adds up quickly for high-frequency jobs.
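The cumulative figures follow directly from the per-run totals. A minimal sketch, assuming each run provisions a fresh cluster (so the cold-start cost is paid every time) and using illustrative names:

```python
# Per-run totals (execution + cold start) from the results summary, in USD.
DBX_PER_RUN = 0.1183
FS_PER_RUN = 0.026   # approximate; includes an estimated cold start


def cumulative_cost(cost_per_run: float, runs: int) -> float:
    """Total cost after `runs` runs, each paying the full per-run cost."""
    return cost_per_run * runs


for runs in (1, 20, 100, 1000):
    dbx = cumulative_cost(DBX_PER_RUN, runs)
    fs = cumulative_cost(FS_PER_RUN, runs)
    print(f"{runs:>5} runs: Databricks ${dbx:.2f}, "
          f"flow-switch.io ${fs:.2f}, gap ${dbx - fs:.2f}")
```

If clusters are reused between runs, the cold-start share drops out and the per-run figures would need to be replaced by the execution-only costs.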

⏱ Cost vs job execution duration

The fixed cold-start cost ($0.086 on Databricks, ~$0.013 on flow-switch.io) is amortised over longer runs. For a 1-hour job, cold start is roughly 10% of the Databricks total ($0.086 of about $0.84 at $0.752/hr). flow-switch.io stays cheaper across all durations because the gap is structural: it carries no DBU charge.
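The relationship between job duration and total cost can be modelled as a fixed cold-start cost plus an hourly rate. The sketch below assumes that simple linear model and uses the rates quoted in this report; the function names are illustrative.

```python
# Hourly rates and cold-start costs quoted in this report (USD).
DBX_HOURLY, DBX_COLD = 0.752, 0.086
FS_HOURLY, FS_COLD = 0.384, 0.013   # cold start is an estimate


def total_cost(duration_s: float, hourly_rate: float, cold_start: float) -> float:
    """Fixed cold-start cost plus time-proportional running cost."""
    return cold_start + hourly_rate * duration_s / 3600


def cold_start_share(duration_s: float, hourly_rate: float, cold_start: float) -> float:
    """Fraction of the total cost that is attributable to cold start."""
    return cold_start / total_cost(duration_s, hourly_rate, cold_start)


for minutes in (2, 10, 60, 360):
    seconds = minutes * 60
    dbx = total_cost(seconds, DBX_HOURLY, DBX_COLD)
    fs = total_cost(seconds, FS_HOURLY, FS_COLD)
    share = cold_start_share(seconds, DBX_HOURLY, DBX_COLD)
    print(f"{minutes:>4} min: Databricks ${dbx:.3f} "
          f"(cold start {share:.1%}), flow-switch.io ${fs:.3f}")
```

Because flow-switch.io has both the lower fixed cost and the lower hourly rate in this model, its line never crosses the Databricks line.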

💰 Effective hourly rate breakdown
[Chart legend: flow-switch.io VM · Databricks VM · Databricks DBU]

Both platforms use identical VMs ($0.384/hr for 2 × Standard_D4ds_v4). The entire cost gap comes from the Databricks DBU charge ($0.368/hr = 0.921 DBU/hr × $0.40/DBU). The DBU rate is derived from system.billing.usage: 0.1607 DBU billed over a 628 s cluster lifetime.
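The derivation of the hourly DBU rate can be written out as a short calculation. This is a sketch using only the figures quoted above; the variable names are illustrative.

```python
# Inputs quoted in the report.
DBU_BILLED = 0.1607          # DBU reported by system.billing.usage
CLUSTER_LIFETIME_S = 628     # cluster lifetime in seconds
DBU_PRICE = 0.40             # USD per DBU (Premium Jobs Compute)
VM_HOURLY = 0.384            # USD/hr for 2 x Standard_D4ds_v4

# Scale the billed DBU to a per-hour rate, then price it.
dbu_per_hour = DBU_BILLED / CLUSTER_LIFETIME_S * 3600
dbu_cost_per_hour = dbu_per_hour * DBU_PRICE
databricks_hourly = VM_HOURLY + dbu_cost_per_hour

print(f"DBU rate:       {dbu_per_hour:.3f} DBU/hr")
print(f"DBU cost:       ${dbu_cost_per_hour:.3f}/hr")
print(f"Databricks:     ${databricks_hourly:.3f}/hr")
print(f"flow-switch.io: ${VM_HOURLY:.3f}/hr")
```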