Benchmark Guide: ClickHouse Performance on Commodity SSDs vs PLC Flash

untied
2026-02-14
11 min read

Reproducible ClickHouse benchmarks for NVMe vs PLC flash with scripts and cost-per-query analysis—practical guidance for analytics teams in 2026.

Why your analytics pipeline's storage choice is costing you time and money

If your ClickHouse cluster is feeling slow, brittle, or expensive to scale, the storage layer is a common culprit. Teams face two recurring pain points: unpredictable query tail latency under mixed loads, and runaway TCO when SSD prices spike. In 2026, those problems are getting louder — new high-density PLC flash promises far lower $/GB than mainstream NVMe, but with tradeoffs in endurance and latency. This guide shows you how to design, run, and publish reproducible benchmarks that compare ClickHouse performance on commodity NVMe drives versus emerging PLC flash — complete with scripts, measurement methodology, and a practical cost-per-query model your analytics team can adopt.

Executive summary — what you'll get and why it matters (most important first)

  • Reproducible benchmark design: hardware/software matrix, workload selection, measurement points.
  • Run-ready scripts: fio, nvme-cli, clickhouse-benchmark, synthetic data generation, and collection automation.
  • Interpretation and cost-per-query analysis: calculate when PLC makes economic sense for OLAP workloads.
  • Recommendations and migration checklist: when to adopt PLC vs keep NVMe and how to reduce risk.

Context in 2026 — why this comparison matters now

Late 2025 into 2026 has been a turning point: SSD suppliers like SK Hynix advanced PLC techniques (cell splitting and controller optimization) that make 5-bit and similar high-density flash more viable, and ClickHouse has continued rapid enterprise adoption (significant funding and ecosystem growth). That combination means storage choices are a strategic decision for analytics platforms, not just a hardware procurement detail.

PLC offers lower $/GB and higher capacities — attractive for large cold and warm OLAP clusters — but introduces new failure modes, endurance limits, and latency characteristics. ClickHouse is read-dominant, which favors PLC in many scenarios, but mixed ingestion workloads or replay-heavy pipelines can punish PLC endurance and write amplification.

Benchmark design principles (reproducibility first)

Good benchmarks are precise about scope, minimize confounding variables, and provide artifacts so others can reproduce results. Follow these principles:

  • Isolate storage: run on identical servers with only the NVMe/PLC device swapped.
  • Fix software: same kernel, same ClickHouse version (use a Docker image tag or package version), same ClickHouse config except for storage paths if necessary.
  • Measure a matrix of workloads: sequential read/write, random read/write, mixed OLAP queries (large aggregations), ingestion throughput, and tail latency under concurrency.
  • Include endurance and sustained tests: PLC will show different behavior after sustained writes; include longer runs and monitor SMART/health metrics.
  • Automate collection: store fio, iostat, nvme-cli, and ClickHouse metrics (system tables and the Prometheus endpoint, if enabled) in a reproducible artifact (CSV/JSON) and upload it to a repo.

Hardware and software test matrix (example)

Use this as a baseline. Replace models with the specific NVMe and PLC drives you have. Keep everything else identical.

  • Server: 2 x Intel Xeon Gold or AMD EPYC, 256 GB RAM, identical BIOS and kernel (Ubuntu 22.04/24.04 with tuned kernel 6.x).
  • Network: isolated 25GbE for control and data (if distributed), or run standalone on each server.
  • Drive A (NVMe): commodity 3D TLC NVMe, e.g., 2 TB class.
  • Drive B (PLC): emerging PLC NVMe (prototype/early production) — same capacity class as Drive A.
  • ClickHouse: version pinned (example: clickhouse-server 24.x/25.x — use a release available in 2026), same config template with only storage path changed.
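
One simple way to pin the ClickHouse version identically on both servers is to run the server from a tagged Docker image, with the data directory placed on the drive under test. This is a minimal sketch; the image tag, container name, and mount path are placeholders for whatever release and layout you standardize on.

# Run a pinned ClickHouse build; substitute your chosen tag and mount point.
docker run -d --name clickhouse-bench \
  --ulimit nofile=262144:262144 \
  -v /mnt/bench-drive/clickhouse:/var/lib/clickhouse \
  -p 9000:9000 -p 8123:8123 \
  clickhouse/clickhouse-server:24.8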

Workloads to run

Your workloads should map to real team pain points: heavy aggregation queries, ad-hoc SELECTs, streaming ingestion. Here are the minimum tests.

  1. fio microbenchmarks — characterize raw device behavior (sequential read/write, random 4K read/write, mixed 70/30 r/w). Run before ClickHouse setup to baseline the device.
  2. ClickHouse OLAP queries — use a TPC-H or TPC-H-like dataset scaled to disk (e.g., 1 TB to 5 TB) and run a mix of heavy aggregations and point lookups. Measure average and tail latency (P50, P95, P99).
  3. Concurrent ingestion + reads — simulate real-world conditions: a concurrent insert pipeline (Kafka engine or batch inserts) running alongside analytical queries to exercise write/read interference.
  4. Sustained write endurance stress — long-duration (24–72 hour) writes to see performance drift and controller garbage-collection behavior on PLC.

Run-ready scripts and commands

Below are concise, copy/paste-friendly snippets. Wrap them in automation (Ansible, bash, or GitHub Actions) for reproducibility. Place all outputs in a results directory for publishing.

1) Device baseline: fio microbench

fio --name=seqread --filename=/dev/nvme0n1 --rw=read --bs=1M --size=100G --numjobs=1 --time_based --runtime=300 --direct=1 --group_reporting --ioengine=libaio

fio --name=randrw --filename=/dev/nvme0n1 --rw=randrw --rwmixread=70 --bs=4k --size=200G --numjobs=16 --time_based --runtime=300 --direct=1 --group_reporting --ioengine=libaio

2) Collect device health and config

nvme id-ctrl /dev/nvme0n1 -o json > nvme_id_ctrl.json
nvme smart-log /dev/nvme0n1 > nvme_smart.log
lsblk -o NAME,SIZE,ROTA,MOUNTPOINT > lsblk.txt
uname -a > kernel.txt

3) ClickHouse data generation (sample table & insert)

Generate a synthetic dataset with ClickHouse's numbers table function and insert it into a MergeTree table sized to fill the device.

clickhouse-client --query "CREATE DATABASE IF NOT EXISTS benchmark"

clickhouse-client --query "
CREATE TABLE benchmark.events (
  event_date Date,
  user_id UInt64,
  event_type String,
  value Float64
) ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY user_id"

clickhouse-client --query "
INSERT INTO benchmark.events SELECT
  toDate('2026-01-01') + intDiv(number, 1000000) AS event_date,
  number AS user_id,
  arrayElement(['click', 'view', 'purchase'], rand() % 3 + 1) AS event_type,
  rand() % 1000 / 10.0 AS value
FROM numbers(500000000)"
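
Before running queries, it helps to confirm how much of the device the table actually occupies and scale the numbers() count above until you hit the footprint you want. A quick check against ClickHouse's system.parts table:

# Report active part count, row count, and on-disk size for the benchmark table.
clickhouse-client --query "
SELECT
  count() AS parts,
  sum(rows) AS rows,
  formatReadableSize(sum(bytes_on_disk)) AS on_disk
FROM system.parts
WHERE database = 'benchmark' AND table = 'events' AND active"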

4) ClickHouse query benchmarks (clickhouse-benchmark)

echo "SELECT event_type, count() FROM benchmark.events WHERE event_date BETWEEN '2026-01-01' AND '2026-01-30' GROUP BY event_type" | clickhouse-benchmark -i 100 -c 16 -t 60 > query_agg_results.txt

echo "SELECT count() FROM benchmark.events WHERE user_id = 123456789" | clickhouse-benchmark -i 100000 -c 250 -t 120 > query_point_results.txt

5) Concurrent ingestion + queries

# Start ingestion in background (batch inserts)
for i in {1..24}; do
  clickhouse-client --query="INSERT INTO benchmark.events SELECT toDate('2026-01-01') + intDiv(number,1000000), number, arrayElement(['click','view','purchase'], rand() % 3 + 1), rand()%1000/10.0 FROM numbers(20000000)" &
done

# Run queries while ingestion is ongoing
clickhouse-benchmark -c 32 -t 180 < queries.sql > concurrent_results.txt
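
The contents of queries.sql are up to you; clickhouse-benchmark expects one query per line. A plausible starter mix (assuming the benchmark.events table above) combines a heavy aggregation, a full-table rollup, and a point lookup:

# Write a small mixed workload file, one query per line.
cat > queries.sql <<'EOF'
SELECT event_type, count(), avg(value) FROM benchmark.events WHERE event_date BETWEEN '2026-01-01' AND '2026-01-30' GROUP BY event_type
SELECT event_date, sum(value) FROM benchmark.events GROUP BY event_date ORDER BY event_date
SELECT count() FROM benchmark.events WHERE user_id = 123456789
EOF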

Collecting metrics and artifacts

Save the following for each run and include them with any published benchmark so readers can verify conditions: fio outputs, nvme smart logs, /proc/diskstats, iostat, vmstat, ClickHouse metrics (system.metrics, system.events, and the Prometheus endpoint if enabled in the server config), and the ClickHouse server config.
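
A minimal collection sketch that snapshots these sources around a run; the device name, output layout, and sampling intervals are assumptions to adapt to your environment.

# collect_metrics.sh (sketch): snapshot device and ClickHouse counters for one run.
RESULTS=results/$(date +%Y%m%dT%H%M%S)
mkdir -p "$RESULTS"

cat /proc/diskstats         > "$RESULTS/diskstats.before"
nvme smart-log /dev/nvme0n1 > "$RESULTS/nvme_smart.before"

# Background system-level sampling for the duration of the benchmark (kill when done).
iostat -x 5 > "$RESULTS/iostat.log" &
vmstat 5    > "$RESULTS/vmstat.log" &

# ClickHouse-side counters from system tables.
clickhouse-client --query "SELECT * FROM system.metrics FORMAT CSVWithNames" > "$RESULTS/ch_metrics.csv"
clickhouse-client --query "SELECT * FROM system.events FORMAT CSVWithNames" > "$RESULTS/ch_events.csv"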

How to analyze results — what matters

Don't just compare raw throughput numbers. For analytics workloads, focus on these metrics in priority order:

  • P99 and P99.9 latency for queries — these determine user and downstream-job SLAs.
  • Sustained throughput during mixed read/write phases.
  • Performance drift over time — especially after sustained writes (PLC controllers may throttle or degrade performance during GC events).
  • SMART health & wear — estimate TBW used and projected lifespan under your write profile.
  • CPU offload / controller differences that might affect query CPU time indirectly (e.g., if the PLC drive causes more I/O wait).
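
For the latency numbers above, one convenient source is ClickHouse's own query log, which is enabled by default in recent releases. A sketch that pulls P50/P95/P99 for queries against the benchmark table over the last hour (adjust the window and filter to match your run):

# Pull latency quantiles for benchmark queries from system.query_log.
clickhouse-client --query "
SELECT
  quantile(0.50)(query_duration_ms) AS p50_ms,
  quantile(0.95)(query_duration_ms) AS p95_ms,
  quantile(0.99)(query_duration_ms) AS p99_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 HOUR
  AND query LIKE '%benchmark.events%'"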

Cost-per-query model (practical, reproducible)

Analytics teams need a simple way to translate hardware costs into operational unit economics. This model amortizes storage cost over query volume and includes power and replacement for endurance-limited devices.

Model variables

  • Drive_Cost (USD) — purchase cost of the drive.
  • Capacity (GB).
  • Annual_Power_Cost (USD/year) — device power draw * electricity rate.
  • Expected_Lifetime_Years — based on TBW and your write rate.
  • Total_Queries_Per_Year — estimated queries hitting that drive's storage (or cluster-wide allocated proportion).

Cost-per-query formula

Basic formula, per-drive:

Amortized_Hardware_Per_Query = Drive_Cost / (Expected_Lifetime_Years * Total_Queries_Per_Year)
Amortized_Power_Per_Query = Annual_Power_Cost / Total_Queries_Per_Year
Cost_Per_Query = Amortized_Hardware_Per_Query + Amortized_Power_Per_Query + (Maintenance_Percentage * Drive_Cost) / Total_Queries_Per_Year

Example calculation (numbers for illustration)

Assume a 2 TB NVMe costs $300 and a 4 TB PLC drive costs $400 (PLC is often cheaper per GB at larger capacities). Suppose the cluster serves 200M queries/year per drive's data footprint and the power difference is negligible relative to query CPU costs.

  • NVMe: Drive_Cost = $300, Lifetime = 5 years => hardware per query = $300/(5*200M) = $0.0000003 (0.00003¢)
  • PLC: Drive_Cost = $400, Lifetime = 3 years (lower endurance) => hardware per query = $400/(3*200M) = $0.000000667 (0.0000667¢)
  • If PLC gives better capacity (4 TB vs 2 TB), you can allocate more queries per drive, effectively halving the per-query cost. Also, if PLC reduces the cluster node count, that reduces per-query cost materially.

The point: raw Drive_Cost alone is not decisive. You must model capacity, lifetime, and the impact on cluster sizing (fewer nodes because of higher capacity can lower per-query compute costs and amortize PLC benefits).
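
The formula translates directly into a small script. This sketch uses the illustrative PLC numbers from the example above; the power and maintenance figures are assumptions you should replace with measured values.

# cost_per_query.sh (sketch): amortized cost per query for one drive.
DRIVE_COST=400             # USD (illustrative PLC price from the example)
LIFETIME_YEARS=3
QUERIES_PER_YEAR=200000000
ANNUAL_POWER_COST=10       # USD/year, assumed; measure your own draw
MAINTENANCE_PCT=0.05       # assumed annual maintenance as a fraction of drive cost

awk -v c="$DRIVE_COST" -v l="$LIFETIME_YEARS" -v q="$QUERIES_PER_YEAR" \
    -v p="$ANNUAL_POWER_COST" -v m="$MAINTENANCE_PCT" 'BEGIN {
  hw    = c / (l * q)      # Amortized_Hardware_Per_Query
  power = p / q            # Amortized_Power_Per_Query
  maint = (m * c) / q      # maintenance share per query
  printf "cost_per_query = %.10f USD\n", hw + power + maint
}'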

Interpreting PLC vs NVMe for ClickHouse — practical takeaways

  • Read-heavy, large-volume OLAP: PLC is attractive. ClickHouse reads are dominated by large sequential scans of compressed columns, so PLC's lower $/GB and higher density can reduce overall TCO.
  • Write-heavy ingestion or frequent merges: NVMe is likely better due to higher endurance and more consistent write latency. PLC controller garbage collection can cause latency spikes that blow out the P99 of ingestion-sensitive pipelines.
  • Mixed workloads with strict SLAs: prefer NVMe or hybrid designs — use PLC for cold/warm datasets and NVMe for hot, frequently-updated partitions.
  • Capacity-driven consolidation: if PLC enables halving the number of nodes, compute and networking savings often outweigh slightly higher per-drive failure risk. Run your cost-per-query model to validate and consider edge migration patterns if you're distributing storage across regions.

Migration patterns and risk mitigation

If you decide to pilot PLC in production, follow these steps to reduce risk.

  1. Start with cold/warm tiers: move historical partitions or infrequently mutated data to PLC-backed nodes.
  2. Use over-provisioning: reserve 20–30% of capacity for controller-level wear leveling (set the drive's OP if the vendor supports it).
  3. Enable SMART monitoring and integrate into your alerting pipeline for TBW and media errors.
  4. Run periodic endurance tests in staging that mimic your production compaction and merge patterns.
  5. Plan for replacement windows — PLC may require more frequent replacements; ensure spare inventory and automation for node rollouts.
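
For the SMART-based alerting step, a minimal polling sketch is below; field labels vary across nvme-cli versions and drive firmware, so verify the awk patterns against your drive's actual smart-log output, and swap the mail command for your own alerting hook.

# Alert when NVMe wear or media errors cross thresholds (e.g., run hourly from cron).
DEV=/dev/nvme0n1
PCT_USED=$(nvme smart-log "$DEV" | awk -F: '/percentage_used/ {gsub(/[% ]/, "", $2); print $2}')
MEDIA_ERR=$(nvme smart-log "$DEV" | awk -F: '/media_errors/ {gsub(/[ ,]/, "", $2); print $2}')

if [ "${PCT_USED:-0}" -ge 80 ] || [ "${MEDIA_ERR:-0}" -gt 0 ]; then
  echo "NVMe health warning on $DEV: percentage_used=${PCT_USED}% media_errors=${MEDIA_ERR}" \
    | mail -s "NVMe wear alert" oncall@example.com   # placeholder; use your alerting webhook
fi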

Common pitfalls — what teams get wrong

  • Overemphasizing peak throughput — peak fio numbers don't reflect real ClickHouse queries and tail latency under concurrency.
  • Ignoring write amplification — ClickHouse merges cause write amplification, especially for MergeTree tables with many small parts; factor this into TBW usage estimates.
  • Skipping long runs — short benchmarks miss controller GC cycles that degrade PLC performance after sustained writes.
  • Failing to publish artifacts — without raw outputs, others can't validate your claims; publish fio logs, ClickHouse metrics, drive SMART logs, and config files.
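
To put a rough number on write amplification, compare device-level bytes written (from SMART) against the bytes ClickHouse logically ingested over the same window; per the NVMe spec, one data unit is 1,000 sectors of 512 bytes. A hedged sketch, with the same caveat about field parsing as above:

# Estimate device-level bytes written across an ingestion run.
before=$(nvme smart-log /dev/nvme0n1 | awk -F: '/data_units_written/ {gsub(/[ ,]/, "", $2); print $2}')
# ... run the ingestion workload here ...
after=$(nvme smart-log /dev/nvme0n1 | awk -F: '/data_units_written/ {gsub(/[ ,]/, "", $2); print $2}')

device_bytes=$(( (after - before) * 512000 ))   # 1 data unit = 1000 x 512-byte sectors
echo "Device bytes written: $device_bytes"
# Divide by the logical bytes inserted (e.g., written_bytes summed from system.query_log
# for your INSERTs) to approximate the write amplification factor.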

Expect continued PLC maturation in 2026: controller firmware, error correction, and enterprise-grade over-provisioning will improve. Vendor ecosystems will release PLC-optimized NVMe firmware and enterprise SKUs. ClickHouse's enterprise ecosystem is growing rapidly (major funding and platform commitments through early 2026), increasing pressure on vendors to provide predictable, high-density storage tuned for OLAP.

Prediction: by late 2026, hybrid cluster tiering will be the dominant pattern — hot NVMe tiers for ingestion and strict SLAs, and PLC-backed warm tiers for high-capacity, lower-cost storage of historical data. Teams that invest in reproducible benchmarking and automation will be best positioned to exploit the cost advantages without sacrificing reliability.

Publishing your benchmark — reproducibility checklist

When you publish results, include:

  • Hardware BOM (exact drive models, firmware versions).
  • Software stack (OS kernel, ClickHouse version, driver versions).
  • All scripts and raw output files (fio, clickhouse-benchmark, nvme smart logs).
  • Metric dashboards or exported CSVs (P50/P95/P99 values and time-series).
  • Cost model spreadsheet and assumptions used in cost-per-query calculations.

Sample GitHub repo layout (what to include)

clickhouse-plc-benchmarks/
├── docs/                  # methodology and README
├── scripts/
│   ├── run_fio.sh
│   ├── run_clickhouse_bench.sh
│   └── collect_metrics.sh
├── configs/
│   └── clickhouse-server.xml
├── results/
│   ├── nvme/              # timestamped runs per drive
│   └── plc/
├── cost_model/            # spreadsheet and python calculator
└── LICENSE

Final recommendations — actionable next steps for analytics teams

  1. Clone the example repo and fill in your drive models and cluster sizing assumptions.
  2. Run fio microbenchmarks on candidate devices to validate vendor claims in your environment.
  3. Run the ClickHouse OLAP and mixed-workload benchmarks for 48–72 hours to reveal endurance effects.
  4. Calculate cost-per-query including replacement cadence and cluster consolidation benefits.
  5. Pilot PLC for warm partitions only, with SMART-based alerting and automated node replacement in place.
"Data-first decisions beat vendor brochures. Measure in your workload, publish artifacts, and make the storage tradeoff explicit in cost-per-query terms." — Recommended team operating principle, 2026

Call to action

Ready to run this benchmark in your environment? Grab the reproducible toolkit and starter scripts from our repo and open an issue with your drive models — we’ll help interpret results and tune ClickHouse configs for PLC media. Visit https://github.com/untied-dev/clickhouse-plc-benchmarks to get started, or contact us for a hands-on benchmarking engagement.
