Self-Hosted RUM Pipeline with ClickHouse

You have field metrics arriving from navigator.sendBeacon() and you need somewhere to put them that answers “what is our p75 LCP for mobile users in Germany over the last 7 days?” in milliseconds, not seconds — without paying per-event vendor pricing. This page walks the exact, end-to-end build of that pipeline on ClickHouse: a tiny ingestion endpoint, a MergeTree table sorted for time-series reads, async inserts to tame part fragmentation, and a materialized view that pre-aggregates p75 rollups so dashboards never scan raw rows. It is the storage-and-query layer that sits behind self-hosted beacon collection; if you have not yet shipped a collector, read that first and return here once beacons are landing.

The hard part is not the SQL — it is choosing an ORDER BY that matches your dashboard predicates, controlling write amplification so background merges keep up, and computing percentiles cheaply enough to serve interactive dashboards. We cover all three with real, runnable code.

Self-hosted RUM pipeline data flow Browser sends beacons to an ingestion endpoint, which async-inserts JSON rows into a ClickHouse MergeTree table; a materialized view continuously rolls raw rows into per-day p75 aggregates that dashboards read. Browser web-vitals + sendBeacon Ingestion endpoint validate + async insert MergeTree rum_events (raw) Materialized view p75 daily rollup Grafana reads rollup, not raw Raw rows expire via TTL; the rollup stays small and is what dashboards scan.
Beacons land on an ingestion endpoint, async-insert into a raw MergeTree, and a materialized view continuously computes per-day p75 rollups. See Grafana dashboards for web performance for the read side.

Prerequisites

Before you run any SQL below, confirm the following are in place:

  • A ClickHouse server (24.x or newer) reachable over HTTP on port 8123, with a database to own these tables. A single node is fine for under ~50k beacons/second; shard only when one node’s merges fall behind.
  • A working collector emitting one JSON object per page view, captured with the web-vitals library and PerformanceObserver and flushed on visibilitychange. The fields this schema expects are listed in the table below.
  • A decision on sampling. If you are keeping 100% of traffic you can skip it, but for high-volume sites apply deterministic session sampling that preserves the p75 aggregation at the collector so the percentile math stays unbiased.
  • Node 18+ (for the fetch-based inserter) or a Cloudflare Workers runtime if your endpoint lives at the edge.

The beacon contract this pipeline assumes:

Field Type Source Notes
timestamp DateTime64(3) server clock set on the endpoint, never trust the client clock
project_id LowCardinality(String) collector config a handful of distinct values
session_id String first-party token used for sampling and stitching, not stored long-term
page_path LowCardinality(String) location.pathname path only — strip query strings to bound cardinality
device_tier LowCardinality(String) client hints e.g. low / mid / high
country LowCardinality(String) edge geo header derived server-side, never from the payload
lcp_ms UInt32 web-vitals 0 when not captured
inp_ms UInt32 web-vitals 0 when not captured
cls Float32 web-vitals unitless

How to build the pipeline

1. Create the raw events table

Start with the MergeTree table that receives every beacon. The ORDER BY is the single most important decision: ClickHouse stores rows physically in this order and uses it as a sparse primary index, so it must lead with the columns your dashboards filter on and end with timestamp for time-range scans.

CREATE TABLE rum_events
(
    timestamp     DateTime64(3),
    project_id    LowCardinality(String),
    session_id    String,
    page_path     LowCardinality(String),
    device_tier   LowCardinality(String),
    country       LowCardinality(String),
    lcp_ms        UInt32 CODEC(T64, ZSTD(1)),
    inp_ms        UInt32 CODEC(T64, ZSTD(1)),
    cls           Float32 CODEC(ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (project_id, device_tier, country, timestamp)
TTL toDateTime(timestamp) + INTERVAL 90 DAY DELETE
SETTINGS index_granularity = 8192;

Why each choice matters:

  • ORDER BY (project_id, device_tier, country, timestamp) — dashboards almost always filter by project, often by device class and geography, then scan a time window. Leading with the low-cardinality dimensions lets the sparse index skip whole granules before it ever touches a time range. Putting timestamp last keeps each (project, tier, country) slice contiguous and time-ordered.
  • PARTITION BY toYYYYMM(timestamp) — monthly partitions keep the partition count low (avoid daily partitions, which produce too many parts) while still letting TTL and old-data drops operate at partition granularity.
  • LowCardinality(String) — for categorical dimensions this dictionary-encodes values, cutting disk footprint 60–80% and making dimensional filtering far faster than plain String.
  • CODEC(T64, ZSTD(1))T64 transposes integer bits so similar magnitudes compress together before ZSTD; ideal for millisecond timings clustered in a narrow range.
  • TTL ... DELETE — raw rows self-expire after 90 days. The rollup view (step 4) holds the long-term history, so raw retention only needs to cover ad-hoc drill-downs.

2. Write the ingestion endpoint

The endpoint’s job is small and strict: parse one JSON beacon, stamp the server timestamp and geo, reject anything malformed, and hand the row to ClickHouse. Critically, it inserts with async inserts enabled so ClickHouse — not your endpoint — does the batching. This is what lets you accept one HTTP request per beacon without melting the merge scheduler.

// ingest.js — minimal Node 18+ RUM ingestion endpoint
import { createServer } from "node:http";

const CH_URL = process.env.CLICKHOUSE_URL ?? "http://localhost:8123";
const CH_AUTH = "Basic " + Buffer.from(
  `${process.env.CH_USER ?? "default"}:${process.env.CH_PASSWORD ?? ""}`
).toString("base64");

// async_insert lets ClickHouse buffer many single-row INSERTs into one part.
const INSERT_SQL =
  "INSERT INTO rum_events FORMAT JSONEachRow " +
  "SETTINGS async_insert=1, wait_for_async_insert=0";

function clamp(n, max) {
  const v = Number(n);
  return Number.isFinite(v) && v >= 0 && v <= max ? Math.round(v) : 0;
}

createServer((req, res) => {
  if (req.method !== "POST" || !req.url.startsWith("/rum/collect")) {
    res.writeHead(404).end();
    return;
  }
  let body = "";
  let size = 0;
  req.on("data", (chunk) => {
    size += chunk.length;
    if (size > 64 * 1024) req.destroy();        // bound payload to 64KB
    body += chunk;
  });
  req.on("end", async () => {
    let b;
    try { b = JSON.parse(body); } catch { res.writeHead(400).end(); return; }

    const row = {
      timestamp: new Date().toISOString().replace("T", " ").replace("Z", ""),
      project_id: String(b.project_id ?? "unknown").slice(0, 64),
      session_id: String(b.session_id ?? "").slice(0, 64),
      page_path: String(b.page_path ?? "/").split("?")[0].slice(0, 256),
      device_tier: String(b.device_tier ?? "unknown").slice(0, 16),
      country: req.headers["cf-ipcountry"] ?? "ZZ",   // geo from edge header
      lcp_ms: clamp(b.lcp_ms, 120000),
      inp_ms: clamp(b.inp_ms, 60000),
      cls: Math.max(0, Math.min(Number(b.cls) || 0, 10)),
    };

    const r = await fetch(`${CH_URL}/?query=${encodeURIComponent(INSERT_SQL)}`, {
      method: "POST",
      headers: { Authorization: CH_AUTH, "Content-Type": "application/json" },
      body: JSON.stringify(row) + "\n",
    });
    res.writeHead(r.ok ? 204 : 502).end();
  });
}).listen(8787, () => console.log("RUM ingest on :8787"));

Why this shape:

  • async_insert=1, wait_for_async_insert=0 — every beacon arrives as its own HTTP request, but ClickHouse coalesces them into one part flushed on a size/time threshold. Without this, one-row-per-INSERT traffic creates thousands of tiny parts per minute and background merges never catch up (the “too many parts” failure mode).
  • Server-stamped timestamp and country — the client clock is untrustworthy and the payload must never carry geo; both are set at the trusted edge.
  • 64KB cap and field clamping — bound the payload to the sendBeacon queue limit and coerce every numeric field into a sane range so a malformed beacon can never poison an aggregate.

Deploying at the edge instead? The same logic ports directly to a Worker — see building a RUM ingestion endpoint on Cloudflare Workers for the runtime-specific batching with ctx.waitUntil.

3. Tune async-insert behaviour

Async inserts are governed by two thresholds; the first to trip flushes the buffer. Set them as profile settings so every inserting connection inherits them.

-- Flush a buffered batch at 1 MB of data OR after 1 second, whichever first.
ALTER USER default SETTINGS
    async_insert_max_data_size = 1048576,
    async_insert_busy_timeout_ms = 1000;

Why: a 1 second / 1 MB window keeps parts large enough that merges stay ahead, while bounding visibility lag to about a second — fast enough for an operational RUM dashboard. Raise the timeout if your traffic is bursty and you want fewer, larger parts; lower it if you need near-real-time rows and can tolerate more parts.

4. Add the p75 rollup materialized view

Scanning raw rows for every dashboard refresh wastes CPU. A materialized view computes per-day, per-dimension percentile states as rows insert, so dashboards read a table that is orders of magnitude smaller. Percentile state is stored with quantileTDigestState, which is mergeable across granules and partitions — exactly what an aggregating rollup needs.

CREATE TABLE rum_rollup_daily
(
    day          Date,
    project_id   LowCardinality(String),
    device_tier  LowCardinality(String),
    country      LowCardinality(String),
    sessions     AggregateFunction(count),
    lcp_p75      AggregateFunction(quantileTDigest(0.75), UInt32),
    inp_p75      AggregateFunction(quantileTDigest(0.75), UInt32),
    cls_p75      AggregateFunction(quantileTDigest(0.75), Float32)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (project_id, device_tier, country, day);

CREATE MATERIALIZED VIEW rum_rollup_daily_mv TO rum_rollup_daily AS
SELECT
    toDate(timestamp)                       AS day,
    project_id,
    device_tier,
    country,
    countState()                            AS sessions,
    quantileTDigestState(0.75)(lcp_ms)      AS lcp_p75,
    quantileTDigestState(0.75)(inp_ms)      AS inp_p75,
    quantileTDigestState(0.75)(cls)         AS cls_p75
FROM rum_events
WHERE lcp_ms > 0
GROUP BY day, project_id, device_tier, country;

Why this works:

  • quantileTDigestState stores a compact t-digest sketch, not the raw values. AggregatingMergeTree merges these sketches as parts compact, so the rollup stays a few thousand rows per day regardless of beacon volume. Use quantileTDigest (t-digest) over quantileExact here because it is mergeable and memory-bounded; reserve quantileExact for one-off audits where you need the true percentile.
  • WHERE lcp_ms > 0 drops rows where LCP was never captured (bfcache restores, instant aborts) so they do not deflate the percentile.
  • The view only transforms rows inserted after it is created. Backfill history once with INSERT INTO rum_rollup_daily SELECT ... FROM rum_events using the same aggregation.

5. Query the rollup from a dashboard

Dashboards read the rollup and call the matching Merge-combinator function to finalize the sketch into a number. This scans the small aggregate table, not raw events.

SELECT
    day,
    countMerge(sessions)                  AS sessions,
    round(quantileTDigestMerge(0.75)(lcp_p75))  AS p75_lcp_ms,
    round(quantileTDigestMerge(0.75)(inp_p75))  AS p75_inp_ms,
    round(quantileTDigestMerge(0.75)(cls_p75), 3) AS p75_cls
FROM rum_rollup_daily
WHERE project_id = 'prod-web'
  AND device_tier = 'mid'
  AND day >= today() - 28
GROUP BY day
ORDER BY day DESC;

This is the query you point a Grafana panel at. Because (project_id, device_tier, country, day) leads the rollup’s ORDER BY, the dimension filters hit the primary index directly and the scan touches only the relevant granules.

Verifying it works

Confirm the pipeline end-to-end, in order:

  1. Beacon lands. Fire a synthetic beacon and expect HTTP 204:
    curl -s -o /dev/null -w "%{http_code}\n" -X POST \
      -H "Content-Type: application/json" \
      -d '{"project_id":"prod-web","session_id":"t1","page_path":"/home","device_tier":"mid","lcp_ms":1900,"inp_ms":120,"cls":0.04}' \
      http://localhost:8787/rum/collect
  2. Row is queryable after the async-insert window (≤ ~1 s):
    SELECT count(), max(timestamp) FROM rum_events WHERE project_id = 'prod-web';
  3. Rollup populated. The materialized view should have a matching daily row:
    SELECT day, countMerge(sessions)
    FROM rum_rollup_daily WHERE project_id = 'prod-web' GROUP BY day;
  4. Merges are keeping up. Active part count per partition should stay low (tens, not hundreds):
    SELECT partition, count() AS parts
    FROM system.parts WHERE table = 'rum_events' AND active
    GROUP BY partition ORDER BY parts DESC;
  5. The index is used. Confirm dashboard queries prune granules instead of full-scanning:
    EXPLAIN indexes = 1
    SELECT quantileTDigestMerge(0.75)(lcp_p75) FROM rum_rollup_daily
    WHERE project_id = 'prod-web' AND day >= today() - 7;
    The ReadFromMergeTree node should show Granules: <selected>/<total> with selected far below total.

In a live dashboard, a healthy pipeline shows a p75 LCP line that tracks the 2.5 s Good boundary, p75 INP under 200 ms, and p75 CLS under 0.1 — the current Google field-data thresholds.

Edge cases & gotchas

  • “Too many parts” under async inserts. If you still see this, your async-insert window is too small or wait_for_async_insert=0 is being overridden per query. Raise async_insert_busy_timeout_ms, and never wrap async inserts in a per-row loop that also forces a sync flush.
  • Zero-value metrics skew percentiles. Beacons from bfcache restores or aborted loads report 0 for uncaptured metrics. The rollup’s WHERE lcp_ms > 0 filter handles LCP, but apply the same guard per metric if you split them into separate views — otherwise zeros drag p75 below reality.
  • LowCardinality on truly high-cardinality columns backfires. page_path is LowCardinality only because query strings are stripped. If you store full URLs with parameters the dictionary explodes and both inserts and reads slow down. Always normalize the path server-side.
  • Materialized views are insert-time only. Creating the view does not transform existing rows, and altering the view’s SELECT does not re-process history — you must backfill manually. Test view logic on a sampled subset before pointing it at production inserts.
  • TTL deletes raw rows the rollup still implies. Raw rum_events expires at 90 days but rum_rollup_daily has no TTL by design. If you add one, make it strictly longer than any dashboard’s time window, or trend charts will silently truncate.
  • Clock trust. Never ORDER BY or partition on a client-supplied timestamp. A single device with a wrong clock can scatter rows across partitions and wreck merge locality; always stamp time at the endpoint.

FAQ

Why a materialized view instead of querying raw events directly?

Interactive dashboards re-run their query on every refresh and time-range change. Scanning raw rows for percentiles each time burns CPU proportional to traffic. The materialized view pre-aggregates t-digest sketches at insert time, so reads scan a small rollup table whose size depends on dimension count, not beacon volume.

Should I use quantileTDigest or quantileExact for p75?

Use quantileTDigest for rollups and dashboards: its sketch is mergeable across parts and memory-bounded, which is what makes the AggregatingMergeTree rollup viable. Reserve quantileExact for occasional audits where you need the precise percentile and can afford to scan and hold all values in memory.

How do async inserts prevent the “too many parts” error?

Each beacon arrives as its own request, but with async_insert=1 ClickHouse buffers rows server-side and flushes them as one part when the size or time threshold trips. That turns thousands of tiny single-row parts into a few large ones, keeping background merges ahead of ingestion.