GDPR-Compliant RUM Without Cookies

You want field performance data (real LCP, INP, and CLS from real users), but legal has flagged that every persistent identifier on the page now needs consent under Article 5(3) of the ePrivacy Directive, with the GDPR governing any personal data you then process. The moment you write a cookie, localStorage key, or fingerprint hash to stitch pageviews together, you are “storing or accessing information on a terminal,” which triggers a consent gate that blocks the very beacon you need. This page, part of Privacy-Compliant Tracking, shows the client and edge code for RUM that stores nothing on the device, carries no PII, and therefore needs no consent banner to fire, while still producing statistically valid p75 numbers.

The architecture rests on four hard rules: a per-pageload id that lives only in a JavaScript variable (never persisted), IP addresses truncated at the edge before any application code sees them, URLs stripped of query strings before transmission, and zero cross-session linking. Get those four right and the telemetry is, in regulatory terms, non-personal aggregate performance data.

[Figure: Cookieless, PII-free RUM data flow. Browser (memory): crypto.randomUUID(), drop ?query, no storage → beacon → Edge worker: truncate IP octet, drop Cookie/Referer, validate schema → insert → Store: aggregate rows, 30-day TTL, p75 views. Never leaves the device or persists anywhere: cookie, localStorage, fingerprint, full IP, raw URL. Kept: metric values, route path, coarse country, truncated network class.]
The ephemeral id exists only to dedupe metrics within one pageload; nothing links two loads. See the parent Privacy-Compliant Tracking overview for the full legal model.

Prerequisites

Before wiring this up, confirm the following are already in place:

  • A first-party ingestion host you control (a same-site path like /rum or a subdomain), so the beacon is not a third-party request. The mechanics of receiving and validating these payloads belong to Self-Hosted Beacon Collection.
  • An edge layer (Cloudflare Worker, Fastly, or an Nginx reverse proxy) that can rewrite request headers and read the connecting IP before your application does.
  • A columnar or time-series store (ClickHouse, TimescaleDB, BigQuery) where rows carry no row-level identity.
  • A documented lawful basis. Cookieless, PII-free RUM normally rests on legitimate interest under Article 6(1)(f), recorded in a short legitimate-interest assessment.
  • Agreement with legal/DPO on a retention window. This guide assumes a rolling 30 days.

Threshold reference

Aggregate at the 75th percentile so a single slow tail load does not move the headline number. Use these exact bands when you build alerting and dashboards on the collected data:

Metric   Good       Needs Improvement   Poor
LCP      ≤ 2.5 s    ≤ 4.0 s             > 4.0 s
INP      ≤ 200 ms   ≤ 500 ms            > 500 ms
CLS      ≤ 0.1      ≤ 0.25              > 0.25
FCP      ≤ 1.8 s
TTFB     ≤ 800 ms
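
For alerting, the bands above can be turned into a small classifier. This is only a sketch: the THRESHOLDS object and the rate helper are hypothetical names, and it covers just the three metrics for which the table gives both a Good and a Poor bound.

```javascript
// Thresholds copied from the table above. LCP and INP are in milliseconds,
// CLS is unitless. FCP and TTFB are omitted because the table lists only
// their "good" bound.
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 },
  INP: { good: 200, poor: 500 },
  CLS: { good: 0.1, poor: 0.25 },
};

// Hypothetical helper: classify a p75 value into one of the three bands.
function rate(metric, p75) {
  const t = THRESHOLDS[metric];
  if (!t) return 'unknown';
  if (p75 <= t.good) return 'good';
  if (p75 <= t.poor) return 'needs-improvement';
  return 'poor';
}
```

Because the bounds are inclusive, a p75 LCP of exactly 2500 ms rates as good and 2501 ms as needs-improvement.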

How to implement cookieless, PII-free RUM

1. Mint an ephemeral id in memory only

Generate a per-pageload UUID with crypto.randomUUID() and hold it in a closure variable. It is never written to document.cookie, localStorage, sessionStorage, or IndexedDB, so it cannot be read on a later load. Its only job is to deduplicate the partial and final beacon from the same pageload at ingestion.

// Lives only in this module's scope; gone when the page unloads.
const pageId = crypto.randomUUID();
const metrics = Object.create(null);

Why: because the id never reaches a storage API, there is no “access to information stored on the terminal,” and two separate visits produce two unrelated ids — making cross-session linking technically impossible rather than merely promised.
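
The ingestion-side deduplication this id enables could look roughly like the following. It is a sketch under assumptions: dedupeByPid is a hypothetical helper, and it assumes each row carries the pid and final fields from the payload plus a received_at timestamp added at the edge.

```javascript
// Hypothetical ingestion-side helper: keep one row per pid, preferring the
// final beacon and otherwise the most recently received partial.
function dedupeByPid(rows) {
  const best = new Map();
  for (const row of rows) {
    const prev = best.get(row.pid);
    const better =
      !prev ||
      (row.final && !prev.final) ||
      (row.final === prev.final && row.received_at >= prev.received_at);
    if (better) best.set(row.pid, row);
  }
  return [...best.values()];
}
```

Nothing here joins rows across different pid values, so the step cannot reintroduce cross-session linking.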

2. Capture vitals with the web-vitals library

Use the web-vitals library so INP, LCP, and CLS follow Google’s exact algorithms instead of an approximation. Register handlers as early as possible so buffered entries are not missed.

import { onLCP, onINP, onCLS, onFCP, onTTFB } from 'web-vitals';

function record({ name, value }) {
  // Keep only the metric name and rounded value — nothing else.
  metrics[name] = Math.round(value * 1000) / 1000;
}

onLCP(record);
onINP(record);
onCLS(record);
onFCP(record);
onTTFB(record);

Why: the library handles cross-origin LCP timing, the INP rolling buffer, and CLS session-window logic correctly, so your privacy guarantees do not come at the cost of metric accuracy.

3. Strip the URL down to the route path

Before building the payload, drop the query string and fragment. Query strings routinely carry email tokens, search terms, and ad identifiers — all potential PII. Send only location.pathname, optionally normalized to a route template.

function safePath() {
  // Drop ?query and #fragment; collapse numeric/uuid ids to a template.
  return location.pathname
    .replace(/\/\d+(?=\/|$)/g, '/:id')
    .replace(/\/[0-9a-f]{8}-[0-9a-f-]{27}(?=\/|$)/gi, '/:uuid');
}

Why: a raw URL such as /reset?token=abc123&email=a@b.com is personal data; the templated path /reset is not, and it also aggregates far better at p75 across thousands of distinct ids.
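
To check the templating outside a browser, the same replacements can be wrapped in a pure function that takes a URL string. The name toRoutePath and the placeholder base URL are assumptions made for this sketch, and the UUID pattern is spelled out in full 8-4-4-4-12 form.

```javascript
// Hypothetical pure variant of safePath(): accepts any URL or path string.
function toRoutePath(rawUrl) {
  // The base is a placeholder so relative paths parse; only pathname is kept,
  // which drops the query string and fragment.
  const { pathname } = new URL(rawUrl, 'https://example.invalid');
  return pathname
    .replace(/\/\d+(?=\/|$)/g, '/:id')
    .replace(
      /\/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=\/|$)/gi,
      '/:uuid'
    );
}

toRoutePath('/reset?token=abc123&email=a@b.com');               // '/reset'
toRoutePath('/orders/48213/items/7#top');                       // '/orders/:id/items/:id'
toRoutePath('/u/3f2b8c1e-9d4a-4e6b-8f1a-2c3d4e5f6a7b/profile'); // '/u/:uuid/profile'
```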

4. Build a payload that carries no identity

Assemble only metric values, the safe path, the ephemeral pageId, and a few coarse, low-entropy context fields. No user agent string, no precise viewport, no connection RTT.

function buildPayload(isFinal) {
  return {
    v: 1,
    pid: pageId,                 // ephemeral; dedupes this load's beacons only
    path: safePath(),
    final: isFinal,
    nav: performance.getEntriesByType('navigation')[0]?.type ?? 'navigate',
    // Coarse buckets only — not a fingerprint.
    dpr: Math.round(devicePixelRatio),
    net: navigator.connection?.effectiveType ?? 'unknown',
    metrics: { ...metrics }
  };
}

Why: every field here is either a metric or a low-cardinality bucket. There is no signal precise enough to single out one person, which keeps the dataset out of scope for identifier-based consent.
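
One way to keep the context fields low-cardinality is to clamp them to a fixed set of values before they enter the payload. The sketch below is an assumption rather than part of the payload code above: coarseContext is a hypothetical helper, and the choice of three DPR buckets is arbitrary. Rounding alone can still yield outliers such as 0 on a heavily zoomed-out page.

```javascript
// The four values defined for NetworkInformation.effectiveType.
const NET_TYPES = new Set(['slow-2g', '2g', '3g', '4g']);

// Hypothetical helper: clamp DPR into {1, 2, 3} and map anything that is not
// a known effectiveType to 'unknown'.
function coarseContext(devicePixelRatio, effectiveType) {
  const dpr = Math.min(3, Math.max(1, Math.round(devicePixelRatio || 1)));
  const net = NET_TYPES.has(effectiveType) ? effectiveType : 'unknown';
  return { dpr, net };
}
```

With this in place the two fields can take at most 3 × 5 = 15 combinations, which is easy to reason about when assessing fingerprinting risk.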

5. Transmit with sendBeacon on page hide

Send via sendBeacon, which is non-blocking and survives unload. Fire on visibilitychange → hidden (the last reliable hook on mobile) with a fetch(..., {keepalive:true}) fallback. Because the request is first-party and carries no cookie, no consent gate applies.

function send(isFinal) {
  const body = JSON.stringify(buildPayload(isFinal));
  const blob = new Blob([body], { type: 'application/json' });
  if (!navigator.sendBeacon('/rum', blob)) {
    fetch('/rum', { method: 'POST', body: blob, keepalive: true,
                    headers: { 'Content-Type': 'application/json' } })
      .catch(() => {});
  }
}

addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') send(true);
});
// Safari sometimes skips the hidden transition — pagehide is the backstop.
addEventListener('pagehide', () => send(true));

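Why: sending on hide captures INP and the final CLS value, which are only settled once the user leaves the page. The keepalive fetch covers the case where sendBeacon returns false because the browser could not queue the request, for example when the payload exceeds the beacon size limit.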
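
Because visibilitychange and pagehide usually fire back to back, the two listeners above can send the same final state twice. A possible client-side guard is sketched below, assuming a hypothetical createSender wrapper in which transport stands in for the sendBeacon/fetch call. It skips a send only when the serialized state is identical to the last one sent, so a later hide with new metric values still goes out.

```javascript
// Hypothetical wrapper: buildBody() returns the serialized metric state,
// transport(body) performs the actual network send.
function createSender(buildBody, transport) {
  let lastSent = null;
  return function send() {
    const body = buildBody();
    if (body === lastSent) return false; // identical state already sent
    lastSent = body;
    transport(body);
    return true;
  };
}
```

To stay effective, the serialized body should contain only the metric state and stable fields, with no timestamps or counters that change on every call.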

6. Truncate the IP and scrub headers at the edge

The IP address is PII, and it reaches your network before any JavaScript runs — so it must be handled at the edge, not in the app. This Cloudflare Worker zeroes the last IPv4 octet (or last 80 bits of IPv6), drops Cookie and Referer, and forwards only the safe fields.

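// Keep /24 for IPv4 and /48 for IPv6. Compressed forms such as 2001:db8::1
// are expanded first so the first three groups are always real groups.
function truncateIp(ip) {
  if (!ip) return '';
  if (!ip.includes(':')) {
    return ip.split('.').slice(0, 3).concat('0').join('.'); // IPv4: /24
  }
  const [head, tail = ''] = ip.split('::');
  const h = head ? head.split(':') : [];
  const t = tail ? tail.split(':') : [];
  const groups = ip.includes('::')
    ? [...h, ...Array(Math.max(0, 8 - h.length - t.length)).fill('0'), ...t]
    : h;
  return groups.slice(0, 3).join(':') + '::';               // IPv6: /48
}

export default {
  async fetch(request, env) {
    if (request.method !== 'POST') return new Response(null, { status: 405 });

    // Only the parsed payload is forwarded below, so the Cookie and Referer
    // headers never travel past the edge.
    const truncated = truncateIp(request.headers.get('CF-Connecting-IP') ?? '');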

    let payload;
    try { payload = await request.json(); }
    catch { return new Response('bad json', { status: 400 }); }

    // Reject anything not in the allowlist — blocks accidental PII leakage.
    const allowed = ['v', 'pid', 'path', 'final', 'nav', 'dpr', 'net', 'metrics'];
    if (Object.keys(payload).some((k) => !allowed.includes(k))) {
      return new Response('unexpected field', { status: 422 });
    }

    const row = {
      ...payload,
      net_subnet: truncated,                 // coarse geo only, never raw IP
      country: request.cf?.country ?? 'XX',  // country-level, no city
      received_at: Date.now()
    };

    await env.RUM_QUEUE.send(row);
    return new Response(null, { status: 204 });
  }
};

Why: truncating to a /24 (IPv4) or /48 (IPv6) preserves country and ASN-level routing for geographic segmentation while removing the precision needed to identify a household or device.
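
The key allowlist only checks top-level field names, so a stricter schema check on the values may be worth adding at the same place. The following is a sketch, not the worker's actual validation: validatePayload, the 256-character path limit, and the reason strings are all assumptions.

```javascript
const METRIC_NAMES = new Set(['LCP', 'INP', 'CLS', 'FCP', 'TTFB']);
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical validator: returns null when the payload is acceptable,
// otherwise a short reason string suitable for a 422 response body.
function validatePayload(p) {
  if (typeof p !== 'object' || p === null || Array.isArray(p)) return 'not an object';
  if (p.v !== 1) return 'bad version';
  if (typeof p.pid !== 'string' || !UUID_RE.test(p.pid)) return 'bad pid';
  if (typeof p.final !== 'boolean') return 'bad final';
  // A safe path starts with "/" and carries no query, fragment, or "@".
  if (typeof p.path !== 'string' || !p.path.startsWith('/') ||
      p.path.length > 256 || /[?#@]/.test(p.path)) return 'bad path';
  if (typeof p.metrics !== 'object' || p.metrics === null) return 'bad metrics';
  for (const [name, value] of Object.entries(p.metrics)) {
    if (!METRIC_NAMES.has(name)) return 'unknown metric';
    if (!Number.isFinite(value) || value < 0) return 'bad metric value';
  }
  return null;
}
```

In the worker this would run right after the allowlist check, returning a 422 whenever the result is not null.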

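7. Sample per pageload and document retention

Apply head-based sampling so high-traffic routes do not overwhelm storage, and enforce a hard TTL so erasure is automatic rather than a manual workflow. Uniform random sampling leaves the p75 estimate unbiased; it only reduces the number of observations behind it.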

// Head-based: decide per pageload, before any observers attach.
function shouldSample(rate = 0.2) {
  return Math.random() < rate; // no stored seed; no cross-load correlation
}

-- ClickHouse: rows delete themselves after 30 days, so no DSAR delete path is needed.
ALTER TABLE rum_events MODIFY TTL received_at + INTERVAL 30 DAY;

Why: a fixed TTL is your right-to-erasure mechanism. Since rows hold no identity, there is no individual to erase on request — the dataset simply ages out deterministically, which you record in your privacy notice.
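
Deciding before any observers attach can be expressed as a small gate around the setup code. This is a sketch with assumed names: startIfSampled is hypothetical, attach stands for whatever registers the web-vitals handlers and listeners, and random is injectable purely so the decision can be tested.

```javascript
// Hypothetical gate: make the sampling decision once per pageload and only
// then run the setup callback. Nothing is stored, so the next pageload
// decides independently.
function startIfSampled({ rate = 0.2, random = Math.random, attach }) {
  const sampled = random() < rate;
  if (sampled) attach();
  return sampled;
}

// Usage sketch: startIfSampled({ attach: () => { /* register handlers */ } });
```

A pageload that is sampled out never registers a handler, so it produces no beacon at all.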

Verifying it works

  • DevTools › Application › Storage: reload the page and confirm Cookies, Local Storage, Session Storage, and IndexedDB are all empty for your origin. Any entry here breaks the cookieless claim.
  • DevTools › Network: filter for /rum, trigger a tab switch, and confirm a beacon request returning 204. Inspect the request payload — there must be no Cookie header reaching the edge and no query string inside path.
  • Edge logs: confirm the stored net_subnet ends in .0 (IPv4) or :: (IPv6) for every row. Grep ingestion logs for token=, email=, @, or full IPs — zero matches is the pass condition.
  • RUM dashboard: verify two reloads in the same browser produce two distinct pid values with no shared session row. If you can join two pageloads, cross-session linking has leaked in.
  • Schema rejection test: POST a payload with an extra userId field and confirm a 422, proving the allowlist blocks unexpected keys.

Edge cases & gotchas

  • Service workers intercepting beacons: a fetch handler that rewrites or caches the /rum request can strip keepalive or add headers. Pass the request through untouched and rely on the keepalive fallback.
  • bfcache restores: a page restored from back/forward cache keeps the same pageId in memory, so a second visibilitychange can double-send. Guard with a let sent = false flag, or rely on the ingestion pid + final dedupe.
  • Proxies overwriting the connecting IP: if a CDN sits in front of your edge, CF-Connecting-IP (or X-Forwarded-For) may already be rewritten. Truncate the first hop only, and never trust a client-supplied X-Forwarded-For.
  • Safari and the missing hidden event: older Safari can skip the hidden visibility transition on navigation, so pagehide must remain as a backstop or you lose the final INP/CLS beacon.
  • Cross-origin LCP without Timing-Allow-Origin: an LCP image served cross-origin without Timing-Allow-Origin exposes loadTime but masks renderTime. The web-vitals library already falls back, but expect slightly coarser LCP for those resources.
  • Low-cardinality fields becoming identifying: if you add too many “coarse” buckets (exact viewport, RTT, device memory), their combination can fingerprint. Keep the context fields to the minimum shown above.

FAQ

Does the in-memory page id need consent?

No. The consent obligation in ePrivacy attaches to storing or accessing information on the user’s device. A crypto.randomUUID() value held in a JavaScript variable is never written to any storage API and is destroyed on unload, so it does not trigger the storage-access rule.

Can I still calculate p75 without per-user sessions?

Yes. Percentiles are computed across the population of pageloads, not per user. Each beacon contributes one observation per metric; aggregating those at the 75th percentile by route, country, and network class gives stable field numbers without ever linking loads to a person.
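
As an illustration of that population-level aggregation, the sketch below computes a nearest-rank p75 per route from deduplicated rows. The function names are hypothetical and the rows are assumed to carry the path and metrics fields from the payload. In practice the store's own quantile function would likely do this work.

```javascript
// Nearest-rank percentile: the smallest value with at least p percent of
// observations at or below it. Returns null for an empty input.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical aggregation: one observation per pageload, grouped by route.
function p75ByRoute(rows, metric) {
  const groups = new Map();
  for (const row of rows) {
    const value = row.metrics?.[metric];
    if (typeof value !== 'number') continue; // this load never reported it
    if (!groups.has(row.path)) groups.set(row.path, []);
    groups.get(row.path).push(value);
  }
  const out = {};
  for (const [path, values] of groups) out[path] = percentile(values, 75);
  return out;
}
```

No user or session key appears anywhere in the grouping; the only dimensions are fields already present on each row.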

Why truncate the IP at the edge instead of in the application?

The raw IP arrives at your infrastructure before application code runs and is itself personal data. Truncating to a /24 or /48 at the edge means the full address is never logged, stored, or processed downstream, which is far stronger than masking it after the fact.