Datadog vs New Relic vs Self-Hosted RUM
Picking a Real-User Monitoring platform is a multi-year commitment to a cost curve, a data-residency posture, and a ceiling on how much raw field data you can ever query. This page is a focused head-to-head — Datadog RUM versus New Relic Browser versus a self-hosted stack built from the web-vitals library and PerformanceObserver, Cloudflare Workers, ClickHouse, and Grafana — and it sits inside the broader RUM Vendor Comparison evaluation. Where the parent page surveys the field, this one pins down three concrete options, gives a worked monthly-cost estimate for each at ~10M pageviews, and ends with numbered evaluation steps that include the actual SDK init calls you would ship.
The short version: Datadog and New Relic both buy you near-instant time-to-value and zero operational burden, but bill on session/event volume and gate raw-data export behind their query layer. A self-hosted stack inverts that — higher up-front engineering, near-zero marginal cost per beacon, and unbounded access to raw events for p75 aggregation and ad-hoc analysis. The comparison below makes those trade-offs explicit so the decision is evidence-driven.
Prerequisites
Before you run this evaluation, have the following in place so each option is measured on equal footing:
- A representative traffic figure. This page assumes ~10M monthly pageviews, roughly 3.3M sessions/month at 3 pageviews per session. Substitute your own numbers — the cost math is linear in sessions/ingest.
- Agreement on which signals you must collect. At minimum the three Core Web Vitals — Largest Contentful Paint, Interaction to Next Paint, and Cumulative Layout Shift — plus FCP and TTFB as diagnostics.
- A clear residency requirement (EU-only, US-only, or no constraint). This eliminates options fast.
- A decision on whether you need raw per-event export for joins against business data, or whether vendor dashboards suffice.
- For the self-hosted path: a Cloudflare account for the beacon ingestion endpoint, a ClickHouse instance (managed or self-run), and Grafana.
The comparison table
The three options measured against the dimensions that actually move a decision. Costs are list-price order-of-magnitude estimates at ~10M monthly pageviews; always re-price against a current quote.
| Dimension | Datadog RUM | New Relic Browser | Self-hosted (web-vitals + Workers + ClickHouse + Grafana) |
|---|---|---|---|
| CWV support (LCP/INP/CLS) | Native, all three incl. INP attribution | Native, all three incl. INP | Native via web-vitals v4 (INP included), full attribution build |
| Sampling control | Session `sessionSampleRate`, replay rate separate | Configurable, per-app; less granular | Total: head- or tail-based in your Worker |
| Raw-data export | Via API/RUM query, rate-limited, retention-bound | NRQL + API export, retention-bound | Unlimited — it is your ClickHouse table |
| Cost at ~10M PV/mo | ~$4,500–6,000/mo (session-billed) | ~$2,500–4,500/mo (ingest-billed) | ~$300–700/mo infra + engineering time |
| Data residency | US/EU datacenter pick at signup | US/EU region pick | Anywhere you deploy (full control) |
| Time-to-value | Hours | Hours | 1–3 weeks initial build |
| Retention | 30 days RUM events (plan-dependent) | 8–30 days events (plan-dependent) | Unbounded (TTL is your policy) |
| Operational burden | None | None | You own ingest, storage, dashboards |
Worked monthly-cost estimates
The numbers below are illustrative at ~10M pageviews / ~3.3M sessions per month. Re-derive them from a live quote, but the shape of each curve is stable.
Assumptions
Pageviews/month = 10,000,000
Pageviews/session = 3
Sessions/month = 3,333,333
Beacon payload = ~1.2 KB/pageview (one batched vitals beacon)
Raw ingest volume = 10,000,000 * 1.2 KB = ~12 GB/month
Datadog RUM (session-billed). Pricing is per 1,000 sessions. At ~3.33M sessions, a representative list rate lands the RUM line around $4,500–6,000/month before Session Replay, which is billed separately and can double the bill if enabled at a high rate. Lowering sessionSampleRate to 25 (the SDK takes a percentage from 0 to 100, so 25 means 25% of sessions) cuts billed sessions proportionally. Sampling is the primary cost lever here.
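The sampling lever is simple arithmetic, sketched below. The per-1,000-session rates are not from a Datadog quote: they are back-derived from this page's $4,500–6,000 estimate at ~3.33M sessions (roughly $1.35–1.80), and `estimateRumCost` is a hypothetical helper name.

```javascript
// Illustrative only: rates are back-derived from this page's estimate
// ($4,500-6,000 at ~3.33M sessions), not taken from a live Datadog quote.
const PAGEVIEWS_PER_MONTH = 10_000_000;
const PAGEVIEWS_PER_SESSION = 3;

// Hypothetical helper: monthly RUM cost for a session sample rate of 0-100.
function estimateRumCost(sessions, ratePer1kSessions, sessionSampleRate) {
  const billedSessions = sessions * (sessionSampleRate / 100);
  return (billedSessions / 1000) * ratePer1kSessions;
}

const sessions = PAGEVIEWS_PER_MONTH / PAGEVIEWS_PER_SESSION; // ~3.33M

for (const ratePer1k of [1.35, 1.8]) {
  const full = estimateRumCost(sessions, ratePer1k, 100);
  const sampled = estimateRumCost(sessions, ratePer1k, 25);
  console.log(
    `$${ratePer1k}/1k sessions: 100% = $${full.toFixed(0)}, 25% = $${sampled.toFixed(0)}`
  );
}
```

Under these assumed rates, sampling 25% of sessions brings the RUM line to about a quarter of the full-capture figure. Swap in the rate from your own quote before using the result for budgeting.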
New Relic Browser (ingest-billed). New Relic bills on data ingested (per GB) plus user seats. RUM ingest is far larger than 12 GB once you account for full PageView/Interaction events and attributes — call it 150–400 GB/month at this traffic. At list per-GB rates that is roughly $2,500–4,500/month including a small seat count. The lever here is dropping attributes and sampling at the agent.
Self-hosted. Fixed infrastructure dominates:
Cloudflare Workers (ingest) ~$5 base + ~$5/mo at 10M requests ≈ $10
ClickHouse (managed, small) ~$200-400/mo for 12 GB/mo + retention
Grafana (self-run or Cloud free tier) $0-50/mo
Object storage / backups ~$10/mo
-----------------------------------
Infra subtotal ~$220-470/mo
Engineering (amortized build) 1-3 weeks once, then ~2-4 hrs/mo upkeep
At ~10M pageviews the self-hosted infra bill is roughly $300–700/month plus the one-time build and light upkeep. The crossover point where self-hosting wins on pure spend is typically a few million sessions per month; below that, the managed time-to-value usually wins.
Decision guidance
- Choose Datadog RUM when you already run Datadog APM/logs and want vitals correlated with backend traces in one pane, you value Session Replay, and per-session billing at your volume is acceptable. Its strength is the unified trace-to-RUM story.
- Choose New Relic Browser when you want a single ingest-priced platform spanning APM, infra, and browser, and you can keep ingest disciplined. NRQL is a genuinely strong ad-hoc query surface short of full raw export.
- Choose self-hosted when you have a residency mandate that vendor regions don’t satisfy, you need raw per-event data joined to business tables, your session volume makes per-session billing punitive, or you require retention beyond 30 days. The cost is engineering ownership.
How to evaluate the three options
Run these numbered steps as a time-boxed bake-off. Each ships the same web-vitals signals so the dashboards are directly comparable.
1. Stand up Datadog RUM. Add the SDK and init it with explicit sampling so you control billed sessions from day one.

   ```javascript
   import { datadogRum } from '@datadog/browser-rum';

   datadogRum.init({
     applicationId: 'YOUR_APP_ID',
     clientToken: 'YOUR_CLIENT_TOKEN',
     site: 'datadoghq.eu', // pick the residency region here
     service: 'storefront-web',
     env: 'production',
     sessionSampleRate: 25, // 25% of sessions billed/collected
     sessionReplaySampleRate: 0, // replay off; it bills separately
     trackResources: true,
     trackLongTasks: true,
     defaultPrivacyLevel: 'mask-user-input',
   });

   // Only needed when sessionReplaySampleRate is above 0:
   // datadogRum.startSessionReplayRecording();
   ```

   Why: setting `sessionSampleRate` at init is the single biggest cost control; `site` fixes residency and cannot be changed without re-onboarding.

2. Stand up New Relic Browser. Use the copy-paste browser agent, then trim ingest with sampling and attribute limits.

   ```javascript
   // Agent configuration from the copy-paste snippet; define it in <head>
   // before the New Relic loader script runs.
   window.NREUM = window.NREUM || {};

   NREUM.init = {
     distributed_tracing: { enabled: true },
     privacy: { cookies_enabled: false }, // cookieless RUM
     ajax: { deny_list: ['bam.nr-data.net'] },
   };

   NREUM.loader_config = {
     accountID: 'YOUR_ACCOUNT_ID',
     trustKey: 'YOUR_TRUST_KEY',
     agentID: 'YOUR_AGENT_ID',
     licenseKey: 'YOUR_BROWSER_LICENSE_KEY',
     applicationID: 'YOUR_APP_ID',
   };
   ```

   Why: `cookies_enabled: false` keeps you cookieless, and pruning AJAX/attributes is how you keep the GB-ingest bill predictable.

3. Build the self-hosted collector. Capture vitals with the web-vitals library and ship them via `sendBeacon` to your own ingestion endpoint.

   ```javascript
   import { onLCP, onINP, onCLS, onFCP, onTTFB } from 'web-vitals';

   const queue = new Set();
   const ENDPOINT = '/rum/collect';

   // Copy the fields we store; the metric object itself is not serialized.
   function record(metric) {
     queue.add({
       name: metric.name,
       value: metric.value,
       rating: metric.rating, // good | needs-improvement | poor
       id: metric.id,
       navigationType: metric.navigationType,
       page: location.pathname,
       ts: Date.now(),
     });
   }

   onLCP(record);
   onINP(record); // INP captured here, same as the managed agents
   onCLS(record);
   onFCP(record);
   onTTFB(record);

   function flush() {
     if (queue.size === 0) return;
     const body = JSON.stringify([...queue]);
     queue.clear();
     // sendBeacon survives page unload; keepalive fetch is the fallback
     if (!navigator.sendBeacon(ENDPOINT, body)) {
       fetch(ENDPOINT, { body, method: 'POST', keepalive: true });
     }
   }

   addEventListener('visibilitychange', () => {
     if (document.visibilityState === 'hidden') flush();
   });
   addEventListener('pagehide', flush);
   ```

   Why: flushing on `visibilitychange`/`pagehide` is the only reliable point to capture the final INP and CLS values before the tab is discarded.

4. Validate and write at the edge. A Cloudflare Worker applies head-based sampling and inserts into ClickHouse. Sampling lives here so you never pay to store noise; see RUM data sampling strategies for head- vs tail-based trade-offs and p75 implications.

   ```javascript
   export default {
     async fetch(request, env) {
       if (request.method !== 'POST') return new Response('', { status: 405 });

       // Head-based sampling: keep 25% of beacons, matching Datadog's rate.
       // A 204 response must be built with a null body, not an empty string.
       if (Math.random() > 0.25) return new Response(null, { status: 204 });

       // Reject payloads that are not a JSON array of events
       let events;
       try {
         events = await request.json();
       } catch {
         return new Response('', { status: 400 });
       }
       if (!Array.isArray(events)) return new Response('', { status: 400 });

       const rows = events
         .filter((e) => e && typeof e.value === 'number')
         .map((e) => ({
           name: e.name,
           value: e.value,
           rating: e.rating,
           page: e.page,
           ts: Math.floor(e.ts / 1000), // milliseconds to Unix seconds
         }));

       if (rows.length > 0) {
         const query = encodeURIComponent('INSERT INTO rum_events FORMAT JSONEachRow');
         await fetch(`${env.CLICKHOUSE_URL}/?query=${query}`, {
           method: 'POST',
           headers: { Authorization: `Basic ${env.CH_AUTH}` },
           body: rows.map((r) => JSON.stringify(r)).join('\n'),
         });
       }

       return new Response(null, { status: 204 });
     },
   };
   ```

   Why: keeping the sample rate identical across all three options is what makes the bake-off dashboards comparable; otherwise you’re comparing different denominators.

5. Query p75 in ClickHouse and panel it in Grafana. Validate that your self-hosted p75 lines up with the vendor dashboards.

   ```sql
   SELECT
     name,
     quantile(0.75)(value) AS p75,
     count() AS samples
   FROM rum_events
   WHERE ts >= now() - INTERVAL 7 DAY
   GROUP BY name
   ORDER BY name;
   ```

   Why: p75 is the canonical headline aggregation for Core Web Vitals; computing it yourself confirms the self-hosted path produces the same ratings the managed tools report.
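The Worker step above samples each beacon independently, so a single session can end up partly kept and partly dropped. If you want the Worker's sampling unit to line up with a session-based vendor rate, one option is to hash a session identifier and keep only sessions whose hash falls under the threshold. The sketch below assumes each beacon carries a client-generated `sid` field, which the collector on this page does not send; `fnv1a` and `keepSession` are illustrative names, and FNV-1a is just one convenient dependency-free hash.

```javascript
// Assumption: each beacon includes a client-generated session id (`sid`).
// The collector shown on this page does not send one; you would have to add it.

// 32-bit FNV-1a hash: small, dependency-free, and stable across requests.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Hypothetical helper: true if the session falls inside the sampled fraction.
// The same sid always gets the same answer, so a session is kept or dropped whole.
function keepSession(sid, sampleRate = 0.25) {
  return fnv1a(sid) / 0x100000000 < sampleRate;
}

console.log(keepSession('session-123'), keepSession('session-123'));
```

In the Worker, this check would replace the `Math.random()` test and would have to run after the body is parsed, since the decision depends on the payload rather than on chance.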
Verifying it works
- Datadog: open RUM → Performance and confirm LCP/INP/CLS p75 tiles populate within minutes. Check the billed sessions counter in Plan & Usage matches your `sessionSampleRate`.
- New Relic: run `SELECT percentile(largestContentfulPaint, 75) FROM PageViewTiming SINCE 1 day ago` in the query builder and confirm a non-null result; watch Data Ingest in Administration to verify GB/day tracks your estimate.
- Self-hosted: confirm rows land with `SELECT count() FROM rum_events WHERE ts >= now() - INTERVAL 1 HOUR`, then verify the Grafana panel’s p75 for LCP, INP, and CLS sits within a few percent of the two vendor dashboards. Divergence beyond that usually means mismatched sample rates or a unit bug (CLS is unitless, LCP/INP are milliseconds).
Cross-check the ratings against the Google thresholds: LCP ≤ 2.5 s Good / ≤ 4.0 s Needs Improvement / > 4.0 s Poor; INP ≤ 200 ms Good / ≤ 500 ms NI / > 500 ms Poor; CLS ≤ 0.1 Good / ≤ 0.25 NI / > 0.25 Poor. FCP is Good at ≤ 1.8 s and TTFB at ≤ 800 ms.
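For a quick offline cross-check, the thresholds above and a p75 calculation fit in a few lines. This is a minimal sketch: `ratingFor` and `p75` are hypothetical helpers, the sample values are made up, and the nearest-rank method used here is only one of several quantile definitions, so small differences from ClickHouse's approximate `quantile()` or a vendor's figure are expected on small samples.

```javascript
// Thresholds as listed above: [good upper bound, needs-improvement upper bound].
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
};

// Hypothetical helper: map a metric value to its rating bucket.
function ratingFor(name, value) {
  const [good, needsImprovement] = THRESHOLDS[name];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}

// Nearest-rank p75 over raw samples (exact, not approximate).
function p75(values) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Made-up LCP samples in milliseconds, for illustration only.
const lcpSamples = [1800, 2100, 2400, 2600, 3900, 4200, 1500, 2300];
const lcpP75 = p75(lcpSamples);
console.log(lcpP75, ratingFor('LCP', lcpP75));
```

Running the same helper over a small export from each dashboard is a cheap way to spot a unit bug before comparing full panels.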
Migration notes
- Run a dual-write window. Keep the existing vendor agent live while the self-hosted beacon ships in parallel for at least two full weeks. Reconcile p75 per route before you cut over — a per-route delta under ~5% is your green light.
- Match sample rates exactly during overlap. If Datadog samples 25% of sessions and your Worker samples 25% of beacons, the denominators still differ (session vs pageview). Normalize before comparing, or you’ll chase phantom regressions.
- Migrate dashboards, not just data. Recreate each vendor alert as a Grafana alert on the ClickHouse p75 query before decommissioning, so you never have an alerting gap.
- Watch attribution parity. Vendor INP attribution (target element, input delay) needs the web-vitals attribution build on the self-hosted side; capture `metric.attribution` or you’ll lose the debugging context the vendor gave you.
- Plan retention up front. Set a ClickHouse TTL that matches or exceeds the vendor retention you’re replacing, so historical comparisons survive the cutover.
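To keep attribution parity, the beacon row needs a few extra fields from `metric.attribution`. The sketch below assumes you have switched the import to the attribution build (`web-vitals/attribution`) and that the field names match the v4 INP attribution object; verify them against the version you have installed. `toInpRow` is a hypothetical helper, and the sample metric is hand-written rather than captured from a browser.

```javascript
// Assumption: metrics come from the attribution build, e.g.
//   import { onINP } from 'web-vitals/attribution';
// Field names below follow the v4 INP attribution object; check your version.

// Hypothetical helper: flatten an INP metric into a beacon-friendly row.
function toInpRow(metric) {
  const a = metric.attribution || {};
  return {
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    target: a.interactionTarget ?? null, // selector for the interacted element
    type: a.interactionType ?? null, // 'pointer' or 'keyboard'
    inputDelay: a.inputDelay ?? null,
    processingDuration: a.processingDuration ?? null,
    presentationDelay: a.presentationDelay ?? null,
  };
}

// Hand-written metric object standing in for a real onINP callback argument.
const row = toInpRow({
  name: 'INP',
  value: 280,
  rating: 'needs-improvement',
  attribution: {
    interactionTarget: 'button#add-to-cart',
    interactionType: 'pointer',
    inputDelay: 40,
    processingDuration: 190,
    presentationDelay: 50,
  },
});
console.log(row);
```

Storing the three delay components as separate columns keeps the ClickHouse side queryable without JSON parsing; the Worker's row mapping and the table schema would need matching columns.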
Edge cases & gotchas
- Billing units are not interchangeable. Datadog bills sessions; New Relic bills ingest; self-hosted “bills” per beacon. A naive cost comparison that uses one denominator for all three will be wrong by a large factor.
- Session Replay is the silent budget killer on Datadog. Leave `sessionReplaySampleRate` at 0 during a cost bake-off, or the RUM line is unrepresentative.
- New Relic ingest balloons with custom attributes. Each attribute you attach multiplies GB-ingest across every event; audit attributes before pricing.
- Self-hosted INP can under-report without lifecycle flushing. If you flush only on `beforeunload` (which Safari treats unreliably) instead of `visibilitychange`/`pagehide`, you lose the final interaction on bfcache restores.
- Residency is set at onboarding for both vendors. Moving a Datadog or New Relic account between US and EU regions is effectively a re-implementation, so decide residency before you sign.
FAQ
Which option supports INP best?
All three capture INP natively today — Datadog RUM and New Relic Browser via their agents, and the self-hosted stack via the web-vitals library’s onINP. For deep attribution (slow interaction target, input delay, presentation delay), both vendors and the web-vitals attribution build expose comparable detail, so INP support is not a differentiator on its own.
At what traffic does self-hosting become cheaper?
The crossover is usually a few million sessions per month. Below that, managed time-to-value typically wins; above it, the flat ~$300–700/month self-hosted infra bill undercuts session- or ingest-based pricing, which scales linearly with traffic.
Can I get raw, per-event RUM data out of Datadog or New Relic?
Partially. Both expose query APIs (Datadog’s RUM search API, New Relic’s NRQL/API export) but they are rate-limited and bounded by retention. Only the self-hosted ClickHouse table gives you unlimited raw events to join against business data.
Do all three satisfy EU data residency?
Datadog and New Relic both offer EU regions chosen at onboarding, which covers most mandates. A self-hosted stack lets you pin storage to any jurisdiction you control, which is the only option when residency requirements exceed what vendor regions provide.
Related
- RUM Vendor Comparison — parent overview of the RUM vendor landscape and evaluation criteria.
- SpeedCurve vs Custom RUM — the build-vs-buy decision framed around SpeedCurve as the managed exemplar.
- Self-Hosted Beacon Collection — designing the ingestion endpoint that powers the self-hosted option here.