Web Vitals API Implementation
Every field-data program lives or dies on the instrumentation layer that captures Core Web Vitals in the wild and ships them off the page before the user navigates away. This is the canonical reference for that layer — the web-vitals npm library and the browser primitives it wraps, the lifecycle that decides when a metric is final, and the transport that survives an unload. As established in Core Web Vitals & Performance Metrics Fundamentals, the metrics themselves are only useful if they are measured the same way Chrome’s tooling measures them; an instrumentation that diverges from the spec produces dashboards that disagree with the Chrome User Experience Report and erodes every downstream decision. The goal here is a single, defensible collection path: correct observer registration, correct finalization, and a beacon that does not get dropped.
The hard part is not calling onLCP. The hard part is the lifecycle — knowing that PerformanceObserver must register with buffered: true to catch entries that fired before your script ran, that the largest contentful paint value keeps growing until the first interaction, that interaction to next paint is not known until the page is hidden, and that the only reliable moment to send is visibilitychange to hidden. Get those wrong and you will under-report INP, double-count navigations, and lose 10-30% of beacons on mobile. This page treats the library as the default and the raw observer as the escape hatch you reach for only when debugging.
Threshold configuration and the rating each metric reports
The library emits a rating field of good, needs-improvement, or poor on every metric object. Those ratings are not advisory — they are computed against Google’s published thresholds, and your ingestion schema should treat them as authoritative rather than re-deriving them server-side with stale cutoffs. The table below is the source of truth that the rating field encodes, plus the engineering action each band should trigger in your pipeline.
| Metric | Good | Needs Improvement | Poor | Engineering action when p75 lands in NI/Poor |
|---|---|---|---|---|
| LCP | ≤ 2.5 s | ≤ 4.0 s | > 4.0 s | Inspect attribution.lcpEntry, prioritize the hero resource, audit TTFB share |
| INP | ≤ 200 ms | ≤ 500 ms | > 500 ms | Break the slow interactionTarget’s handler with scheduler.yield, reduce input delay |
| CLS | ≤ 0.1 | ≤ 0.25 | > 0.25 | Map attribution.largestShiftTarget, reserve space, fix late-injected content |
| FCP | ≤ 1.8 s | ≤ 3.0 s | > 3.0 s | Reduce render-blocking CSS/JS, improve server response |
| TTFB | ≤ 800 ms | ≤ 1.8 s | > 1.8 s | Edge-cache, cut origin latency, inspect redirect chains |
A subtlety that trips teams: these are p75 thresholds applied to the distribution, not a per-beacon pass/fail. A single LCP of 3.1 s is rated needs-improvement on its own metric object, but the page only “fails LCP” when the p75 of the cohort exceeds 2.5 s. Keep the per-event rating for triage and compute the p75 verdict in the warehouse.
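The split between a per-event rating and a cohort verdict can be sketched as follows. This is a minimal illustration, not the library's code: `THRESHOLDS` copies the table above, `rate` and `p75` are hypothetical helper names, and `p75` assumes the nearest-rank percentile definition (a warehouse may interpolate instead).

```javascript
// Thresholds copied from the table above: [good upper bound, needs-improvement upper bound].
// LCP, INP, FCP and TTFB are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
  FCP: [1800, 3000],
  TTFB: [800, 1800],
};

// Rate a single value (hypothetical helper that mirrors the `rating` field).
function rate(name, value) {
  const [good, needsImprovement] = THRESHOLDS[name];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}

// Nearest-rank 75th percentile (assumption: other percentile definitions exist).
function p75(values) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length);
  return sorted[rank - 1];
}

// One slow event does not fail the page; the cohort's p75 decides.
const lcpSamples = [1200, 1800, 2100, 3100];
const verdict = rate('LCP', p75(lcpSamples));
```

Here the 3.1 s event is rated needs-improvement on its own, while the cohort verdict is computed from the p75 of all four samples.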
The callback and observer parameter reference
This is the table to bookmark. Each web-vitals callback wraps one or more PerformanceObserver entry types, fires at a specific lifecycle moment, and finalizes under specific conditions. Mismatches between what you assume here and what the browser actually does are the root cause of nearly every “our numbers don’t match CrUX” investigation.
| Callback | Underlying observer type(s) | When it fires | Finalization condition |
|---|---|---|---|
| onLCP | largest-contentful-paint | Once, with the final largest paint | First user interaction (keydown/pointerdown) or page hidden, whichever comes first |
| onINP | event, first-input | Once, on the worst qualifying interaction | Page hidden (visibilitychange/pagehide); INP is unknown until then |
| onCLS | layout-shift | Once, with the largest session window | Page hidden; CLS accumulates across the whole visit in session windows |
| onFCP | paint (first-contentful-paint) | Once, at first contentful paint | Reported as soon as the paint entry is observed |
| onTTFB | navigation | Once, early in the load | Reported after responseStart is available |
Two rules govern all of them. First, every observer the library creates registers with buffered: true, so entries that the browser recorded before your script executed are replayed into the callback; this is why deferring the web-vitals import does not lose the LCP that already happened. Second, LCP is final only at the first interaction or page hide, and INP and CLS only at page hide; if you read them on a timer you capture a premature value. The reportAllChanges option exists precisely to let you see the intermediate values during development without changing when the canonical value is sent.
reportAllChanges semantics
With reportAllChanges: false (the default), each callback fires only when its value is ready to report, which in the common case means once per page view. CLS and INP can report again if the page is hidden, returns to the foreground, and the value then changes, which is why every metric object carries an id and a delta. This is what production wants: finalized values only, with no intermediate noise. With reportAllChanges: true, the callback fires on every change: every larger LCP candidate, every worse interaction, every new CLS session-window maximum. That is invaluable for live debugging in the console but catastrophic for a beacon pipeline, because you will emit a beacon per change and your p75 will be skewed by the intermediate values. Use true only behind a debug flag.
import { onLCP } from 'web-vitals';

// Production: one final value.
// `sendBeacon` here stands in for your own reporting function.
onLCP((metric) => sendBeacon(metric), { reportAllChanges: false });

// Local debugging only: watch the value grow
if (location.search.includes('wv-debug')) {
  onLCP((metric) => {
    console.log(`LCP candidate: ${metric.value.toFixed(0)}ms`, metric.entries.at(-1));
  }, { reportAllChanges: true });
}
Production measurement implementation
Below is a complete, runnable collection module. It registers all five metrics, accumulates them into a single per-navigation queue, and flushes that queue exactly once when the page is hidden. The single-flush-on-hidden pattern is the most important detail on this page: it is the only event that fires reliably across desktop close, mobile tab-switch, and bfcache eviction.
import { onLCP, onINP, onCLS, onFCP, onTTFB } from 'web-vitals';

const ENDPOINT = '/api/telemetry/web-vitals';
const navigationId = crypto.randomUUID();
const queue = new Map();

function record(metric) {
  // Keyed by name so a late, larger value overwrites the earlier one
  queue.set(metric.name, {
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    delta: metric.delta,
    id: metric.id,
    navigationType: metric.navigationType,
    navigationId,
  });
}

function flush() {
  if (queue.size === 0) return;
  const body = JSON.stringify({
    page: location.pathname,
    metrics: [...queue.values()],
  });
  queue.clear();
  // Primary transport: survives unload, does not block.
  if (navigator.sendBeacon && navigator.sendBeacon(ENDPOINT, body)) return;
  // Fallback for engines without a working sendBeacon for this payload.
  fetch(ENDPOINT, { method: 'POST', body, keepalive: true }).catch(() => {});
}

onLCP(record);
onINP(record);
onCLS(record);
onFCP(record);
onTTFB(record);

// Flush exactly once, on the first reliable terminal signal.
let flushed = false;
function flushOnce() {
  if (flushed) return;
  flushed = true;
  flush();
}

addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flushOnce();
});

// pagehide covers iOS Safari, which historically did not always fire
// visibilitychange on terminal navigation.
addEventListener('pagehide', flushOnce);
The pagehide listener is not redundant. Historically iOS Safari did not reliably fire visibilitychange on a terminal navigation, so pagehide is the safety net. Guarding with flushed prevents the double-send when both fire. The payload also carries delta, the change in value since the metric's previous report. With a single report per metric, delta equals value; if you later opt into reportAllChanges, or collect the repeat reports that CLS and INP can emit, a collector can sum the deltas per id to reconstruct the latest value.
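How a collector might use delta can be sketched as below. This is an illustration under assumptions, not part of the module above: `applyUpdate` is a hypothetical server-side or warehouse-side helper, and the ids and values are made up.

```javascript
// Running totals keyed by metric id. The id is unique per metric instance,
// so summing the deltas received for one id reconstructs its latest value.
const totals = new Map();

function applyUpdate(update) {
  const previous = totals.get(update.id) ?? 0;
  const next = previous + update.delta;
  totals.set(update.id, next);
  return next;
}

// Hypothetical sequence of INP reports for one page view (values in ms).
applyUpdate({ id: 'v5-123', name: 'INP', value: 120, delta: 120 });
const latest = applyUpdate({ id: 'v5-123', name: 'INP', value: 200, delta: 80 });
```

Summing deltas rather than overwriting values suits analytics backends that can only add to a counter; a warehouse that keeps the last row per id can use value directly.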
Finalizing on visibilitychange and the bfcache restore
When a page is restored from the back/forward cache, it is a new measurement context but not a new document load. The web-vitals library handles this internally: it resets its metric state on pageshow when event.persisted is true and begins reporting again. Your code must follow suit — if you cache navigationId at module load, every bfcache restore will reuse the stale id and your beacons will collide. Regenerate per restore:
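// Replaces the `const navigationId` declaration in the module above,
// so the id that record() reads can change on each restore.
let navigationId = crypto.randomUUID();

addEventListener('pageshow', (event) => {
  if (event.persisted) {
    navigationId = crypto.randomUUID(); // new id for the restored visit
    flushed = false;                    // allow the restored visit to flush
  }
});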
The attribution build
The default build gives you a number; the attribution build tells you why the number is what it is. Import from web-vitals/attribution and every metric gains an attribution object pointing at the responsible DOM element, timing breakdown, and source entry. This is the difference between a dashboard that says “INP is 480 ms” and one that says “INP is 480 ms, the offending element is button.add-to-cart, and 410 ms of it is processing time.” The deep methodology lives in Debugging Web Vitals with the Attribution Build; the essentials follow.
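import { onLCP, onINP, onCLS } from 'web-vitals/attribution';

// record() copies a fixed set of fields, so merge the diagnostic fields
// into the queued entry after recording the metric itself.
// Use these registrations in place of the plain onLCP/onINP/onCLS(record)
// calls so the two do not overwrite each other in the queue.
function recordWithAttribution(metric, extra) {
  record(metric);
  Object.assign(queue.get(metric.name), extra);
}

onLCP((metric) => {
  const a = metric.attribution;
  recordWithAttribution(metric, {
    lcpElement: a.target,              // CSS selector of the LCP element
    lcpUrl: a.url,                     // the resource URL, if image
    lcpTtfb: a.timeToFirstByte,
    lcpLoadDelay: a.resourceLoadDelay,
    lcpRenderDelay: a.elementRenderDelay,
  });
});

onINP((metric) => {
  const a = metric.attribution;
  recordWithAttribution(metric, {
    inpTarget: a.interactionTarget,    // selector of the interacted element
    inpType: a.interactionType,        // 'pointer' | 'keyboard'
    inputDelay: a.inputDelay,
    processingDuration: a.processingDuration,
    presentationDelay: a.presentationDelay,
  });
});

onCLS((metric) => {
  const a = metric.attribution;
  recordWithAttribution(metric, {
    shiftTarget: a.largestShiftTarget, // selector of the biggest shift source
    shiftTime: a.largestShiftTime,
  });
});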
The attribution build is roughly 1-2 KB larger gzipped than the standard build. That cost is trivial against the diagnostic value; ship it in production. The one caveat is that the raw entries exposed alongside the selectors (for example lcpEntry.element) reference elements that may already be detached from the DOM by the time you serialize, and they do not survive JSON.stringify in any useful form. Never store the raw element reference; always store the selector string the library has already resolved for you.
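One way to enforce that rule is to keep only primitive fields before queueing. The sketch below is an assumption about how you might do it, not something the library provides: `serializableAttribution` is a hypothetical helper, and `sample` is a hand-written object shaped like an INP attribution.

```javascript
// Keep only primitive attribution fields (strings, numbers, booleans).
// Nested objects such as performance entries or DOM nodes are dropped.
function serializableAttribution(attribution) {
  const out = {};
  for (const [key, value] of Object.entries(attribution)) {
    const type = typeof value;
    if (type === 'string' || type === 'number' || type === 'boolean') {
      out[key] = value;
    }
  }
  return out;
}

// Hypothetical attribution-shaped object for illustration.
const sample = {
  interactionTarget: 'button.add-to-cart',
  inputDelay: 30,
  processingDuration: 410,
  presentationDelay: 40,
  processedEventEntries: [{ name: 'click' }], // dropped: not a primitive
};
const safe = serializableAttribution(sample);
```

A whitelist of named fields, as in the callbacks above, is stricter; the generic filter is mainly useful while you are still deciding which fields to keep.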
Reliable transport via sendBeacon
A perfectly measured metric that never reaches your endpoint is worthless, and the unload moment is exactly when normal fetch calls get cancelled. navigator.sendBeacon() exists for this: it queues the request in the browser’s network stack and lets the document die without aborting it. It returns a boolean: true means queued, false means the user agent refused (typically because the payload exceeded the keepalive quota). The fallback in the implementation above handles that false by retrying with fetch(..., { keepalive: true }), which also survives unload. Because sendBeacon is built on keepalive fetch, both draw on the same 64 KB in-flight quota, so the fallback helps when sendBeacon is missing or refuses for another reason; when size is the cause, the fix is a smaller payload.
Three constraints govern the payload. First, the total in-flight keepalive body across all requests is capped at 64 KB per the Fetch spec, so batch conservatively: one beacon per navigation carrying five metrics is well within budget; a beacon per reportAllChanges event is not. Second, sendBeacon sends a string body as text/plain unless you wrap it in a Blob with an explicit type. That matters on cross-origin endpoints, because text/plain is a CORS-safelisted content type whereas application/json is not and requires a preflight. Third, the receiving endpoint should answer 204 No Content and do its real work asynchronously, because the browser does not wait for or read the response. The full server contract, batching, and compression are covered in self-hosted beacon collection.
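The size constraint and the transport order can be sketched together as follows. This is a minimal illustration under assumptions: `encodePayload`, `sendPayload`, and the injected `transport` object are hypothetical names introduced so the logic can run outside a browser, and checking one payload against 64 KiB is a necessary condition only, since the budget is shared by all in-flight keepalive requests.

```javascript
// 64 KiB keepalive budget from the Fetch spec. It is shared by all in-flight
// keepalive requests, so fitting under it is necessary but not sufficient.
const KEEPALIVE_BUDGET_BYTES = 64 * 1024;

// Serialize and measure the payload in UTF-8 bytes (not string length).
function encodePayload(payload) {
  const json = JSON.stringify(payload);
  const bytes = new TextEncoder().encode(json).length;
  return { json, bytes, fits: bytes <= KEEPALIVE_BUDGET_BYTES };
}

// `transport` is a hypothetical wrapper. In a page it could be:
//   { sendBeacon: (u, b) => navigator.sendBeacon(u, b), fetch: (u, o) => fetch(u, o) }
function sendPayload(endpoint, payload, transport) {
  const { json, fits } = encodePayload(payload);
  if (!fits) return 'too-large'; // no transport will accept it; shrink the batch
  if (transport.sendBeacon && transport.sendBeacon(endpoint, json)) return 'beacon';
  transport.fetch(endpoint, { method: 'POST', body: json, keepalive: true })
    .catch(() => {}); // best effort: nothing useful to do on failure at unload
  return 'fetch';
}
```

Returning a label for the path taken makes it easy to count, in development, how often the fallback is actually used.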
Step-by-step debugging workflow
When field numbers look wrong, work this sequence rather than guessing.
- Identify the suspect metric and segment. Pull p75 per metric from the warehouse and find which one diverges from CrUX or from your lab baseline. Divergence on INP almost always means a finalization bug; divergence on LCP often means an unbuffered observer.
- Trace the capture in the browser. Open the page with the wv-debug flag and reportAllChanges: true, then watch the console as you load and interact. Confirm each metric fires and the final value matches DevTools’ Performance panel.
- Correlate attribution against the live DOM. For the slow metric, read attribution.target / interactionTarget and confirm the selector resolves to the element you expect. A null or detached target signals dynamic content racing the callback.
- Validate in the lab. Reproduce under throttled CPU and network in DevTools or WebPageTest so you can attach a profiler to the offending interaction or paint without field noise.
- Deploy the fix behind a flag. Ship the optimization to a cohort and keep the old path live for comparison.
- Monitor the p75 delta. Watch the cohort’s p75 move over the following days; field metrics are noisy, so require a sustained shift, not a single-day dip, before declaring victory.
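The "sustained shift" requirement in the last step can be made mechanical. The sketch below is one possible rule, offered as an assumption rather than a standard method: `isSustainedImprovement`, `minDays`, and `minGain` are hypothetical names, and the inputs are daily p75 values for the control and the flagged cohort, oldest first.

```javascript
// Returns true when the cohort's daily p75 beat the control's by more than
// `minGain` on each of the last `minDays` days (all names are hypothetical).
function isSustainedImprovement(controlP75, cohortP75, { minDays = 5, minGain = 0 } = {}) {
  const days = Math.min(controlP75.length, cohortP75.length);
  if (days < minDays) return false; // not enough data to decide
  for (let i = days - minDays; i < days; i++) {
    const gain = controlP75[i] - cohortP75[i];
    if (!(gain > minGain)) return false; // one flat or worse day breaks the streak
  }
  return true;
}

// Hypothetical daily INP p75 values in milliseconds.
const control = [300, 310, 305, 300, 295];
const cohort = [250, 320, 240, 245, 240];
const lastThreeDays = isSustainedImprovement(control, cohort, { minDays: 3 });
```

A stricter pipeline would also check sample sizes per day before trusting either series.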
Field-data analysis patterns
Aggregate p75 hides the failures you most need to find. Always segment, because a healthy global p75 routinely masks a broken cohort.
- Device class. Low-end Android dominates INP regressions; the same handler that is 80 ms on a desktop is 350 ms on a budget phone. Segment by an inferred device tier (derived from navigator.hardwareConcurrency and deviceMemory) and track p75 per tier.
- Network type. TTFB and LCP track effectiveConnectionType. A slow-2g cohort with poor LCP is a different problem than a 4g cohort with poor LCP: the former is delivery, the latter is render.
- Geography. TTFB divergence by region is almost always an edge-caching gap. Map p75 TTFB per country and look for the regions far from your origin.
- Navigation type. Isolate navigate, reload, back-forward, back-forward-cache, and prerender. The library reports bfcache restores as back-forward-cache, so a back-forward-cache cohort with great LCP and a navigate cohort with poor LCP confirms the bfcache is working and the cold path is the problem.
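The device-tier inference mentioned in the first bullet could look like the sketch below. The cutoffs are hypothetical and should be tuned against your own traffic; `deviceTier` is an illustrative name, and deviceMemory is not exposed by every browser, so missing signals are handled explicitly.

```javascript
// Hypothetical cutoffs: tune them against your own traffic.
function deviceTier({ hardwareConcurrency, deviceMemory } = {}) {
  // Treat a device with neither signal as unknown rather than guessing.
  if (hardwareConcurrency == null && deviceMemory == null) return 'unknown';
  // A missing signal must not pull the tier down, so default it to Infinity.
  const cores = hardwareConcurrency ?? Infinity;
  const memoryGb = deviceMemory ?? Infinity;
  if (cores <= 4 || memoryGb <= 2) return 'low';
  if (cores <= 8 || memoryGb <= 4) return 'mid';
  return 'high';
}

// In a page this would be called as deviceTier(navigator).
const tier = deviceTier({ hardwareConcurrency: 4, deviceMemory: 2 });
```

Attach the resulting tier to the beacon payload so the warehouse can group by it without re-deriving it.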
The divergence to watch hardest is when one segment’s p75 lands a full rating band away from the aggregate — that is where a real user population is suffering inside a green headline number. Decisions about how finely you can segment depend on your sampling strategy and p75 sample sizes; thin cohorts produce unstable percentiles.
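Finding segments that sit a rating band away from the aggregate can be sketched as below. This is an illustration under assumptions: `divergentSegments`, `band`, and the `minSamples` default are hypothetical, the percentile uses the nearest-rank definition, and the rows are beacons for a single metric that already carry a segment label.

```javascript
const LCP_BANDS = [2500, 4000]; // ms, from the threshold table

// 0 = good, 1 = needs-improvement, 2 = poor.
function band(value, [good, needsImprovement]) {
  return value <= good ? 0 : value <= needsImprovement ? 1 : 2;
}

// Nearest-rank 75th percentile (assumption: other definitions exist).
function p75(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil(0.75 * sorted.length) - 1];
}

// rows: [{ segment: 'low-end', value: 4500 }, ...] for one metric.
function divergentSegments(rows, bands, minSamples = 100) {
  if (rows.length === 0) return [];
  const aggregateBand = band(p75(rows.map((r) => r.value)), bands);
  const bySegment = new Map();
  for (const { segment, value } of rows) {
    if (!bySegment.has(segment)) bySegment.set(segment, []);
    bySegment.get(segment).push(value);
  }
  const flagged = [];
  for (const [segment, values] of bySegment) {
    if (values.length < minSamples) continue; // thin cohort: percentile is unstable
    const segmentP75 = p75(values);
    if (band(segmentP75, bands) > aggregateBand) flagged.push({ segment, p75: segmentP75 });
  }
  return flagged;
}
```

The `minSamples` guard is the place where your sampling strategy shows up: raise it until the flagged list stops changing from day to day.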
Cross-browser failure modes and gotchas
- Safari PerformanceObserver gaps. WebKit historically did not implement largest-contentful-paint, layout-shift, or event timing, so Safari versions without that support emit no LCP, CLS, or INP. The web-vitals library feature-detects via PerformanceObserver.supportedEntryTypes and silently skips unsupported metrics. Your pipeline must treat absent Safari LCP as “not measured,” never as zero, or you will poison the p75.
- bfcache restores. Covered above: a restore is a new measurement but the document object persists. Failing to reset navigationId and the flush guard produces collided or missing beacons for the second visit.
- Double-counting. The classic bug is registering callbacks twice (e.g., a bundle loaded by two code paths during a framework migration) or flushing on both visibilitychange and pagehide without the flushed guard. The symptom is a p75 that looks plausible but a beacon volume roughly double your page-view count.
- Unbuffered observers. Hand-rolling new PerformanceObserver(cb).observe({ type: 'largest-contentful-paint' }) without buffered: true loses every entry that fired before the script ran, which for LCP is usually all of them. This is the single most common reason a custom observer “reports nothing.”
- Cross-origin attribution. For cross-origin LCP images without Timing-Allow-Origin, the resource timing is opaque and attribution.resourceLoadDelay will be unavailable. Add the header on your CDN if you serve hero images cross-origin.
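For the debugging escape hatch, a raw observer that avoids both the unsupported-type and the unbuffered pitfalls can be sketched as follows. `observeBuffered` is a hypothetical helper, and the `PO` parameter exists only so the sketch can be exercised outside a browser; in a page it defaults to the global PerformanceObserver.

```javascript
// Observe one entry type with buffered replay, or return null if unsupported.
function observeBuffered(type, onEntry, PO = globalThis.PerformanceObserver) {
  const supported = PO?.supportedEntryTypes ?? [];
  if (!supported.includes(type)) return null; // treat as "not measured", never zero
  const observer = new PO((list) => {
    for (const entry of list.getEntries()) onEntry(entry);
  });
  // `type` (singular) plus `buffered: true` replays entries recorded before this call.
  observer.observe({ type, buffered: true });
  return observer;
}

// In a page, for debugging only:
// observeBuffered('largest-contentful-paint', (e) => console.log(e.startTime, e));
```

A null return is the signal to omit the metric from the beacon entirely rather than send a placeholder value.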
CI/CD gating
Field data confirms regressions after they ship; a lab gate stops them before. Run a synthetic Web Vitals check in CI against a representative page and fail the build when a metric crosses its threshold or regresses against the stored baseline beyond a tolerance. Keep the gate on lab values (deterministic) while monitoring field p75 (authoritative) — the two are complementary, not redundant.
#!/usr/bin/env bash
set -euo pipefail

# Collect a Lighthouse run as JSON, then assert thresholds on the metrics.
npx lighthouse "$DEPLOY_PREVIEW_URL" \
  --only-categories=performance \
  --output=json --output-path=./lh.json --quiet --chrome-flags="--headless"

node -e '
  const r = require("./lh.json").audits;
  const lcp = r["largest-contentful-paint"].numericValue;
  const cls = r["cumulative-layout-shift"].numericValue;
  const tbt = r["total-blocking-time"].numericValue; // INP proxy in lab
  const fail = [];
  if (lcp > 2500) fail.push(`LCP ${Math.round(lcp)}ms > 2500ms`);
  if (cls > 0.1) fail.push(`CLS ${cls.toFixed(3)} > 0.1`);
  if (tbt > 200) fail.push(`TBT ${Math.round(tbt)}ms > 200ms`);
  if (fail.length) { console.error("Web Vitals gate failed:\n " + fail.join("\n ")); process.exit(1); }
  console.log("Web Vitals gate passed.");
'
Total Blocking Time is the lab proxy for INP — there is no lab INP, since INP needs real interactions — so gate on TBT in CI and validate the real INP in the field. Tighten the gate gradually: start it as a non-blocking warning, establish a stable baseline, then promote it to a hard failure once the team trusts the signal.
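The script above checks absolute thresholds only. The baseline-with-tolerance check described earlier in this section could be sketched like this; `gate`, the `tolerance` default, and the shape of the `current`, `baseline`, and `limits` objects are assumptions for illustration, not part of Lighthouse.

```javascript
// Compare current lab metrics with a stored baseline. A metric fails when it
// exceeds its absolute limit or regresses by more than `tolerance` (a fraction).
function gate(current, baseline, limits, tolerance = 0.1) {
  const failures = [];
  for (const [name, limit] of Object.entries(limits)) {
    const value = current[name];
    if (value > limit) failures.push(`${name} ${value} exceeds limit ${limit}`);
    const base = baseline[name];
    // Note: a baseline of 0 makes any non-zero value count as a regression.
    if (base !== undefined && value > base * (1 + tolerance)) {
      failures.push(`${name} ${value} regressed beyond tolerance vs baseline ${base}`);
    }
  }
  return failures;
}

// Hypothetical values: LCP and TBT in ms, CLS unitless.
const limits = { lcp: 2500, cls: 0.1, tbt: 200 };
const baseline = { lcp: 2000, cls: 0.02, tbt: 140 };
const current = { lcp: 2300, cls: 0.02, tbt: 150 };
const failures = gate(current, baseline, limits);
```

In this example LCP is still under its 2500 ms limit but has regressed more than 10% against the baseline, which is exactly the case an absolute-only gate misses.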
FAQ
Why does INP only report when the page is hidden?
INP is the worst interaction latency of the visit, or a near-worst one (roughly the 98th percentile) on pages with many interactions, so its value is genuinely unknown until the visit ends. The library accumulates interaction latencies and reports the final INP on visibilitychange to hidden or pagehide. Reading INP on a timer gives you a premature, optimistically low value.
Should I use the standard build or the attribution build in production?
Use the attribution build. It costs roughly 1-2 KB gzipped over the standard build and turns every metric into an actionable diagnostic by pointing at the responsible element and timing breakdown. The size cost is negligible relative to the debugging time it saves.
Why register PerformanceObserver with buffered: true?
Performance entries — especially the LCP and paint entries — frequently fire before your instrumentation script executes. buffered: true replays those already-recorded entries into your callback, so deferring the web-vitals import does not lose the metrics that happened during early load. Without it, a custom observer typically reports nothing.
sendBeacon returned false — what do I do?
false means the browser refused to queue the beacon, usually because the in-flight payload exceeded the keepalive quota (64 KB across keepalive requests). Falling back to fetch(url, { keepalive: true }) covers a missing or refusing sendBeacon, but it draws on the same quota, so also reduce payload size by batching one beacon per navigation rather than per metric change.
Why is my beacon volume double the page-view count?
You are almost certainly double-counting: callbacks registered twice (common during a bundler migration) or a flush firing on both visibilitychange and pagehide without a guard. Add a single flushed boolean and ensure the web-vitals module is imported exactly once.
Related
- Using the web-vitals npm Library Correctly — version pinning, duplicate-emission pitfalls, and correct callback wiring.
- Debugging Web Vitals with the Attribution Build — reading attribution objects to trace each metric to its root cause.
- INP Tracking & Debugging — the interaction phases and how to break up the work that drives INP.
- Self-Hosted Beacon Collection — the ingestion endpoint, batching, and validation that receive these beacons.
- Framework Performance Instrumentation — wiring this capture path into Next.js, React, and Vue lifecycles.