INP Tracking & Debugging

Interaction to Next Paint (INP) is the responsiveness metric that replaced First Input Delay in March 2024, and it is the hardest Core Web Vital to instrument well because it observes the entire session, not a single load-phase event. As established in Core Web Vitals & Performance Metrics Fundamentals, field data is the only honest source of truth here: a synthetic run captures one scripted click, while INP reports the worst interaction a real user actually felt across hundreds of taps, key presses, and clicks. This page covers the full production loop — capturing event-timing entries with a PerformanceObserver, attributing slow interactions to their three constituent phases, segmenting field data at p75, and shipping fixes that break up the long tasks saturating the main thread.

INP supersedes FID for a structural reason. FID measured only input delay on the first interaction — the gap between the user’s tap and the moment the event handler began. That number was almost always good because handlers start quickly; it said nothing about whether the handler then blocked the main thread for 400ms or whether the resulting DOM mutation took three frames to paint. INP closes that gap by measuring the full interaction, from input to the next paint that reflects it, across every interaction in the session.

[Figure: Anatomy of an INP interaction. A timeline from user tap to next paint, split into input delay (main thread busy), processing (event handlers run), and presentation (layout and paint), with a long task over 50ms blocking the thread. INP = input delay + processing + presentation.]
Every INP interaction decomposes into three phases; a long task inflates whichever phase it overlaps. The web-vitals attribution build exposes this breakdown per interaction.

Threshold configuration

Google’s INP thresholds are fixed and apply at p75 across the sampled population. Treat them as the alerting bands for your dashboards, and map each band to a concrete engineering response rather than a generic “investigate”.

| Rating | INP at p75 | What it means | Engineering action |
| --- | --- | --- | --- |
| Good | ≤ 200 ms | Interactions feel instant | Hold the line; gate regressions in CI |
| Needs Improvement | 201–500 ms | Perceptible lag on the worst interactions | Profile the slowest interaction types; break up long tasks |
| Poor | > 500 ms | Visible jank; users perceive the page as broken | Treat as an incident; trace processing vs presentation split |

The headline number is always the p75 of the per-page-view INP values, never a mean, which a handful of fast page views will flatter into hiding real pain. Note that INP’s own per-page-view selection is already a partial outlier filter: web-vitals reports the highest-latency interaction for most page views, but for page views with many interactions it discards the worst few (one for every 50 interactions) so that a single thermal-throttle stall does not dominate the page’s score.
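As a rough illustration of the two rules above, the sketch below picks one page view's INP by skipping one worst interaction per 50, then aggregates page views with a nearest-rank p75. The function names and sample values are hypothetical, and the selection rule follows the behaviour described here rather than reproducing the web-vitals source.

```javascript
// Illustrative only: these helpers are assumptions, not web-vitals APIs.

// Pick one page view's INP from its interaction latencies (ms),
// skipping one worst interaction for every 50 interactions observed.
function selectInp(durations) {
  if (durations.length === 0) return undefined;
  const worstFirst = [...durations].sort((a, b) => b - a);
  const skip = Math.min(worstFirst.length - 1, Math.floor(durations.length / 50));
  return worstFirst[skip];
}

// Nearest-rank percentile; p is a fraction in (0, 1].
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil(p * sorted.length));
  return sorted[rank - 1];
}

// Map a value onto the fixed rating bands from the table above.
function rateInp(ms) {
  if (ms <= 200) return 'good';
  if (ms <= 500) return 'needs-improvement';
  return 'poor';
}

// Eight hypothetical page views: five fast ones and three painful ones.
const pageViewInps = [20, 20, 20, 20, 20, 300, 320, 350];
const mean = pageViewInps.reduce((sum, v) => sum + v, 0) / pageViewInps.length;
const p75 = percentile(pageViewInps, 0.75);
console.log({ mean, meanRating: rateInp(mean), p75, p75Rating: rateInp(p75) });
```

With this sample the mean lands in the Good band while the p75 does not, which is the masking effect the paragraph warns about.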

Measurement implementation

INP is measured from the event entry type of the PerformanceObserver API. Unlike LCP or CLS entries, event-timing entries are only reported for events that last at least the durationThreshold, which defaults to 104ms, so you must lower it explicitly to see faster interactions. Request buffered: true as well, and add a separate first-input observer: that entry is reported regardless of duration, so it covers a first interaction that falls under the threshold or happens before your observer is wired up. The lowest durationThreshold the spec honours is 16ms; setting it to 16 captures essentially every interaction worth scoring.

// Raw event-timing capture — the mechanism web-vitals wraps for you.
const interactions = new Map();

function recordInteraction(entry) {
  // interactionId groups the pointerdown/pointerup/click of one logical tap.
  if (!entry.interactionId) return;
  const existing = interactions.get(entry.interactionId);
  if (!existing || entry.duration > existing.duration) {
    interactions.set(entry.interactionId, entry);
  }
}

// Catch the very first interaction, which may precede observer setup.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) recordInteraction(entry);
}).observe({ type: 'first-input', buffered: true });

// Catch every subsequent interaction. durationThreshold:16 is the floor.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) recordInteraction(entry);
}).observe({ type: 'event', durationThreshold: 16, buffered: true });

In production you should not hand-roll the INP selection algorithm. The web-vitals attribution build does the interaction grouping, the worst-of-N selection, and — critically — the phase attribution. Wire onINP to a buffered telemetry dispatch that finalises on visibilitychange and pagehide, the only events that reliably fire when a tab is backgrounded or closed on mobile.

import { onINP } from 'web-vitals/attribution';

const ENDPOINT = '/api/v1/metrics';
let buffer = [];

function flush() {
  if (buffer.length === 0) return;
  // sendBeacon survives unload; see the beacon collection cluster for the endpoint side.
  navigator.sendBeacon(ENDPOINT, JSON.stringify({ metrics: buffer }));
  buffer = [];
}

onINP((metric) => {
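  // Drop bots before they pollute percentiles. Do not test
  // document.visibilityState here: with reportAllChanges false, web-vitals
  // delivers the final value as the page is being hidden, so a 'hidden'
  // check would discard almost every report.
  if (navigator.webdriver) return;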

  const a = metric.attribution; // present only in the attribution build
  buffer.push({
    value: metric.value,
    rating: metric.rating,            // 'good' | 'needs-improvement' | 'poor'
    interaction_type: a.interactionType,        // 'pointer' | 'keyboard'
    target: a.interactionTarget,                // CSS selector of the element
    input_delay: a.inputDelay,
    processing_duration: a.processingDuration,
    presentation_delay: a.presentationDelay,
    loaf_count: (a.longAnimationFrameEntries ?? []).length, // LoAF entries overlapping the interaction, not individual scripts
    nav_type: metric.navigationType,            // 'navigate' | 'back-forward' | ...
    device_tier: navigator.hardwareConcurrency <= 4 ? 'low' : 'high',
  });
}, { reportAllChanges: false });

// Finalise on the events that actually fire when a mobile tab dies.
addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flush();
});
addEventListener('pagehide', flush);

The three attribution fields — inputDelay, processingDuration, presentationDelay — are the entire game. They tell you which phase to fix before you ever open DevTools, and they map directly to the timeline diagram above. A high inputDelay means the main thread was already busy when the user tapped (a long task from elsewhere); a high processingDuration means your handler is doing too much synchronous work; a high presentationDelay points at layout thrash or oversized DOM updates, which often correlates with CLS Reduction Strategies work since both stem from expensive style and layout recalculation.
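To make the phase arithmetic concrete, here is a minimal sketch that derives the three phases from the timestamps on a single Event Timing entry and names the dominant one. It is an approximation under stated assumptions: the attribution build groups several entries per interaction, so its values can differ, and `phasesOf`, `dominantPhase`, and `FIX_HINT` are hypothetical names.

```javascript
// Sketch only: single-entry approximation of the three INP phases.
function phasesOf(entry) {
  return {
    // Time the input waited before any handler started.
    inputDelay: entry.processingStart - entry.startTime,
    // Time spent running event handlers.
    processingDuration: entry.processingEnd - entry.processingStart,
    // duration runs from startTime to the next paint (browsers round it coarsely).
    presentationDelay: Math.max(0, entry.startTime + entry.duration - entry.processingEnd),
  };
}

// Name of the phase that contributes the most time.
function dominantPhase(phases) {
  return Object.entries(phases).reduce((a, b) => (b[1] > a[1] ? b : a))[0];
}

// Hypothetical hints paraphrasing the paragraph above.
const FIX_HINT = {
  inputDelay: 'main thread was busy before the input: find the earlier long task',
  processingDuration: 'handler does too much synchronous work: shrink it or yield',
  presentationDelay: 'rendering is expensive: reduce layout work and DOM updates',
};

// Made-up entry: the tap waited 180ms before its handler ran.
const sampleEntry = { startTime: 1000, processingStart: 1180, processingEnd: 1220, duration: 264 };
const samplePhases = phasesOf(sampleEntry);
console.log(samplePhases, FIX_HINT[dominantPhase(samplePhases)]);
```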

Debugging workflow

When p75 INP crosses into Needs Improvement, run a repeatable workflow rather than guessing. The attribution fields collected above turn this from a hunt into a lookup.

  1. Identify the worst interaction class. Group field INP by target selector and interaction_type, then rank by p75. One or two component selectors usually dominate. Filter to the device tiers where the regression actually appears.
  2. Split by phase. For the worst class, compute the p75 of input_delay, processing_duration, and presentation_delay separately. The dominant phase tells you what kind of fix you need — yield work, shrink the handler, or reduce the render.
  3. Trace the waterfall. Reproduce locally and record a Performance trace. If input_delay dominates, look for a long task before the input timestamp — frequently a third-party script or a hydration burst. The Long Animation Frames (LoAF) entries in the attribution payload point straight at the script and source location.
  4. Correlate overlaps. Check whether the slow interaction overlaps a layout shift or an LCP candidate render. Interactions during page settle often inherit delay from unrelated load-phase work; this is the overlap between INP and LCP Measurement & Optimization.
  5. Validate the fix in the lab. Confirm the local trace shows the long task broken up and the handler shortened, then re-record to verify the next-paint moves earlier.
  6. Deploy and monitor the delta. Ship behind a flag, then watch the p75 INP delta in field data for the affected segment over the following days. Lab improvements that do not move the field p75 mean you fixed the wrong interaction.
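Steps 1 and 2 of the workflow can be sketched as a small aggregation over beacon records. The record shape reuses the field names from the onINP payload earlier on this page; the helper names and sample rows are hypothetical, and a real pipeline would more likely run this as a query in the RUM store.

```javascript
// Sketch: rank interaction classes by p75 INP and find each class's dominant phase.
function p75(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.max(1, Math.ceil(0.75 * sorted.length)) - 1];
}

const PHASES = ['input_delay', 'processing_duration', 'presentation_delay'];

function rankInteractionClasses(records) {
  // Step 1: group by target selector and interaction type.
  const groups = new Map();
  for (const record of records) {
    const key = `${record.target} [${record.interaction_type}]`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  // Step 2: per group, compute p75 of the total and of each phase.
  return [...groups.entries()]
    .map(([key, rows]) => {
      const phaseP75 = Object.fromEntries(
        PHASES.map((name) => [name, p75(rows.map((row) => row[name]))]),
      );
      const dominant = PHASES.reduce((a, b) => (phaseP75[b] > phaseP75[a] ? b : a));
      return { key, samples: rows.length, p75: p75(rows.map((row) => row.value)), phaseP75, dominant };
    })
    .sort((a, b) => b.p75 - a.p75); // worst class first
}

// Made-up field records for two components.
const fieldRecords = [
  { target: '#search', interaction_type: 'keyboard', value: 420, input_delay: 30, processing_duration: 350, presentation_delay: 40 },
  { target: '#search', interaction_type: 'keyboard', value: 380, input_delay: 20, processing_duration: 320, presentation_delay: 40 },
  { target: '#menu', interaction_type: 'pointer', value: 120, input_delay: 60, processing_duration: 30, presentation_delay: 30 },
  { target: '#menu', interaction_type: 'pointer', value: 90, input_delay: 40, processing_duration: 20, presentation_delay: 30 },
];
const ranked = rankInteractionClasses(fieldRecords);
console.log(ranked.map(({ key, p75: value, dominant }) => ({ key, p75: value, dominant })));
```

Filtering `fieldRecords` by `device_tier` before calling the function covers the last part of step 1.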

For the deeper triage of intermittent, hard-to-reproduce regressions — session-scoped attribution capture, LoAF stack collection, and isolating which deploy moved the number — see Debugging INP Spikes in Production Environments.

Field-data analysis patterns

INP is unusually sensitive to who the user is and what device they hold, so an aggregate p75 hides the real story. Always segment before you draw conclusions, because the metric is dominated by the slowest cohort of cheap hardware.

| Segment axis | What divergence to watch | Likely cause |
| --- | --- | --- |
| Device class | Low-tier mobile p75 2–4× desktop | Main-thread saturation; CPU-bound processing phase |
| Network type | Slow 3G/4G inflates inputDelay early in session | Late-arriving scripts blocking the thread during settle |
| Geography | One region far worse at equal device tier | Region-specific third-party tags or edge latency |
| Interaction type | keyboard worse than pointer | Synchronous input handlers / uncontrolled React inputs |
| Navigation type | back-forward-cache better than navigate | bfcache restores skip the costly hydration path |

The pattern that catches teams out: a desktop-heavy analytics view shows a perfectly Good INP while the field p75 sits in Needs Improvement. That divergence is almost always low-tier Android, because the main thread there is slow enough that the same handler that finishes in 30ms on a laptop takes 250ms. Weight your dashboards by your real traffic mix, and always read the headline at p75 of the sampled population rather than a convenience subset.

Optimization strategies

Every INP fix reduces to one of three moves: shorten the input delay by clearing long tasks off the thread, shorten processing by yielding, or shorten presentation by rendering less. The highest-leverage technique is breaking up long tasks so the browser can service input between chunks.

scheduler.yield() is the modern primitive: it yields to the browser but returns ahead of other queued tasks, so your continuation is not starved the way a setTimeout(0) continuation can be. It is available in recent Chromium; everywhere else you fall back to a setTimeout macrotask, accepting that the continuation goes to the back of the queue.

// Yield between chunks so pending input can be serviced mid-computation.
async function yieldToMain() {
  if ('scheduler' in window && 'yield' in scheduler) {
    return scheduler.yield();
  }
  return new Promise((resolve) => setTimeout(resolve, 0));
}

async function processInChunks(items, work) {
  let lastYield = performance.now();
  for (const item of items) {
    work(item);
    // Only consider yielding once we have held the thread past the 50ms long-task budget.
    if (performance.now() - lastYield > 50) {
      // isInputPending (Chromium-only) lets us skip the yield while no input is waiting;
      // where the API is missing, the `?? true` fallback yields every time.
      const scheduling = navigator.scheduling;
      if (scheduling?.isInputPending?.() ?? true) {
        await yieldToMain();
        lastYield = performance.now();
      }
    }
  }
}

For genuinely CPU-bound work — parsing, sorting, diffing large datasets — yielding only spreads the pain; the right move is a Web Worker so the computation never touches the main thread at all. Keep the worker boundary coarse: post the whole job and receive the result, rather than chatting per item, since structured-clone serialization across the boundary is itself a cost.

// Offload heavy work entirely; the main thread stays free for input.
const worker = new Worker(new URL('./inp-worker.js', import.meta.url), { type: 'module' });

document.querySelector('#run-report').addEventListener('click', () => {
  worker.postMessage({ rows: largeDataset });
});
worker.addEventListener('message', (e) => {
  // Paint the result; the click handler returned in microseconds.
  renderReport(e.data.result);
});
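The main-thread snippet above posts `{ rows }` and reads `e.data.result`; a worker that fits that contract might look like the sketch below. The file name inp-worker.js comes from that snippet, while `buildReport` and the row shape (`category`, `amount`) are hypothetical stand-ins for whatever heavy computation the report needs.

```javascript
// inp-worker.js (sketch). buildReport and the row shape are assumptions.

// CPU-heavy step: total the amounts per category, largest total first.
function buildReport(rows) {
  const totals = new Map();
  for (const row of rows) {
    totals.set(row.category, (totals.get(row.category) ?? 0) + row.amount);
  }
  return [...totals.entries()]
    .map(([category, total]) => ({ category, total }))
    .sort((a, b) => b.total - a.total);
}

// Only wire the message handler when actually running inside a worker,
// so the same file can be loaded elsewhere (for example in a unit test).
const inWorker = typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope;
if (inWorker) {
  self.addEventListener('message', (e) => {
    // One coarse message in, one coarse message out.
    self.postMessage({ result: buildReport(e.data.rows) });
  });
}
```

Keeping the computation in a plain function and the messaging in a thin wrapper makes the heavy part testable without spinning up a worker.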

The measured impact is concrete. A search-as-you-type handler that ran a synchronous filter over 8,000 rows produced a 380ms processing phase on mid-tier Android. Moving the filter into chunked work with scheduler.yield() cut the per-interaction processing to under 50ms because the browser repainted the input value between chunks; field p75 INP for that component dropped from 410ms to 180ms. The detailed before/after walkthrough lives in Breaking Up Long Tasks with scheduler.yield.

Failure modes & gotchas

INP instrumentation fails quietly more often than it errors loudly. These are the traps that produce wrong numbers rather than exceptions.

  • SPA route-transition accumulation. web-vitals scopes INP to the document’s lifetime (a hard navigation or a bfcache restore), not to client-side route changes. In a client-routed app you must either use web-vitals’ experimental soft-navigation support or keep your own per-route accumulator; otherwise a single long-lived SPA session reports one inflated INP spanning every route the user visited, and your worst route poisons every other route’s number.
  • Patchy support outside Chromium. INP depends on event entries that carry an interactionId, and for most of the metric’s life only Chromium shipped that. Firefox has long exposed event entries but without interactionId, and Safari support is recent at best, so on browser versions that lack it onINP simply never fires. Your field INP is therefore a predominantly Chromium metric; do not read its absence as “fast”, because it is “unmeasured”. Account for this in your traffic-mix weighting.
  • Background-tab suspension. A tab restored from the background can register a giant artificial delay, and a suspended tab may never get another chance to send. Finalising on visibilitychange ensures the real last interaction is sent before suspension. Do not try to filter these cases by checking document.visibilityState === 'hidden' inside the onINP callback: web-vitals delivers its final value as the page is being hidden, so that check would discard almost every report.
  • Outliers from thermal throttling. A single overheating device can emit a 2-second INP. The per-session worst-of-N discard inside web-vitals absorbs most of this, but verify your aggregation reads p75 and not max, or one bad device defines your dashboard.
  • durationThreshold set too high. If you raise durationThreshold above 16 to cut volume, you silently exclude faster-but-still-poor interactions and bias the metric optimistic. Sample by session, not by raising the duration floor.
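Sampling by session, as the last item recommends, can be done deterministically so that every beacon from one session shares the same decision. The sketch below hashes a session id into [0, 1) and compares it with the sample rate. Both helper names are hypothetical, and where the session id comes from (for example a random id kept in sessionStorage) is an assumption left to the caller.

```javascript
// Sketch: deterministic per-session sampling. Helper names are hypothetical.

// FNV-1a style 32-bit hash over the id's UTF-16 code units.
function hashSessionId(sessionId) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < sessionId.length; i++) {
    hash ^= sessionId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// True when the session falls inside the sampled fraction (0 to 1).
function isSessionSampled(sessionId, sampleRate) {
  return hashSessionId(sessionId) / 0x100000000 < sampleRate;
}

// Decide once per session, then keep durationThreshold at 16 for sampled sessions.
const exampleSessionId = 'session-3f9a'; // made-up id
console.log(isSessionSampled(exampleSessionId, 0.1));
```

Because the decision depends only on the id, a sampled session contributes all of its interactions, fast and slow, which keeps the percentile unbiased in a way that raising the duration floor does not.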

CI/CD integration

INP cannot be asserted from a synthetic load test the way LCP can, because it requires interaction. Gate it two ways. In the lab, drive a scripted interaction (Playwright page.click on the known-slow component) and assert the Total Blocking Time and longest-task duration stay under budget — TBT is the best lab proxy for INP regressions. In the field, gate on the delta: after each release, compare the new build’s p75 INP for the top interaction targets against the previous build’s baseline and fail the rollout if it regresses past a threshold.
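For the lab gate, the budget check itself is simple arithmetic once the test has collected main-thread task durations around the scripted interaction. The sketch below computes Total Blocking Time as the sum of each task's time beyond 50ms and compares it, along with the longest task, against a budget. How the durations are gathered inside the Playwright test is not shown, and `checkLabBudget` and the budget field names are hypothetical.

```javascript
// Sketch: lab-side budget check from main-thread task durations (ms).
const LONG_TASK_MS = 50;

// TBT: for every task, count only the portion beyond the 50ms long-task mark.
function totalBlockingTime(taskDurations) {
  return taskDurations.reduce((sum, d) => sum + Math.max(0, d - LONG_TASK_MS), 0);
}

// Compare TBT and the longest task against a (hypothetical) budget object.
function checkLabBudget(taskDurations, budget) {
  const tbt = totalBlockingTime(taskDurations);
  const longestTask = taskDurations.length > 0 ? Math.max(...taskDurations) : 0;
  const pass = tbt <= budget.maxTbtMs && longestTask <= budget.maxLongestTaskMs;
  return { tbt, longestTask, pass };
}

// Made-up durations recorded around one scripted click.
const labResult = checkLabBudget([30, 120, 70, 260], { maxTbtMs: 200, maxLongestTaskMs: 200 });
console.log(labResult);
```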

# Lab gate: fail the build if the scripted interaction blocks the thread too long.
npx playwright test inp-budget.spec.ts
# inside the test, assert longest-task < 200ms via the Long Animation Frames API.

# Field gate: compare new-build p75 INP against baseline from the RUM store.
node ./ci/check-inp-regression.js --build "$GIT_SHA" --baseline-days 7 --max-delta-ms 30

Pair the lab proxy with the field delta and you catch both the obvious regressions (a new synchronous handler) before merge and the subtle ones (a third-party tag that only inflates inputDelay on real devices) within a day of release.
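The comparison at the heart of a field gate might look like the sketch below. It is not the contents of the check-inp-regression.js script referenced above, only an illustration of the delta rule: it takes two hypothetical maps of interaction target to p75 INP, already fetched from the RUM store, and returns the targets whose p75 grew by more than the allowed delta.

```javascript
// Sketch: flag interaction targets whose p75 INP regressed past a threshold.
// The map shapes and function name are assumptions for illustration.
function findRegressions(baselineP75, currentP75, maxDeltaMs) {
  const regressions = [];
  for (const [target, current] of Object.entries(currentP75)) {
    const baseline = baselineP75[target];
    if (baseline === undefined) continue; // new target: nothing to compare against
    const delta = current - baseline;
    if (delta > maxDeltaMs) regressions.push({ target, baseline, current, delta });
  }
  return regressions.sort((a, b) => b.delta - a.delta); // biggest regression first
}

// Made-up p75 values (ms) for the previous and the new build.
const baselineBuild = { '#search': 180, '#menu': 90, '#checkout': 210 };
const newBuild = { '#search': 240, '#menu': 95, '#checkout': 215, '#new-widget': 400 };
const regressions = findRegressions(baselineBuild, newBuild, 30);
console.log(regressions);
// A CI wrapper would typically set a non-zero exit code when this list is non-empty.
```

Targets with no baseline are skipped here rather than failed; whether a brand-new component should be held to the absolute thresholds instead is a policy choice.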

FAQ

Why did INP replace FID?

FID only measured the input delay of the first interaction — the time before the handler started — which was almost always good and ignored handler execution and rendering. INP measures the full interaction (input delay + processing + presentation) across the whole session, so it captures the slow handlers and janky paints that FID missed. The switch became official in March 2024.

What is the INP “Good” threshold?

INP is Good at ≤ 200 ms, Needs Improvement at 201–500 ms, and Poor above 500 ms, measured at p75 across your sampled page views. Always report the p75, never a mean — a few fast interactions will otherwise mask the worst ones users actually feel.

Why is my field INP missing for some users?

For most of INP’s life, the Event Timing support it needs (event entries carrying an interactionId) shipped only in Chromium. On Safari and Firefox versions without it, onINP never fires and those sessions contribute no INP. Treat missing data as unmeasured, not fast, and weight dashboards by your real browser mix.

Does scheduler.yield() actually lower INP?

Yes, when the bottleneck is the processing phase. Yielding mid-computation lets the browser service pending input and repaint between chunks, so the user-perceived interaction completes quickly even though total work is unchanged. For purely CPU-bound work, a Web Worker is better because it keeps the main thread entirely free.

How do I stop a single SPA session from inflating INP?

INP is scoped to the document’s lifetime, not to client-side routes. In a client-routed app, use web-vitals’ experimental soft-navigation support or reset your own accumulator on each route change; otherwise one long-lived session reports a single INP spanning every route, and the worst route poisons all the others.
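One way to keep your own accumulator is to bucket interactions by route and apply the same skip-one-per-50 selection to each bucket. The class below is a minimal sketch under that assumption: `RouteInpTracker` is a hypothetical name, and the caller is expected to feed it `interactionId` and `duration` from event-timing entries together with whatever route key the router exposes.

```javascript
// Sketch: per-route INP accumulator for a client-routed app (hypothetical class).
class RouteInpTracker {
  constructor() {
    this.routes = new Map(); // route -> Map(interactionId -> longest duration)
  }

  // Keep the longest entry per interaction, as several entries share one interactionId.
  record(route, interactionId, duration) {
    if (!interactionId) return; // entries without an interactionId are not interactions
    if (!this.routes.has(route)) this.routes.set(route, new Map());
    const interactions = this.routes.get(route);
    interactions.set(interactionId, Math.max(duration, interactions.get(interactionId) ?? 0));
  }

  // INP for one route: worst interaction, skipping one per 50 recorded.
  inpFor(route) {
    const interactions = this.routes.get(route);
    if (!interactions || interactions.size === 0) return undefined;
    const worstFirst = [...interactions.values()].sort((a, b) => b - a);
    const skip = Math.min(worstFirst.length - 1, Math.floor(worstFirst.length / 50));
    return worstFirst[skip];
  }
}

// Made-up interactions across two routes.
const tracker = new RouteInpTracker();
tracker.record('/products', 101, 80);
tracker.record('/products', 101, 120); // same interaction, longer entry wins
tracker.record('/products', 102, 40);
tracker.record('/checkout', 103, 300);
console.log(tracker.inpFor('/products'), tracker.inpFor('/checkout'));
```

Two caveats apply: only interactions above the observer's durationThreshold are seen, so the count that drives the skip rule is an underestimate, and the interaction that triggers a route change needs a deliberate choice about which route it is charged to.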