R2 — Initial HTML completeness

What it measures

Whether critical content — hours, eligibility, contact, schema, calls-to-action — renders in the HTML the server sends, before any JavaScript runs in the browser.

There are two phases to how a page loads:

Initial HTML — the document the server sends. Visible the instant the response arrives. View-Source in the browser shows this.
JS-injected — content that only appears after JavaScript executes in the browser (API fetches that fire after page load, framework hydration, click-to-load).

Humans with a browser see both phases as one page. Many AI agents only see phase one.

Why it matters

For reach

LLM-based agents (ChatGPT, Perplexity, AI Overviews, voice assistants) typically fetch raw HTML and do not execute JavaScript. If hours, eligibility, or services are JS-injected, agents see none of it.

For access

People on slow connections, in low-bandwidth areas, or using older devices see the initial HTML first. JS-injected content arrives late or not at all.

For trust

Agents cross-check JSON-LD schema against visible content as an anti-cloaking signal. Schema that claims hours while the visible page shows a loading state breaks that trust — the schema gets discounted or ignored.

Public-interest stakes: High. Community food banks, free legal clinics, shelters, and public health services often display hours and eligibility via CMS widgets that load client-side. Agents helping someone in crisis cannot confirm "is this place open right now?" — and route them to an aggregator's stale listing instead.

The scale

5: All critical content (identity, hours, eligibility, contact, schema) renders in the initial HTML response. Data is fetched server-side from a first-party source.
4: Identity and primary content in initial HTML. Some secondary content (reviews, related services) may lazy-load, but core info is present.
3: Identity in initial HTML, but hours, availability, or eligibility load via JavaScript after page mount.
2: Most content JS-injected. Agent sees a skeleton or loading state on first fetch.
1: Single-page-app shell with empty initial HTML. Agents see nothing useful before JavaScript runs.

The four common failure modes

These are the content types that most often fail R2 in audits:

Hours of operation

Hours fetched from a CMS API on page load. Agent asked "is the food bank open today?" cannot answer.

Eligibility / cost

Income limits or service criteria loaded from a content widget. Agent helping someone in crisis cannot confirm eligibility.

Availability / inventory

Appointment slots or shelter bed counts fetched at render time. Agent cannot route someone to an open spot.

Reviews / ratings

Review widget injected client-side. AggregateRating schema declares a count that cannot be verified against visible content.
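
The reviews failure mode can be checked mechanically. The following is a minimal sketch, not an established tool: it pulls JSON-LD blocks out of a raw HTML string and tests whether the declared reviewCount also appears in the text a reader could see without JavaScript. The function names are illustrative assumptions, and the regex-based parsing is deliberately naive (it ignores @graph arrays, and a bare substring match can be fooled by an unrelated number such as a phone number).

```typescript
// Pull every JSON-LD block out of a raw HTML string.
// Blocks that fail to parse are skipped rather than treated as fatal.
function extractJsonLd(html: string): unknown[] {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(match[1] ?? ""));
    } catch {
      // Ignore malformed JSON-LD.
    }
  }
  return blocks;
}

// Text visible without JavaScript: drop <script>/<style> bodies, then all tags.
function visibleText(html: string): string {
  return html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// True when some block's aggregateRating.reviewCount appears in the visible text.
function reviewCountIsVisible(html: string): boolean {
  const text = visibleText(html);
  return extractJsonLd(html).some((block) => {
    const rating = (
      block as { aggregateRating?: { reviewCount?: number | string } } | null
    )?.aggregateRating;
    return (
      rating?.reviewCount !== undefined &&
      text.includes(String(rating.reviewCount))
    );
  });
}
```

A page whose schema declares a count while the visible markup is only an empty widget container returns false here, which is the mismatch described above.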

How to implement

Step 1: Test what the agent sees today

Use the View Source method to confirm what is and is not in the initial HTML:

Open the page in a browser
Right-click → View Page Source (Ctrl+U on Windows/Linux; on Mac, Cmd+Option+U in Chrome or Cmd+U in Firefox). This is the initial HTML, not the rendered DOM in DevTools.
Press Cmd+F (Ctrl+F on Windows/Linux) and search for the content you expect: hours, address, service name, price, review text, schema block
If it is in the source: passes R2 for that field. If it is not: it is JS-injected and fails.

Why "View Source," not "Inspect": DevTools Inspect shows the rendered DOM after JavaScript runs — what humans see. View Source shows the raw response — what agents see. They diverge whenever JavaScript injects content.
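
The same check can be scripted. The sketch below is one possible approach, not a standard tool: given a raw HTML string, it reports which expected strings are missing. The function names and sample markup are illustrative assumptions. Matching is a plain substring test after whitespace and case normalization, so text split across tags or written with HTML entities will be reported as missing.

```typescript
// Collapse whitespace and ignore case so line breaks in the markup
// do not cause false failures.
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

// Return the expected strings that do NOT appear in the raw HTML.
function missingFromInitialHtml(html: string, expected: string[]): string[] {
  const haystack = normalize(html);
  return expected.filter((needle) => !haystack.includes(normalize(needle)));
}

// Hypothetical samples: an empty app shell and a server-rendered page.
const shellHtml = `<body><div id="app"></div><script src="/app.js"></script></body>`;
const serverRenderedHtml = `<body><main><h2>Hours</h2>
  <p>Mon-Fri   9am-6pm</p></main></body>`;

const missingInShell = missingFromInitialHtml(shellHtml, ["Mon-Fri 9am-6pm"]);
const missingInRendered = missingFromInitialHtml(serverRenderedHtml, ["Mon-Fri 9am-6pm"]);

// Usage against a live page (Node 18+ ships a global fetch, which returns the
// raw response and does not execute page JavaScript):
//   const html = await (await fetch("https://your-site.com/")).text();
//   missingFromInitialHtml(html, ["Mon-Fri 9am-6pm", "(217) 555-0100"]);
```

An empty result means every expected string is in the initial HTML; anything returned is a field that fails R2.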

Step 2: Move data fetching to the server

Whatever framework powers the site, the fix is the same: fetch data on the server and interpolate it into the HTML before the response leaves the server.

Next.js (App Router): Server Components render on the server by default; fetch data in async Server Components (generateMetadata covers the title and meta tags)
Next.js (Pages Router): getServerSideProps or getStaticProps; never useEffect for critical content
Nuxt 3: useFetch in <script setup> runs server-side during SSR
Astro: frontmatter await fetch() runs at build or request time
SvelteKit: load() functions run on the server during SSR; put server-only fetching in +page.server.js
Plain HTML / CMS: render server-side from the CMS data; do not lazy-load via widgets

The anti-pattern is fetching critical data inside useEffect, onMounted, or any browser-lifecycle hook. That code runs in the browser, after JavaScript loads — well after the agent has read the HTML and moved on.
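
Stripped of any framework, the pattern looks roughly like the sketch below: load the data on the server, then interpolate it into the HTML string that becomes the response body. This is an illustration under stated assumptions, not any framework's API; loadHours is a hypothetical stand-in for a CMS or database query, and the type and function names are invented for the example.

```typescript
interface OpeningHours {
  days: string;   // e.g. "Mon-Fri"
  opens: string;  // e.g. "9am"
  closes: string; // e.g. "6pm"
}

// Escape text before placing it in HTML so CMS content cannot break the markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Runs on the server: the returned string is part of the response body,
// so the hours are present before any browser JavaScript executes.
function renderHoursSection(hours: OpeningHours[]): string {
  const rows = hours
    .map(
      (h) =>
        `  <p>${escapeHtml(h.days)} ${escapeHtml(h.opens)}-${escapeHtml(h.closes)}</p>`,
    )
    .join("\n");
  return [
    `<section aria-labelledby="hours-heading">`,
    `  <h2 id="hours-heading">Hours</h2>`,
    rows,
    `</section>`,
  ].join("\n");
}

// Hypothetical server-side loader standing in for a CMS or database query.
async function loadHours(): Promise<OpeningHours[]> {
  return [
    { days: "Mon-Fri", opens: "9am", closes: "6pm" },
    { days: "Saturday", opens: "9am", closes: "2pm" },
  ];
}

// The data is awaited before the response is built, never after page mount.
async function renderPage(): Promise<string> {
  const hours = await loadHours();
  return `<main>\n${renderHoursSection(hours)}\n</main>`;
}
```

Each framework listed above wraps this same idea in its own conventions; the point is only that the await happens on the server.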

Step 3: Architect the data flow

Server-side fetching only works if the data is available server-side. Common architecture problems:

Third-party widgets (Yelp review widget, Google Reviews badge, Trustpilot embed) inject content client-side by design. They cannot be made server-renderable. The fix is to collect first-party reviews or fetch the third-party data server-side.
CMS preview mode left enabled in production — the bridge JS that lets editors preview changes also defeats SSR. Disable preview mode on production builds.
Live API calls per page render — expensive at scale and often the reason a team chose lazy-loading. Cache server-side instead (ISR, request-level caching, or a periodic sync into a first-party database).
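
For the last problem, request-level caching can be as small as the sketch below. It is a simplified in-memory example, assuming a single server process; cachedLoader and its parameters are illustrative names, not a library API. The clock is injectable so the behavior can be checked without waiting for the TTL to elapse.

```typescript
// Wrap an expensive loader so repeated page renders reuse a recent result.
function cachedLoader<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = () => Date.now(),
): () => T {
  let cached: { value: T; expiresAt: number } | undefined;
  return () => {
    const t = now();
    if (cached === undefined || t >= cached.expiresAt) {
      // Cache miss or expired entry: call the upstream loader once.
      cached = { value: load(), expiresAt: t + ttlMs };
    }
    return cached.value;
  };
}

// Demonstration with a fake clock and a counter in place of a real API call.
let upstreamCalls = 0;
let fakeClock = 0;
const getHours = cachedLoader(
  () => {
    upstreamCalls += 1;
    return "Mon-Fri 9am-6pm";
  },
  60000,
  () => fakeClock,
);

getHours();
getHours(); // served from cache
const callsWithinTtl = upstreamCalls;

fakeClock = 60000; // TTL has elapsed
getHours(); // triggers a reload
const callsAfterTtl = upstreamCalls;
```

With an async loader, T is a Promise and the promise itself is cached; a production version would also need to avoid caching rejected promises.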

Step 4: Verify with JavaScript disabled

Final check before sign-off:

Open the page in Chrome / Firefox
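In Chrome: open DevTools → Command Menu (Cmd+Shift+P, or Ctrl+Shift+P on Windows/Linux) → "Disable JavaScript". In Firefox: open DevTools → Settings → Advanced settings → "Disable JavaScript".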
Reload the page
Confirm all critical content is still visible — hours, eligibility, contact, services, CTAs
Re-enable JavaScript before continuing your audit

Data source rules

Where data comes from determines whether it can be in the initial HTML at all. The recommended pattern for any site pulling data from third-party APIs (Google Places, Yelp Fusion, third-party CMSes):

Third-party API → first-party database (sync) → page render

The third-party API is a sync source, not a render-time dependency. A scheduled job (nightly cron) updates the first-party database. The page renders from the first-party data only — fast, ToS-clean, predictable.
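
A minimal sketch of that flow, under stated assumptions: a Map stands in for the first-party database, fetchFromThirdParty is a placeholder for whatever API client the site uses, and all type and function names are hypothetical. The point it illustrates is that the render path reads only from the first-party store.

```typescript
interface PlaceData {
  placeId: string;
  name: string;
  hours: string[]; // display-ready lines, e.g. "Mon-Fri 9am-6pm"
}

interface StoredPlace extends PlaceData {
  syncedAt: number; // epoch milliseconds of the last successful sync
}

// Hypothetical first-party store; a Map stands in for a real database table.
const firstPartyDb = new Map<string, StoredPlace>();

// Write one synced record into the first-party store.
function upsertPlace(remote: PlaceData, now: number): void {
  firstPartyDb.set(remote.placeId, { ...remote, syncedAt: now });
}

// Scheduled job (for example a nightly cron): the only code that talks
// to the third-party API.
async function syncPlace(
  placeId: string,
  fetchFromThirdParty: (id: string) => Promise<PlaceData>,
): Promise<void> {
  upsertPlace(await fetchFromThirdParty(placeId), Date.now());
}

// Page render: reads first-party data only, never the third-party API.
function renderHoursFromDb(placeId: string): string {
  const record = firstPartyDb.get(placeId);
  if (record === undefined) {
    return "<p>Hours unavailable. Please call to confirm.</p>";
  }
  // Values are assumed to be plain text already; escape them in real code.
  return record.hours.map((line) => `<p>${line}</p>`).join("\n");
}
```

Because rendering never waits on the third party, an outage or rate limit upstream delays the next sync but does not empty the page.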

Why this matters for Google Places specifically

Google Places data is licensed by Google, not owned by the site operator. Their Terms of Service impose caching limits that collide with standard SSR / SSG patterns:

Field | Cache policy | Render implication
Place IDs | Indefinite | Safe to store
Address, hours, phone | Up to 30 days | OK for SSG / cached SSR
Reviews and ratings | No caching permitted | Cannot be in static HTML; fresh fetch per request; must attribute
Photos | Per attribution rules | Special handling

Reviews and ratings are the pinch point. Under these terms they cannot be cached, so they cannot be in static HTML, so they fail R2 unless fetched fresh server-side per request (expensive) or collected first-party (the recommended path).
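
One way to keep a sync job inside these limits is to encode the table as data and consult it before serving a stored value. The sketch below is an assumption-laden illustration: the field names and the mayServeFromCache helper are invented for the example, the numbers are copied from the table in this guide rather than from Google's terms directly, and photos are left out because their handling is not a simple age limit.

```typescript
type PlaceField = "placeId" | "address" | "hours" | "phone" | "reviews" | "ratings";

// Maximum cache age per field, in days, mirroring the table above.
// Infinity = may be stored indefinitely; 0 = fetch fresh on every request.
const MAX_CACHE_DAYS: Record<PlaceField, number> = {
  placeId: Infinity,
  address: 30,
  hours: 30,
  phone: 30,
  reviews: 0,
  ratings: 0,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when a value fetched at fetchedAtMs may still be served at nowMs.
// The comparison is strict, so a value exactly at the limit counts as expired.
function mayServeFromCache(
  field: PlaceField,
  fetchedAtMs: number,
  nowMs: number,
): boolean {
  const maxAgeMs = MAX_CACHE_DAYS[field] * MS_PER_DAY;
  return nowMs - fetchedAtMs < maxAgeMs;
}
```

Verify the limits against the current terms before relying on them; the structure matters more than the specific numbers.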

Three questions to ask before scoping

When working with a development team or vendor, these surface the data-flow decisions early:

Where does the data live today? Your own database, or pulled live from Google / Yelp / a third party? Determines whether the sync architecture above is new work or already in place.
Where do reviews and ratings come from? If a third-party widget, flag it — surfaces the architectural pinch point.
How often does hours / eligibility / availability change, and how does it get updated? Frequent changes argue for a sync pipeline; rare changes are fine with manual CMS entry.

Real example

A community food bank publishes hours on its website. The hours are stored in a content management system and rendered client-side by a JavaScript widget.

❌ Failing — Hours load via JavaScript

What an agent sees on first fetch (View Source)

<!-- What an agent sees on first fetch -->
<body>
  <div id="app">
    <!-- empty: hours load from API after page mount -->
  </div>
  <script src="/app.js"></script>
</body>

Agent output when asked "is the food bank open today?":
"I could not find current hours for this organization."

✓ Passing — Hours in initial HTML

What an agent sees on first fetch (View Source)

<!-- What an agent sees on first fetch -->
<body>
  <main>
    <h1>Community Food Bank — Springfield</h1>
    <section aria-labelledby="hours-heading">
      <h2 id="hours-heading">Hours</h2>
      <p>Mon-Fri 9am-6pm</p>
      <p>Saturday 9am-2pm</p>
    </section>
    <a href="tel:+12175550100">Call: (217) 555-0100</a>
  </main>
  <script type="application/ld+json">
    { "@context": "https://schema.org",
      "@type": "LocalBusiness",
      "name": "Community Food Bank",
      "openingHoursSpecification": [ ... ] }
  </script>
</body>

Agent output when asked "is the food bank open today?":
"Yes — Community Food Bank is open today 9am-6pm. You can call (217) 555-0100."

What changed

The food bank's CMS already had the hours. The fix was an architectural one, not a content one:

Render the hours block server-side at request time, populated from the CMS via API
Add the JSON-LD openingHoursSpecification block, also populated from the same CMS data
Remove the client-side widget — the same data now appears in initial HTML
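
Generating the JSON-LD from the same data that feeds the visible hours keeps the two from drifting apart. A minimal sketch, assuming the CMS data has already been mapped to schema.org day names and 24-hour times; HoursEntry and buildHoursJsonLd are hypothetical names, and the sample values used with it are illustrative rather than the food bank's actual schema.

```typescript
interface HoursEntry {
  dayOfWeek: string[]; // schema.org day names, e.g. "Monday"
  opens: string;       // 24-hour "HH:MM"
  closes: string;      // 24-hour "HH:MM"
}

// Build a JSON-LD script tag for server-side inclusion in the page.
function buildHoursJsonLd(name: string, hours: HoursEntry[]): string {
  const data = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    name,
    openingHoursSpecification: hours.map((h) => ({
      "@type": "OpeningHoursSpecification",
      dayOfWeek: h.dayOfWeek,
      opens: h.opens,
      closes: h.closes,
    })),
  };
  // Escape "<" so a value containing "</script>" cannot end the block early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

// Example input derived from the visible hours in the passing page.
const exampleJsonLd = buildHoursJsonLd("Community Food Bank", [
  {
    dayOfWeek: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    opens: "09:00",
    closes: "18:00",
  },
  { dayOfWeek: ["Saturday"], opens: "09:00", closes: "14:00" },
]);
```

Calling this in the same server-side render that produces the hours section means a single CMS edit updates both the visible text and the schema.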

FAQ

Does Google still see JS-rendered content? Why does R2 matter if Googlebot runs JavaScript?

Googlebot does render JavaScript — but on a delay (sometimes days) and only when it has rendering budget. JS-rendered content is second-class. More importantly, LLM-based agents (ChatGPT, Perplexity, Claude, voice assistants) typically do not run JavaScript at all. R2 is about agent reach, not just Google ranking.

We use a third-party review widget. Is there any way to keep it and still pass R2?

Reviews are the architectural pinch point. Three options: (1) collect first-party reviews and render server-side — the recommended path, cleanest across ToS, MX, and brand trust; (2) fetch the third-party reviews server-side per request and render with attribution — expensive but possible; (3) keep the JS widget and accept that reviews will not be in initial HTML — fails R2.

Will server-rendering everything make the site slow?

Not if it is architected well. Cached SSR (ISR, request-level caching) gives you the speed of static with the freshness of dynamic. The sites that struggle with SSR performance are usually fetching live from third-party APIs per request — the fix is to cache or sync, not to abandon SSR.

What about content that genuinely needs to be interactive — search, filters, login?

Interactive features can and should be JavaScript-powered. R2 is about critical declarative content — who, what, where, when, how to contact. A search input itself can be a hydrated client component, but the catalog it searches over should be rendered in initial HTML where possible.

My framework supports streaming HTML. Does that pass R2?

Yes, as long as the critical content streams as HTML, not as JavaScript that then renders HTML. Streaming server rendering (for example, React Server Components with Suspense) sends progressively more complete HTML to the browser, and agents that read the full response see the streamed HTML as it arrives. Streaming a "loading..." placeholder followed by client-side hydration of the real content is the failure mode.

How do I test this on a site I do not own?

Three quick tests: (1) curl https://the-site.com — what comes back is what an agent sees; (2) browser View Source then Cmd+F for critical content; (3) DevTools → Disable JavaScript → reload — what remains visible is what most agents will read.

Checklist

Use this to confirm R2 is implemented before moving on.

View Source on the live page shows hours, address, and phone in the raw HTML
View Source shows the JSON-LD <script type="application/ld+json"> block in the raw HTML (in <head> or <body>)
Disabling JavaScript in the browser still shows critical content
Data fetching for the page happens on the server (SSR, SSG, ISR, or RSC), not in onMounted / useEffect
If using Google Places API: data flows Google -> first-party DB (sync) -> page render, not Google -> page render direct
Reviews and ratings displayed visibly match the AggregateRating count declared in schema
No third-party review widgets (Yelp embed, Google Reviews widget) carrying schema-critical content
Tested with a JS-disabled fetch (curl https://your-site.com or browser DevTools "Disable JavaScript")
