MLI Methodology — MX for Good

Scoring model

Per criterion

Each criterion is scored on a 1–5 rubric defined in its section below.

5 Fully implemented; matches best practice

4 Mostly implemented; minor gaps

3 Partially implemented

2 Minimally implemented

1 Not implemented

A criterion may be skipped when its underlying feature doesn't apply to the page (no actionable forms on a static page; no time-sensitive content on an evergreen explainer). Skipped criteria are excluded from the pillar average — they don't default to 3, and they produce no findings.

Per pillar

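Each pillar score is the average of its scored criteria, normalized to a 0–100 scale. Because the lowest rubric score is 1, a pillar with at least one scored criterion cannot fall below 20: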

pillar_score = (sum_of_criterion_scores / (count_of_scored_criteria × 5)) × 100
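As a minimal sketch of the formula above, the function below computes one pillar score and leaves skipped criteria out of both the sum and the count. Representing a skipped criterion as None, and returning None when every criterion in a pillar is skipped, are assumptions made for illustration; the methodology does not say how an all-skipped pillar is handled.

```python
from typing import Optional, Sequence


def pillar_score(criterion_scores: Sequence[Optional[int]]) -> Optional[float]:
    """Return the pillar score on the 0-100 scale.

    None entries stand for skipped criteria (an assumed representation).
    """
    scored = [s for s in criterion_scores if s is not None]
    if not scored:
        return None  # assumption: nothing scored means no pillar score
    if any(s < 1 or s > 5 for s in scored):
        raise ValueError("criterion scores must be between 1 and 5")
    # Same arithmetic as the formula: sum / (count x 5) x 100.
    return 100 * sum(scored) / (len(scored) * 5)


print(pillar_score([4, 3, 5]))     # 80.0
print(pillar_score([4, None, 2]))  # 60.0 (the skipped criterion is ignored)
print(pillar_score([1, 1, 1]))     # 20.0 (lowest possible scored pillar)
```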

Total MLI

The total MLI is the average of the four pillar scores, range 0–100:

MLI = (Identity + Reachability + Structure + Currency) / 4

A reader shown Identity 75, Reachability 50, Structure 80, Currency 60 can see why the total is 66 (66.25 before rounding) and which pillar is dragging it down.
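The worked example can be checked in a few lines. This is only an arithmetic sketch; the function and variable names are illustrative.

```python
def total_mli(identity: float, reachability: float,
              structure: float, currency: float) -> float:
    """Unweighted mean of the four pillar scores."""
    return (identity + reachability + structure + currency) / 4


pillars = {"Identity": 75, "Reachability": 50, "Structure": 80, "Currency": 60}
mli = total_mli(*pillars.values())
weakest = min(pillars, key=pillars.get)

print(mli)      # 66.25
print(weakest)  # Reachability
```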

Bands

80–100 In the room: Agents can find, parse, trust, and act on this site reliably.

60–79 Partially audible: Agents surface this site sometimes; some claims are illegible.

40–59 Hard to surface: Agents can find the site but cannot reliably parse or act on it.

0–39 Not in the conversation: Agents cannot meaningfully read this site; another voice answers in its place.
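A band lookup might be sketched as follows. The bands are stated in whole numbers, so treating a fractional score such as 79.5 as belonging to the lower band is an assumption, not something the methodology specifies.

```python
# (lower bound, band name), highest band first.
BANDS = [
    (80, "In the room"),
    (60, "Partially audible"),
    (40, "Hard to surface"),
    (0, "Not in the conversation"),
]


def band_for(mli: float) -> str:
    """Map a total MLI score to its band name."""
    if not 0 <= mli <= 100:
        raise ValueError("MLI must be between 0 and 100")
    for lower_bound, name in BANDS:
        if mli >= lower_bound:
            return name
    raise AssertionError("unreachable: the last band starts at 0")


print(band_for(66.25))  # Partially audible
```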

Pillar 1

Identity

Who the site says it is. Whether the organization, its services, and its authority are declared in machine-readable form — or left for an agent to infer from third-party mentions.

I1 — Organization identity declaration

What it measures: Whether the organization is declared in machine-readable form via JSON-LD Organization or LocalBusiness schema, with verified identity links.

Why it matters: Without an Organization declaration, an agent infers who you are from third-party mentions — Yelp, Google Business, social platforms, news coverage. Other voices speak for you. With it, you author your own first-party identity claim.

Signals

JSON-LD contains Organization, LocalBusiness, NGO, or another organization subtype
Properties present: name, url, logo, sameAs, description, address
sameAs points to verified profiles or official records

Rubric

5 Organization/LocalBusiness/NGO with name, url, logo, description, and sameAs to multiple verified profiles

4 Organization with name, url, and sameAs to at least one verified profile

3 Organization present but minimal (name and url only)

2 Generic Organization without sameAs or logo

1 No Organization schema
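For orientation, here is one way a declaration aiming at rubric level 5 might be generated. The organization, its URLs, and its profile links are hypothetical placeholders; only the property names come from the signals listed above.

```python
import json

# Hypothetical organization; every value below is a placeholder.
organization = {
    "@context": "https://schema.org",
    "@type": "NGO",
    "name": "Example Legal Aid",
    "url": "https://www.example.org/",
    "logo": "https://www.example.org/logo.png",
    "description": "Free civil legal help for low-income residents.",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Baltimore",
        "addressRegion": "MD",
        "addressCountry": "US",
    },
    # sameAs should point at profiles the organization actually controls.
    "sameAs": [
        "https://www.linkedin.com/company/example-legal-aid",
        "https://www.facebook.com/examplelegalaid",
    ],
}

# JSON-LD is embedded in the page inside a script element of this type.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
print(script_tag)
```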

I2 — Service-type specificity

Very high public-interest stakes

What it measures: Whether the page declares the correct schema.org service-type for what the organization actually does — not just generic Organization.

Why it matters: An agent answering "where can I get free legal help in Maryland in Spanish" needs to filter on service type. Generic Organization schema offers no service-type axis to filter on; LegalService does. The same applies to GovernmentService, SocialService, MedicalClinic, and dozens of other subtypes. A clinic without a specific service-type schema ranks below service-typed competitors in service-type filters and sits in the long tail of undifferentiated Organization results, which makes it harder for agents to surface.

The same mechanism shapes commercial discovery. An agent answering "find me a same-day plumber in Oakland" needs Plumber or HomeAndConstructionBusiness — not a generic Organization. Sites that ship the right subtype rank above competitors that don't, in any service-typed query.

Signals

jsonLdTypes includes a service-specific type appropriate to the page content
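Properties expected for that type (e.g., a Service subtype such as GovernmentService should have serviceType, areaServed, provider; LegalService is a LocalBusiness subtype in schema.org, so it takes areaServed and address instead)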

Rubric

5 Specific service-type schema with required properties (areaServed, availableLanguage, audience, etc.)

4 Specific service-type with name and description

3 Specific service-type minimal (type declared, properties sparse)

2 Generic Organization where a specific subtype clearly applies

1 No service-type schema, or wrong type
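A rough check for the first signal could look like the sketch below. The allow-list is illustrative and incomplete (it contains only types named in this section), and the helper names are hypothetical; a real audit would also need to judge whether the declared type matches what the page actually offers.

```python
import json

# Illustrative, incomplete allow-list taken from the types named in this section.
SERVICE_SPECIFIC_TYPES = {
    "LegalService", "GovernmentService", "MedicalClinic",
    "Plumber", "HomeAndConstructionBusiness",
}


def json_ld_types(node):
    """Collect every @type value in parsed JSON-LD, including @graph and nested nodes."""
    found = set()
    if isinstance(node, dict):
        declared = node.get("@type", [])
        found.update([declared] if isinstance(declared, str) else declared)
        for value in node.values():
            found |= json_ld_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= json_ld_types(item)
    return found


def service_specific_types(json_ld_text):
    """Return the service-specific types a JSON-LD document declares."""
    return json_ld_types(json.loads(json_ld_text)) & SERVICE_SPECIFIC_TYPES


generic = '{"@context": "https://schema.org", "@type": "Organization", "name": "Example Clinic"}'
specific = '{"@context": "https://schema.org", "@graph": [{"@type": ["LegalService"], "name": "Example Clinic"}]}'

print(service_specific_types(generic))   # set()
print(service_specific_types(specific))  # {'LegalService'}
```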

I3 — Source authority and credentials

Very high public-interest stakes

What it measures: Whether the site declares the authority signals that distinguish authoritative voices from predatory ones — accreditation, parent organization, professional credentials, government TLD, free-access status.

Why it matters: Machine legibility alone doesn't distinguish a BIA-accredited immigration legal services organization from a notario. (Notarios are unlicensed practitioners who exploit a name collision with Latin American notarios públicos to advertise legal services they aren't qualified to provide.) Commercial immigration service providers — both licensed and unlicensed — invest in search marketing in ways that underfunded community legal aid organizations typically cannot. Source-authority signals let agents and agent users tell the difference.

Signals

Schema for hasCredential, accreditedBy, parentOrganization, funder, award
isAccessibleForFree: true where applicable
TLD signals (.gov, .edu, .org consistent with mission)
HTML links to accrediting bodies (BIA, ABA, state bars, accreditation councils)
nonprofitStatus for tax-exempt orgs

Rubric

5 hasCredential with verified accrediting body, parentOrganization where applicable, isAccessibleForFree where applicable, appropriate TLD

4 Some authority signals in schema (e.g., accreditation in JSON-LD + link to accrediting body)

3 Authority claims in HTML but not in structured data

2 Generic professional signals only (privacy policy, address, no credentials)

1 No authority signals

Pillar 2

Reachability

Whether an agent can fetch the site and parse its content. Whether the site declares a contract for AI access, serves content in initial HTML, and reaches users in their language.

R1 — Agent contract

What it measures: Whether the site declares an explicit contract for AI agents and crawlers.

Why it matters: Without agent directives in robots.txt, an agent has no signal about whether it's welcome, what it may fetch, or what's off-limits. llms.txt provides an AI-readable site index, and sitemap.xml aids discovery. These are the most direct findability signals, and most under-resourced sites lack them.

Signals

robots.txt accessibility (200 response at /robots.txt)
Presence of agent-specific User-agents (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, anthropic-ai, cohere-ai)
llms.txt or llms-full.txt at root
sitemap.xml declared in robots.txt or accessible at root

Rubric

5 robots.txt with explicit agent directives + llms.txt + sitemap.xml

4 robots.txt addresses agents + sitemap.xml (no llms.txt)

3 robots.txt and sitemap.xml exist but no agent-specific directives

2 Default/auto-generated robots.txt, no sitemap

1 No robots.txt, or robots.txt inadvertently blocks all agents
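The robots.txt signals can be approximated with Python's standard-library parser. This is a sketch under stated assumptions: the robots.txt content and the example.org sitemap URL are made up, the agent list is the one given above, and checking only the root path is a simplification.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ["GPTBot", "ClaudeBot", "Google-Extended",
             "PerplexityBot", "anthropic-ai", "cohere-ai"]

# Hypothetical robots.txt for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /admin/

Sitemap: https://www.example.org/sitemap.xml
"""


def agent_contract_signals(robots_txt: str) -> dict:
    """Summarize which AI agents are named, which are blocked at /, and declared sitemaps."""
    lines = robots_txt.splitlines()
    parser = RobotFileParser()
    parser.parse(lines)
    named = {
        line.split(":", 1)[1].strip().lower()
        for line in lines
        if line.lower().startswith("user-agent:")
    }
    return {
        "agents_named": [a for a in AI_AGENTS if a.lower() in named],
        "agents_blocked_at_root": [a for a in AI_AGENTS if not parser.can_fetch(a, "/")],
        "sitemaps": parser.site_maps() or [],  # site_maps() needs Python 3.8+
    }


print(agent_contract_signals(ROBOTS_TXT))
```

A file containing only "User-agent: *" and "Disallow: /" reports every listed agent as blocked at the root, which is the inadvertent-block case described at rubric level 1.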

R2 — Initial HTML completeness

What it measures: Whether substantive content — text, forms, action paths — is present in the initial HTML, or gated behind JavaScript or third-party iframes.

Why it matters: Agents that don't execute JavaScript miss any content added after load, so a SPA that ships an empty shell appears blank to them. Forms that live in third-party iframes can't be filled by an agent operating on the parent page. Sites built on frameworks that render primarily client-side, and pages whose action paths live in third-party iframes, often fail this criterion; server-rendered pages with native forms typically pass it.

Signals

Semantic element count in initial HTML
Form count (forms in initial HTML, not iframe-embedded)
Iframe count and what's inside them
noscriptHasContent (substantial content in <noscript> indicates JS dependency)

Rubric

5 All key content and forms in initial HTML; no iframe-gated transactions

4 Mostly static; minor JS enhancements

3 Mixed; some critical paths require JS

2 Most content JS-rendered, or noscript flags significant gating

1 Empty initial HTML / SPA shell with no static content
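The counts behind these signals can be gathered from the raw response body without executing any JavaScript. The sketch below uses the standard-library HTML parser; the three sample documents and the donate.example.com URL are hypothetical.

```python
from html.parser import HTMLParser

SEMANTIC = {"main", "nav", "header", "footer", "section", "article", "aside"}


class InitialHtmlSignals(HTMLParser):
    """Count semantic elements, native forms, and iframes in un-rendered HTML."""

    def __init__(self):
        super().__init__()
        self.semantic = 0
        self.forms = 0
        self.iframes = []  # src of each iframe; its contents are a separate document

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC:
            self.semantic += 1
        elif tag == "form":
            self.forms += 1
        elif tag == "iframe":
            self.iframes.append(dict(attrs).get("src"))


def initial_html_signals(html: str) -> dict:
    parser = InitialHtmlSignals()
    parser.feed(html)
    parser.close()
    return {"semantic": parser.semantic, "forms": parser.forms, "iframes": parser.iframes}


SPA_SHELL = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
SERVER_RENDERED = (
    "<html><body><header><nav></nav></header><main><article><h1>Intake</h1>"
    '<form action="/intake" method="post"><input name="email" type="email"></form>'
    "</article></main><footer></footer></body></html>"
)
IFRAME_GATED = '<html><body><main><iframe src="https://donate.example.com/embed"></iframe></main></body></html>'

print(initial_html_signals(SPA_SHELL))        # {'semantic': 0, 'forms': 0, 'iframes': []}
print(initial_html_signals(SERVER_RENDERED))  # {'semantic': 5, 'forms': 1, 'iframes': []}
print(initial_html_signals(IFRAME_GATED))
```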

R3 — Multilingual reach

Very high public-interest stakes

What it measures: Whether the site declares its language(s), serves alternate translations via hreflang, and declares language availability in service schema.

Why it matters: A site without explicit language declarations cannot be reliably identified or prioritized in language-filtered queries. Agents must infer language from content, creating slower, lower-confidence matches that rank below sites with declared language metadata. This is the single most important signal for the Notice-to-Appear scenario, and for any cross-language service-finding case. Community legal aid clinics frequently translate content but fail to mark translations with hreflang, losing ranking advantages in language-specific queries and reducing discoverability to speakers of those languages.

Commercial sites face the same penalty in cross-language queries. A SaaS company serving Latin American customers without hreflang on its Spanish docs ranks below competitors that have it — even when its product is better.

Signals

<html lang="..."> set
<link rel="alternate" hreflang="..."> to translated pages
availableLanguage property in service schema
contactPoint.availableLanguage for service contacts

Rubric

5 html[lang] set, hreflang to all translated versions, availableLanguage in service schema

4 html[lang] and hreflang present, no availableLanguage in schema

3 html[lang] set, no hreflang (monolingual or translations not linked)

2 html[lang] missing, no translations

1 No language signals at all

Skip-if: Organization is genuinely English-only and its served audience is too. This rarely applies in US service contexts and should be a documented exception.
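The two HTML-level signals (html[lang] and hreflang alternates) can be read with a small parser, sketched below. The sample page and its example.org URLs are hypothetical, and the availableLanguage check in schema is left out of this sketch.

```python
from html.parser import HTMLParser


class LanguageSignals(HTMLParser):
    """Collect the html lang attribute and any hreflang alternates."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.hreflangs = {}  # hreflang value -> href

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.html_lang = attributes.get("lang")
        elif tag == "link":
            rel_tokens = (attributes.get("rel") or "").lower().split()
            if "alternate" in rel_tokens and attributes.get("hreflang"):
                self.hreflangs[attributes["hreflang"]] = attributes.get("href")


PAGE = """<html lang="en"><head>
<link rel="alternate" hreflang="en" href="https://www.example.org/en/intake">
<link rel="alternate" hreflang="es" href="https://www.example.org/es/intake">
</head><body></body></html>"""

signals = LanguageSignals()
signals.feed(PAGE)
print(signals.html_lang)          # en
print(sorted(signals.hreflangs))  # ['en', 'es']
```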

Pillar 3

Structure

How readable the site's claims are once parsed. Whether the page declares its type, its hierarchy, and its sections in ways that map to extractable answers.

S1 — Page type and breadcrumb

What it measures: Whether the page declares its schema.org type (Article, WebPage, ContactPage, FAQPage, Service, etc.) and provides BreadcrumbList for hierarchical context.

Why it matters: An agent extracting an answer needs to know what kind of page it's reading. A FAQPage has different extraction logic than a Service page. BreadcrumbList lets the agent place the page in site context.

Signals

jsonLdTypes includes a page-level type matching the actual content
BreadcrumbList present with correctly-ordered items

Rubric

5 Correct page-level type + BreadcrumbList + complete required properties

4 Correct type + BreadcrumbList minimal

3 Correct type, no breadcrumb

2 Generic WebPage where a more specific type clearly applies

1 No page-level type
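A BreadcrumbList with correctly ordered items could be generated along these lines. The helper name and the example.org trail are hypothetical; the structure (ListItem entries with position, name, and item) follows the schema.org vocabulary.

```python
import json


def breadcrumb_list(trail):
    """Build BreadcrumbList JSON-LD from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical site hierarchy, root first.
crumbs = breadcrumb_list([
    ("Home", "https://www.example.org/"),
    ("Services", "https://www.example.org/services/"),
    ("Immigration help", "https://www.example.org/services/immigration/"),
])
print(json.dumps(crumbs, indent=2))
```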

S2 — Heading hierarchy as content map

What it measures: Whether headings form a parseable outline an agent can use to navigate the page's content.

Why it matters: Headings are how agents map a page. Multiple H1s, skipped levels, or non-descriptive headings make extraction significantly harder. Headings with stable IDs are deep-linkable, which improves citation precision.

Signals

h1Count (single descriptive H1 expected)
Logical H2/H3 progression; no skipped levels
headingHierarchyIssue (multiple H1s, or H3s without H2s)
headingsWithId ratio

Rubric

5 Single descriptive H1, logical H2/H3 progression, headings have stable IDs

4 Single H1, good hierarchy, no IDs

3 Hierarchy mostly OK; some minor issues

2 Multiple H1s, or skipped levels

1 No headings, or chaotic hierarchy
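The heading signals can be derived from the order of heading tags alone. In the sketch below, a "skipped level" is taken to mean any step down the outline of more than one level (for example H1 followed directly by H3); that reading, and the function name, are assumptions.

```python
import re
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Record each heading's level and whether it carries an id."""

    def __init__(self):
        super().__init__()
        self.headings = []  # list of (level, has_id)

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.headings.append((int(tag[1]), bool(dict(attrs).get("id"))))


def heading_signals(html: str) -> dict:
    outline = HeadingOutline()
    outline.feed(html)
    levels = [level for level, _ in outline.headings]
    # Assumed definition: a jump of more than one level deeper is a skipped level.
    skipped = any(after - before > 1 for before, after in zip(levels, levels[1:]))
    with_id = sum(1 for _, has_id in outline.headings if has_id)
    return {
        "h1Count": levels.count(1),
        "headingHierarchyIssue": levels.count(1) > 1 or skipped,
        "headingsWithId": with_id / len(levels) if levels else 0.0,
    }


GOOD = '<h1 id="intake">Intake</h1><h2 id="who">Who qualifies</h2><h3>Documents</h3><h2>Hours</h2>'
BAD = "<h1>Welcome</h1><h1>News</h1><h3>Latest</h3>"

print(heading_signals(GOOD))  # {'h1Count': 1, 'headingHierarchyIssue': False, 'headingsWithId': 0.5}
print(heading_signals(BAD))   # {'h1Count': 2, 'headingHierarchyIssue': True, 'headingsWithId': 0.0}
```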

S3 — Semantic density

What it measures: Ratio of semantic landmarks (main, nav, header, footer, section, article, aside) to non-semantic div/span elements.

Why it matters: Div soup is harder to parse than semantic HTML. Agents that map pages by landmarks get nothing from a page that's all <div>. This is where MX and accessibility most clearly converge — semantic landmarks serve both screen readers and agents.

Signals

Semantic element count vs total div/span count
Ratio: semanticElements / (semanticElements + totalDivSpan)
Presence of all four major landmarks (main, nav, header, footer)
nav[aria-label] for distinguishing multiple navs

Rubric

5 Ratio > 0.05; all major landmarks present; navs labelled

4 Ratio 0.03–0.05; major landmarks present

3 Ratio 0.01–0.03

2 Ratio < 0.01 but some landmarks present

1 Div soup, no semantic landmarks
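One possible reading of this rubric as code is sketched below. The rubric's ranges share their endpoints (0.03 and 0.01 each appear in two rows), so the sketch assigns a boundary value to the higher score; it also assumes a page that meets the ratio for a level but not its landmark conditions drops to the next level. Both choices are assumptions, not part of the methodology.

```python
MAJOR_LANDMARKS = {"main", "nav", "header", "footer"}


def semantic_ratio(semantic_elements: int, total_div_span: int) -> float:
    """semanticElements / (semanticElements + totalDivSpan), or 0.0 for an empty page."""
    total = semantic_elements + total_div_span
    return semantic_elements / total if total else 0.0


def semantic_density_score(semantic_elements: int, total_div_span: int,
                           landmarks_present: set, navs_labelled: bool) -> int:
    """Map the S3 signals to a 1-5 score under the assumptions stated above."""
    ratio = semantic_ratio(semantic_elements, total_div_span)
    all_major = MAJOR_LANDMARKS <= set(landmarks_present)
    if not landmarks_present:
        return 1  # div soup, no semantic landmarks
    if ratio > 0.05 and all_major and navs_labelled:
        return 5
    if ratio >= 0.03 and all_major:
        return 4
    if ratio >= 0.01:
        return 3
    return 2  # ratio below 0.01 but some landmarks present


print(semantic_density_score(6, 94, MAJOR_LANDMARKS, True))  # 5
print(semantic_density_score(2, 98, {"main"}, False))        # 3
```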

Pillar 4

Currency

Whether the site's claims are still true. Whether dates, deadlines, eligibility windows, and service availability are machine-readable.

C1 — Machine-readable dates

What it measures: Whether content has datePublished and dateModified in JSON-LD, plus appropriate <time datetime> elements.

Why it matters: Agents need to decide whether to surface information or treat it as stale. Without machine-readable dates, they guess from prose ("Updated: January 2025") — which is unreliable — or assume worst-case staleness and skip the page.

Signals

datePublished in JSON-LD
dateModified in JSON-LD
<time datetime="..."> elements in HTML (ISO 8601 format)

Rubric

5 Both datePublished and dateModified in JSON-LD; recently updated relative to content type

4 Both dates in JSON-LD

3 <time datetime> elements present, but no JSON-LD dates

2 Dates in HTML prose only (not machine-readable)

1 No date signals
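To illustrate what machine-readable dates give an agent, the sketch below reads datePublished and dateModified from a JSON-LD object and computes the page's age. It assumes a single top-level JSON-LD object with ISO 8601 values; the sample article and the reference date are made up.

```python
import json
from datetime import date, datetime

# Hypothetical JSON-LD for an article page.
ARTICLE = """{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What to do after receiving a Notice to Appear",
  "datePublished": "2024-11-02",
  "dateModified": "2025-01-15"
}"""


def json_ld_dates(json_ld_text: str) -> dict:
    """Parse datePublished and dateModified; missing values come back as None."""
    data = json.loads(json_ld_text)
    dates = {}
    for key in ("datePublished", "dateModified"):
        value = data.get(key)
        # fromisoformat accepts "2025-01-15"; a trailing "Z" needs Python 3.11+.
        dates[key] = datetime.fromisoformat(value).date() if value else None
    return dates


def days_since_modified(json_ld_text: str, today: date):
    """Days since the last declared change, or None if no date is declared."""
    dates = json_ld_dates(json_ld_text)
    latest = dates["dateModified"] or dates["datePublished"]
    return (today - latest).days if latest else None


print(json_ld_dates(ARTICLE))
print(days_since_modified(ARTICLE, today=date(2025, 3, 1)))  # 45
```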

C2 — Time-sensitive markup

Very high public-interest stakes

What it measures: Whether deadlines, hours, event dates, and validity periods are machine-readable.

Why it matters: A Notice to Appear recipient needs deadline awareness. A clinic's walk-in hours, a registration deadline, a court date, an application window — all must be machine-readable or agents can't filter or alert appropriately. The same mechanism governs commercial time-sensitive content: a retailer's flash sale, a conference's early-bird deadline, a SaaS webinar registration. "We are open Tuesdays 9–12" in prose can't be matched by a query like "who can I see this Tuesday morning?" — the agent has nothing to filter on.

Signals

Event schema with startDate/endDate
validThrough on Offer / Service
OpeningHoursSpecification on LocalBusiness
eventStatus for cancelled/postponed events

Rubric

5 All time-sensitive content has appropriate schema

4 Most time-sensitive content marked up

3 Some time-sensitive content marked, some only in prose

2 Time information present but only in prose

1 No structured time signals on time-sensitive content

Skip-if: Page has no time-sensitive content (e.g., a static About page with no hours, dates, or deadlines).
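The "open Tuesdays 9–12" example shows the difference most directly: once the hours are declared as an OpeningHoursSpecification, a "Tuesday morning" query becomes a simple interval check. The clinic record and the matching function below are hypothetical; a real agent's matching logic is not documented here.

```python
from datetime import time

# Hypothetical clinic with "open Tuesdays 9-12" expressed as structured data.
clinic = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "name": "Example Walk-in Clinic",
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": "https://schema.org/Tuesday",
            "opens": "09:00",
            "closes": "12:00",
        }
    ],
}


def open_during(entity: dict, day: str, start: time, end: time) -> bool:
    """True if any declared opening window on `day` overlaps [start, end)."""
    for spec in entity.get("openingHoursSpecification", []):
        days = spec["dayOfWeek"]
        days = [days] if isinstance(days, str) else days
        if not any(d.rsplit("/", 1)[-1] == day for d in days):
            continue
        opens = time.fromisoformat(spec["opens"])
        closes = time.fromisoformat(spec["closes"])
        if opens < end and start < closes:
            return True
    return False


# "Who can I see this Tuesday morning?" treated as Tuesday, 08:00 to 12:00.
print(open_during(clinic, "Tuesday", time(8), time(12)))    # True
print(open_during(clinic, "Wednesday", time(8), time(12)))  # False
```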

C3 — Eligibility, cost, and service availability

Very high public-interest stakes

What it measures: Whether audience, cost, jurisdiction, and availability of services are declared in machine-readable form.

Why it matters: An agent filtering for "free immigration help in Maryland in Spanish" needs four facts: service type (I2), free/low-cost, jurisdiction, and language (R3 covers language; this criterion covers cost and jurisdiction). An agent filtering for "EV charging open after 10pm in Austin" or "gluten-free meal delivery serving Brooklyn" needs the same filterable structure — service type, jurisdiction, hours, and dietary or product attributes — declared in machine-readable form. Without this markup, the page can't be filtered on cost or jurisdiction — and ranks below pages that can be, in service-discovery queries.

Signals

audience with audienceType, geographicArea, eligibility qualifications
areaServed (city, state, county, jurisdiction)
isAccessibleForFree: true for free services; priceSpecification or price: 0; sliding-scale via priceRange
availableLanguage at service level (overlaps with R3)
OfferCatalog for orgs with multiple service offerings
Operational status: businessStatus (BusinessOperating, TemporarilyClosed, PermanentlyClosed) or Offer.availability (InStock, LimitedAvailability, SoldOut)

Rubric

5 audience, areaServed, isAccessibleForFree (or price), availableLanguage, and operational status (businessStatus or Offer.availability) all declared in schema

4 Four of the five declared

3 Two or three declared

2 One declared, or all in prose only

1 None declared

Skip-if: Page is purely informational with no service offered (e.g., a glossary, a news article).
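The C3 rubric counts five declared facets, which makes it straightforward to sketch. The grouping of property names into facets follows the signals above; looking only at top-level keys of one service object, and passing the "all in prose only" case as a flag, are simplifying assumptions.

```python
# Facet -> property names that count as declaring it (from the signals above).
FACETS = {
    "audience": ("audience",),
    "areaServed": ("areaServed",),
    "cost": ("isAccessibleForFree", "priceSpecification", "price", "priceRange"),
    "availableLanguage": ("availableLanguage",),
    "operational status": ("businessStatus", "availability"),
}


def declared_facets(service: dict) -> list:
    """Names of the facets declared as top-level properties of the service object."""
    return [name for name, keys in FACETS.items() if any(k in service for k in keys)]


def c3_score(service: dict, all_in_prose_only: bool = False) -> int:
    """Map the number of declared facets to the 1-5 rubric."""
    count = len(declared_facets(service))
    if count == 5:
        return 5
    if count == 4:
        return 4
    if count in (2, 3):
        return 3
    if count == 1 or all_in_prose_only:
        return 2
    return 1


# Hypothetical service record with three of the five facets declared.
service = {
    "@type": "LegalService",
    "areaServed": "Maryland",
    "isAccessibleForFree": True,
    "availableLanguage": ["en", "es"],
}
print(declared_facets(service))  # ['areaServed', 'cost', 'availableLanguage']
print(c3_score(service))         # 3
```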

Action completion

Conditional

Scored only when the audited page contains an actionable form or CTA path: donate, volunteer, intake, appointment booking, registration, purchase, sign-up, request representation. Skipped on pages with no actionable path. Action-completion scores appear as a separate section in the report — not inside the four-pillar MLI score.

A1 — Action flow clarity

Multi-step flows declare current step (aria-current="step"); progress indicators are semantic; total step count is communicable to an agent navigating the flow.

A2 — Confirmation state semantics

Confirmation states are machine-readable: post-submission redirects to a clearly-typed confirmation URL, or the confirmation state has structured-data markup, or role="status" / aria-live="polite" regions announce completion.

A3 — Action predictability

The consequence of submitting an action is communicated semantically before the user clicks. Button labels are unique and contextual (aria-label="Donate $50 monthly to Casa Maryland", not "Submit"), and aria-describedby links action buttons to the consequence text.

Accessibility companion

Reported separately

Audited but not included in the MLI score. Accessibility (WCAG conformance) and machine legibility overlap substantially but answer different questions. A site can be highly machine-legible and partially inaccessible, or vice versa. Reporting them separately gives two diagnostics that can be acted on independently.

The companion audits patterns that overlap with WCAG 2.1/2.2 AA and are not already covered by S2 and S3:

Form label association (programmatic <label for> ↔ <input id>)
Form input type correctness (type="email", type="tel", type="date")
Form error states (aria-invalid, role="alert")
Button label uniqueness and descriptiveness
Native interactive elements (<button> over <div onclick>)
Modal handling (role="dialog", aria-modal, focus trap)
Tab/menu patterns (role="tablist", aria-controls, aria-expanded)
Loading and dynamic state announcements (aria-busy, aria-live)
Disabled state semantics (aria-disabled, with reason where applicable)
Empty and zero states (announced to assistive tech)

Findings rule

Early versions mapped score thresholds to priority bands automatically: every criterion scoring 1 became a P1, every 2 a P2. A low-scoring site produced 18–28 findings: overwhelming, undifferentiated, not actionable. A diagnostic tool loses credibility one false positive at a time, so findings are now curated and capped.

How findings are generated

Identify all criteria scored 1, 2, or 3.
Rank by score impact: how much would the total MLI move if this criterion were brought to 5?
Take the top 5 to 7 findings. Never more than 7.
If two findings prescribe the same fix, merge them.

Finding format

Pillar: Identity / Reachability / Structure / Currency / Action

Criterion: Canonical name, e.g., I2 — Service-type specificity

Gap: What's missing, referencing actual signals from the audited page

What this means: One sentence translating the technical finding into who is hurt by the gap and how — the layer that connects a missing schema field to a user the agent fails to serve

Fix: Concrete HTML / schema.org / robots.txt pattern, with example markup

Effort: Low / Medium / High

Report structure

The MLI audit produces a short report — three pages or equivalent — for readability. Comprehensive audit data is available in an optional appendix.

Verdict — total MLI score (0–100), band, one-sentence positioning statement.
Pillar breakdown — the four pillar scores (each 0–100), with criteria listed and individually scored.
Top fixes — 5 to 7 curated findings, ordered by score impact.
Action completion — only if applicable; three criteria scored separately.
Accessibility companion — issue count and pointer to a detailed list, separate from the MLI score.
Methodology footer — link to this document, version date, attribution.

Evidence basis

The mechanistic claims in MLI — that AI agents read structured data, that schema.org subtypes enable filtering, that hreflang declarations outperform language inference, that semantic HTML serves both screen readers and agents — are grounded in primary specifications (W3C, schema.org, crawler-operator documentation) and peer-reviewed work on AI/accessibility overlap.

Quantitative magnitude claims — how much more often schema-marked pages are cited, what percentage of single-page applications fail to render for AI crawlers — are an active area of practitioner research, not yet peer-reviewed. Most published figures originate with SEO agencies, structured-data vendors, accessibility-audit firms, or pre-rendering SaaS — sources with direct commercial interest in the finding. MLI does not stake itself to specific multipliers from studies whose authors sell the corresponding fixes.