Immigration Legal Aid Case Study

The stakes

A Notice to Appear (NTA) is the charging document the U.S. government issues to initiate removal proceedings against a noncitizen. It sets a court date and formally places someone in the immigration system — often with little warning, and often without explanation of what it means or what options exist.

Immigration courts are backlogged by millions of pending cases. Hearings are regularly scheduled years out, but time-sensitive decisions — finding a lawyer, understanding rights, identifying which organizations serve your area and case type — must be made immediately. Most NTA recipients speak Spanish as a first language. Most cannot afford a private attorney. And unlike in criminal proceedings, there is no right to appointed counsel in immigration court: people face removal proceedings alone unless they can find representation themselves.

AI assistants are increasingly the first tool people reach for in a crisis — asking ChatGPT or Google a question on their phone. For an immigrant who just received an NTA, that question might be: "¿Dónde puedo conseguir ayuda legal gratis en Chicago?" What gets surfaced in the answer is not a neutral result. It reflects whose content is machine-readable, whose services are declared in structured data, and whose voice the agent can find.

In progress — copy editing

Documented cases and statistics to be added here. This section will include recent reported cases illustrating the NTA situation — scale of issuance, outcomes without legal representation, language access failures, and the gap between available resources and people who can find them.

About ICIRR

The Illinois Coalition for Immigrant and Refugee Rights (ICIRR) is one of the largest immigrant advocacy coalitions in the Midwest. Its Family Support Network (FSN) page is a central referral point for Illinois residents navigating immigration proceedings — linking to three PDFs that together cover 150+ organizations: a nonprofit matrix by case type and geography, an 80+ profile directory with intake hours and languages, and a private attorney referral list of ~90 firms.

When a Spanish-speaking resident asks an AI assistant for help during an ICE encounter, ChatGPT returns the ICIRR hotline number accurately. But ICIRR doesn't own that answer — AI cites Chicago.gov and a food bank coalition, not icirr.org. This audit goes one layer deeper: can AI access the actual resource layer — the PDFs — that contain the help people need?

The three-layer test

Three sequential questions — each a prerequisite for the next.

Layer 1 (Pass): Does AI find the right org?

ChatGPT returns the ICIRR hotline (855-435-7693) accurately in English and Spanish.

Layer 2 (Fail): Does ICIRR control that answer?

AI cites Chicago.gov and a food bank coalition, not icirr.org. ICIRR has no structured signal and no mechanism to correct errors if the citation becomes wrong.

Layer 3 (Partial): Can AI access the resource PDFs?

Browsing agents can follow links from icirr.org/fsn. Answer engines (ChatGPT, Google, Perplexity) cannot, because the content is not indexed anywhere.

The three resource PDFs

ICIRR's FSN page links to three PDFs containing the actual help people need. The crisis use case in each row is the query a resident might ask an AI assistant in the hours after receiving an NTA.

| Resource | What it contains | Crisis use case |
| --- | --- | --- |
| PDF 1: Nonprofit matrix | ~60 Illinois nonprofits by case type (family visa, DACA, deportation defense) and geography. Revised December 2025. | "Which nonprofit near me handles deportation defense?" |
| PDF 2: Detailed directory | 82-page profile directory: each nonprofit's address, intake hours, languages, fees, and which forms they handle. Fall 2025. | "Can I call somewhere right now? Do they take phone intake in Spanish?" |
| PDF 3: Attorney list | ~90 private immigration firms: address, phone, languages, specialties. Updated May 2025. | "I need a lawyer who speaks Spanish and handles detention." |

Can AI find these links?

Before an agent can read the PDFs, it must first locate them. All three links follow this pattern:

https://e2a3fb6c-c4a7-47af…filesusr.com/ugd/9781a6_[opaque-hash].pdf

These URLs point to Wix's file CDN. The domain carries no identity for ICIRR, and the filenames are opaque hashes. The linking page has no schema.org markup: no DigitalDocument type and no LegalService type.
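The markup check can be approximated offline. The following is a minimal sketch, assuming the page HTML is already available as a string. The `JsonLdTypes` class, the `declared_types` helper, and both sample pages are hypothetical stand-ins for illustration, not ICIRR's actual markup, and the sketch ignores nested `@graph` structures.

```python
import json
from html.parser import HTMLParser


class JsonLdTypes(HTMLParser):
    """Collect every schema.org @type declared in JSON-LD script blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.types = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._in_jsonld:
            return
        self._in_jsonld = False
        try:
            doc = json.loads("".join(self._chunks))
        except json.JSONDecodeError:
            return  # malformed JSON-LD is treated as absent
        for item in doc if isinstance(doc, list) else [doc]:
            declared = item.get("@type", []) if isinstance(item, dict) else []
            self.types.extend([declared] if isinstance(declared, str) else declared)


def declared_types(html):
    """Return the list of @type values found in the page's JSON-LD."""
    parser = JsonLdTypes()
    parser.feed(html)
    parser.close()
    return parser.types


# Hypothetical page: a PDF link on a CDN, with no structured data at all.
SAMPLE_PAGE = '<html><body><a href="https://cdn.example/ugd/abc123.pdf">Directory</a></body></html>'

# Hypothetical page that does declare a LegalService.
MARKED_UP_PAGE = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "LegalService", "name": "Example Clinic"}'
    "</script></head><body></body></html>"
)

wanted = {"DigitalDocument", "LegalService"}
print(wanted & set(declared_types(SAMPLE_PAGE)))
print(wanted & set(declared_types(MARKED_UP_PAGE)))
```

An empty intersection for a page is the "no structured signal" condition described here: nothing on the page tells a machine what the linked files are or who publishes them.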

Browsing agents (Partial): can follow links

Claude in Chrome and ChatGPT with browsing can visit the FSN page, read anchor text, and follow the link. Access is possible, but only if the agent is already at icirr.org/fsn.

Answer engines (Blocked): cannot cite content

ChatGPT, Google AI Overview, and Perplexity cite from indexed training data. The filesusr.com CDN is not crawlable. None of this content exists in any AI training set.

Can AI read the PDFs once it finds them?

All three PDFs were fetched and text-extracted directly, simulating what a browsing agent does after following a link. Results differ significantly across the three documents.

PDF 1: Nonprofit matrix (Partial)

Text extractable ✓ Yes

Phone numbers ✓ All readable

Table structure ~ Columns lost in extraction

Schema markup ✗ None

Spanish version ✗ None

Works for simple queries (phone number, org name). Fails for nuanced queries — find an org that handles both DACA and asylum in the suburbs — because the matrix column structure is lost in extraction.
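One way to picture the column loss is sketched below. It assumes the extractor emits only the non-empty cells of each row, which is a simplification of how extraction can fail rather than a trace of what happened to PDF 1. The org names, column headers, and cell values are invented for illustration.

```python
# Hypothetical slice of a case-type matrix; org names and cells are invented.
COLUMNS = ["Family visa", "DACA", "Deportation defense"]
MATRIX = {
    "Org A": [True, False, True],
    "Org B": [False, True, True],
}


def flatten(name, cells):
    """Mimic an extractor that emits only non-empty cells, left to right."""
    return " ".join([name] + ["X" for marked in cells if marked])


def typed(name, cells):
    """Keep the column header attached to every mark."""
    return {
        "name": name,
        "serviceType": [col for col, marked in zip(COLUMNS, cells) if marked],
    }


flat = [flatten(name, cells) for name, cells in MATRIX.items()]
structured = [typed(name, cells) for name, cells in MATRIX.items()]

# Both rows flatten to "<name> X X": the marks survive, the columns do not.
print(flat)

# With headers attached, a case-type query is a simple filter.
print([org["name"] for org in structured if "DACA" in org["serviceType"]])
```

In the flattened form the two orgs look identical even though only one of them handles DACA, which is why a simple lookup still works but a multi-criteria query does not.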

PDF 2: Detailed org directory (highest value, most invisible)

Text extractable ✓ All fields clean

Profile structure ✓ Field labels intact

Intake hours + method ✓ Phone / email / walk-in

Languages spoken ✓ Per org

Schema markup ✗ None

Visible to answer engines ✗ Not indexed

Critical finding. PDF 2 contains exactly what AI needs to connect a resident with help: call this number, they speak Spanish, intake is free, open Monday–Friday 9–5. Structured, accurate as of Fall 2025, and completely inaccessible to AI answer engines.

PDF 3: Private attorney referral list (Partial)

Firm names, phones, addresses ✓ Yes

Specialty data ~ Free-text, inconsistent

Schema markup ✗ None

Visible to answer engines ✗ Not indexed

Readable by a browsing agent after following the link. But specialty data is free-text: one entry says "Spanish; detention", another lists only a URL. AI cannot reliably filter the ~90 firms by specialty. Invisible to answer engines.
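The filtering problem can be shown with a naive keyword match. This is a sketch under stated assumptions: the first two notes echo the entry styles described here, while the firm names and the third entry are hypothetical and do not come from the attorney list.

```python
# Firm names and the third entry are hypothetical; the first two notes mirror
# the entry styles described in the text.
ATTORNEYS = [
    {"firm": "Firm A", "notes": "Spanish; detention"},
    {"firm": "Firm B", "notes": "https://firm-b.example"},
    {"firm": "Firm C", "notes": "Se habla español. Bond hearings."},
]


def keyword_filter(entries, *keywords):
    """Return firms whose free-text notes contain every keyword."""
    return [
        entry["firm"]
        for entry in entries
        if all(word.lower() in entry["notes"].lower() for word in keywords)
    ]


print(keyword_filter(ATTORNEYS, "Spanish", "detention"))
```

Only Firm A matches. Firm B's specialties sit behind a URL, and Firm C describes similar work in different words, so a filter over free text silently drops both.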

The gap isn't the content — it's the container

The central finding is not that the PDFs are unreadable. A browsing agent can read all three. The finding is that storing this data as PDFs on a third-party CDN makes the entire knowledge layer invisible to the AI systems most likely to be used in a crisis: the assistants people open on their phones to ask where to call.

ICIRR has done the hard work: 150+ organizations curated, profiles maintained, attorney list updated quarterly. The knowledge infrastructure exists. The problem is the format — a container AI cannot index, cannot cite, and cannot surface to the people who need it.

| Format | Browsing agent | Answer engine |
| --- | --- | --- |
| PDF on third-party CDN | Partial: after following link | No: not indexed |
| Unstructured HTML text | Yes | Partial: text only, untyped |
| HTML with schema.org JSON-LD | Yes | Yes: citable, typed, attributed |

What the fix looks like

Every field in PDF 2's org profiles maps directly to schema.org LegalService + ContactPoint + OpeningHoursSpecification. This is not a redesign — it is adding a structured data layer to content that already exists.

{
  "@context": "https://schema.org",
  "@type": "LegalService",
  "name": "Beyond Legal Aid",
  "url": "https://beyondlegalaid.org",
  "telephone": "+1-312-999-0056",
  "availableLanguage": ["Spanish", "Arabic", "South Indian languages"],
  "priceRange": "Free",
  "isAccessibleForFree": true,
  "serviceType": ["Deportation Defense", "Asylum", "DACA", "Removal Defense"],
  "areaServed": { "@type": "State", "name": "Illinois" },
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "intake",
    "telephone": "+1-872-267-2252",
    "availableLanguage": ["Spanish", "Arabic"]
  },
  "openingHoursSpecification": {
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday","Tuesday","Wednesday","Thursday","Friday"],
    "opens": "09:00",
    "closes": "17:00"
  }
}
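The mapping from a directory profile to a record like the one above could be automated along these lines. This is a minimal sketch: the input field names (`name`, `phone`, `languages`, `services`, `fee`, `days`, `opens`, `closes`) are assumptions about how a profile might be transcribed, and the example profile is invented rather than taken from PDF 2.

```python
import json


def profile_to_jsonld(profile):
    """Map one directory profile (hypothetical field names) to a LegalService record."""
    return {
        "@context": "https://schema.org",
        "@type": "LegalService",
        "name": profile["name"],
        "telephone": profile["phone"],
        "availableLanguage": profile["languages"],
        "serviceType": profile["services"],
        # Treat a fee field of exactly "Free" (any case) as free access.
        "isAccessibleForFree": profile["fee"].strip().lower() == "free",
        "openingHoursSpecification": {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": profile["days"],
            "opens": profile["opens"],
            "closes": profile["closes"],
        },
    }


# Invented profile for illustration; not an entry from the directory.
example_profile = {
    "name": "Example Legal Clinic",
    "phone": "+1-312-555-0100",
    "languages": ["Spanish", "English"],
    "services": ["Deportation Defense", "DACA"],
    "fee": "Free",
    "days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "17:00",
}

record = profile_to_jsonld(example_profile)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

The output would be embedded in a `script` element of type `application/ld+json` on each organization's page. Profiles with irregular hours or fee schedules would need more handling than this sketch provides.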

Converting PDF 2's 80+ profiles to this format makes them citable by any AI system, filterable by language and service type, and accessible to a Spanish-speaking resident on their phone — without a browsing agent, without knowing icirr.org exists. Phase 1 (top 20 orgs) is estimated at under 40 development hours.
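To illustrate what "filterable by language and service type" means in practice, here is a short sketch. The first record reuses fields from the example in this section; the second record and the `find_services` helper are hypothetical.

```python
def find_services(records, language, service_type):
    """Return records that offer the given service in the given language."""
    language, service_type = language.lower(), service_type.lower()
    return [
        record
        for record in records
        if language in (l.lower() for l in record.get("availableLanguage", []))
        and service_type in (s.lower() for s in record.get("serviceType", []))
    ]


RECORDS = [
    {
        "@type": "LegalService",
        "name": "Beyond Legal Aid",
        "telephone": "+1-312-999-0056",
        "availableLanguage": ["Spanish", "Arabic", "South Indian languages"],
        "serviceType": ["Deportation Defense", "Asylum", "DACA", "Removal Defense"],
    },
    {
        # Hypothetical second record, for contrast.
        "@type": "LegalService",
        "name": "Example Family Visa Clinic",
        "telephone": "+1-312-555-0100",
        "availableLanguage": ["English"],
        "serviceType": ["Family Visa"],
    },
]

hits = find_services(RECORDS, "Spanish", "Deportation Defense")
print([(record["name"], record["telephone"]) for record in hits])
```

Because language and service are typed fields rather than free text, the query is an exact match on declared values instead of a guess based on wording.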

Summary of findings

Partial: ICIRR hotline surfaces in AI results, but not from icirr.org.

ChatGPT returns the right number, sourced from Chicago.gov and a food bank coalition. ICIRR has no structured signal and no mechanism to correct that citation if it becomes wrong.

Partial: All three PDFs are readable by a browsing agent after following the link.

The help chain works, barely, if someone is already using a browsing-capable AI and already knows to visit icirr.org/fsn. That is a very narrow success condition.

Fail: None of the PDF content is visible to AI answer engines.

ChatGPT, Google AI Overview, and Perplexity have never seen this content. A resident asking "where do I call right now?" will not receive an answer sourced from these resources.

Fail: PDF 2 (the detailed directory) is the most valuable and the most invisible.

It contains exactly what AI needs: intake hours, languages, phone numbers, fee structures. Structured, accurate, completely inaccessible to the AI systems most used in a crisis.

Pass: The fix is well-defined and medium effort.

LegalService + ContactPoint + OpeningHoursSpecification schema.org markup on web pages. Phase 1 (top 20 orgs) estimated under 40 dev hours.

Audit status

The findings above are based on static analysis — fetching and extracting the PDFs directly, and inspecting the FSN page source. Two live-agent tests remain before this audit is complete.

Open question 1: Do the PDF links appear in the live DOM, or are they behind a JavaScript interaction that blocks agents?

How to answer: Use a Chrome extension or a browsing agent to inspect the live FSN page. Confirm the links are present in the initial DOM, not loaded by JavaScript on scroll or click.

Success looks like: The links are present and followable in the live DOM. If they are JS-gated, the "partial access" finding for browsing agents becomes a full failure.
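A static first pass can narrow this question before the live test. The sketch below, under stated assumptions, looks for PDF anchors in raw HTML as served, before any JavaScript runs. It complements the live-DOM inspection rather than replacing it. The `PdfLinks` class, the `static_pdf_links` helper, and both sample pages are hypothetical. In practice the HTML string would come from a plain HTTP fetch of the FSN page.

```python
from html.parser import HTMLParser


class PdfLinks(HTMLParser):
    """Collect href values of anchors that point at .pdf files."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        # Ignore any query string when checking the file extension.
        if tag == "a" and href.lower().split("?")[0].endswith(".pdf"):
            self.links.append(href)


def static_pdf_links(html):
    """Return PDF links present in the HTML without executing any script."""
    parser = PdfLinks()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page where the link is in the served markup.
STATIC_PAGE = '<a href="https://cdn.example/ugd/abc123.pdf">Nonprofit matrix</a>'

# Hypothetical page where links would only appear after a script runs.
JS_GATED_PAGE = '<div id="resources"></div><script>loadLinksOnClick("resources")</script>'

print(static_pdf_links(STATIC_PAGE))
print(static_pdf_links(JS_GATED_PAGE))
```

An empty result on the served HTML does not prove the links are unreachable, since a browsing agent may execute the script, but it flags the page for the live inspection described here.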

Open question 2: When a browsing agent follows the link to PDF 2, can it return a specific org's phone number for a Spanish-language deportation defense query?

How to answer: Task a browsing agent: "Find me a free org in Chicago that does deportation defense and takes phone intake in Spanish." Record whether it follows the PDF link and returns usable data.

Success looks like: The agent returns a name and phone number and confirms Spanish intake, sourced from PDF 2. A recorded failure is the proof point that makes this story undeniable.

Run the same audit on your site

The MLI framework and criteria are open. If you run public-interest services — legal aid, housing, health, civic access — this audit pattern applies directly to your resource pages.

