JusticeBench

What It Does

At many legal aid groups, intake is the bottleneck. The Legal Aid Society of San Bernardino (LASSB) aims to serve 2.2 million county residents. Phone lines are open Monday through Friday, 8 AM to 5 PM, with a one-hour lunch break. A successful intake takes 40–70 minutes: a 10–15 minute prescreen by an intake worker, a 2–10 minute conflict check (caller on hold), then a 20–30 minute detailed interview by a paralegal, followed by manual data entry into LegalServer. Callers in crisis wait on hold. Shift workers, parents, and people in unsafe situations can't call during business hours. Paralegals spend their days on back-to-back calls asking the same 114 questions while typing into LegalServer in real time.

The AI Voice Intake System replaces the hold time and repetitive questioning. A caller dials in and reaches an AI assistant immediately. The system conducts the entire intake in a single phone call: language selection, eligibility screening, prescreen interview (32 questions), and detailed intake (11 sections covering finances, housing, legal issues, adverse parties, and more). As the conversation proceeds, structured data is extracted and auto-populated into intake forms visible to staff on a real-time admin dashboard. When the call is done, an attorney reviews a complete, structured intake and calls the client back.

How the Call Works

Language selection. The system greets callers in English and Spanish. Callers choose by speaking or pressing a keypad number.

Session management. Each session gets a unique 4-digit reference number. If the call drops, the caller calls back and resumes where they left off.
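A minimal sketch of how a 4-digit reference number could be issued and used to resume a dropped call. The SessionStore class, its field names, and the in-memory Map are assumptions for illustration; the write-up only states that sessions get a unique 4-digit number and that data lives in PostgreSQL.

```typescript
// Hypothetical in-memory store; a production system would persist sessions.
interface IntakeSession {
  reference: string; // 4-digit reference number, zero-padded
  phase: string; // e.g. "eligibility", "prescreen", "detailed"
  lastQuestionId: string | null;
}

class SessionStore {
  private sessions = new Map<string, IntakeSession>();
  private random: () => number;

  // The random source is injectable so the behavior can be tested.
  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  create(): IntakeSession {
    if (this.sessions.size >= 10000) {
      throw new Error("All 4-digit reference numbers are in use");
    }
    let reference = "";
    do {
      const n = Math.floor(this.random() * 10000);
      reference = ("0000" + n).slice(-4);
    } while (this.sessions.has(reference)); // retry on collision
    const session: IntakeSession = { reference, phase: "eligibility", lastQuestionId: null };
    this.sessions.set(reference, session);
    return session;
  }

  // Returns the stored session so the call can pick up where it left off.
  resume(reference: string): IntakeSession | undefined {
    return this.sessions.get(reference.trim());
  }
}
```

With only 10,000 possible values, a sketch like this would need expired sessions to be retired so numbers can be reused; how the real system handles that is not described.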

Opening disclaimer. A static script explains: this is an AI, not a lawyer. It cannot give legal advice. LASSB staff will review all information. The caller can request a human at any time.

Eligibility gate. The AI collects name, zip code, household size, income, and adverse party name, running checks as data comes in. Service area is verified against the full list of San Bernardino County zip codes. Income is checked against 125% of Federal Poverty Guidelines. If either check fails, the caller is informed and offered an appeal option (attorney callback). No caller is permanently denied without human review.
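The two checks in the eligibility gate can be sketched as follows. All names here (checkEligibility, PovertyGuideline, the field names) are hypothetical, and the guideline amounts are passed in as parameters because the Federal Poverty Guidelines change annually. Treating income exactly at the limit as eligible is also an assumption.

```typescript
interface PovertyGuideline {
  baseAnnual: number; // guideline for a one-person household
  perAdditionalPerson: number; // increment for each additional member
}

interface EligibilityInput {
  zipCode: string;
  householdSize: number;
  annualIncome: number;
}

type EligibilityResult =
  | { eligible: true }
  | { eligible: false; reason: "service_area" | "income"; appealOffered: true };

// 125% of the guideline for the given household size.
function incomeLimit(householdSize: number, g: PovertyGuideline): number {
  const extraPeople = Math.max(0, householdSize - 1);
  return 1.25 * (g.baseAnnual + g.perAdditionalPerson * extraPeople);
}

function checkEligibility(
  input: EligibilityInput,
  serviceZips: Set<string>,
  guideline: PovertyGuideline
): EligibilityResult {
  if (!serviceZips.has(input.zipCode.trim())) {
    return { eligible: false, reason: "service_area", appealOffered: true };
  }
  if (input.annualIncome > incomeLimit(input.householdSize, guideline)) {
    return { eligible: false, reason: "income", appealOffered: true };
  }
  return { eligible: true };
}

// Illustrative values only: round placeholder amounts, not official figures,
// and a one-entry zip list standing in for the full county list.
const sampleGuideline: PovertyGuideline = { baseAnnual: 15000, perAdditionalPerson: 5000 };
const sampleZips = new Set<string>(["92401"]);
const sampleResult = checkEligibility(
  { zipCode: "92401", householdSize: 3, annualIncome: 30000 },
  sampleZips,
  sampleGuideline
);
```

Every failing branch carries appealOffered: true, which mirrors the rule that no caller is denied without the option of human review.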

Conflict check. When the adverse party's name matches LASSB's conflict database, the caller is placed on hold (smooth jazz). Two things happen simultaneously: a notification appears in the admin dashboard and an email goes to the attorney. The attorney clicks "No Conflict" (call continues) or "Confirmed" (caller offered callback). If no attorney responds within two minutes, the system defaults to a callback offer.
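One way to express the timeout-to-callback default is to race the attorney's decision against a timer. This is a sketch under assumptions: resolveConflictHold and the type names are invented here, and the attorney's dashboard click is modeled as a Promise that the caller of this function supplies.

```typescript
type ConflictDecision = "no_conflict" | "confirmed";
type ConflictOutcome = "continue_call" | "offer_callback";

// Resolves with the next step for the call. If the attorney does not decide
// within timeoutMs (two minutes by default), the caller is offered a callback.
function resolveConflictHold(
  attorneyDecision: Promise<ConflictDecision>,
  timeoutMs: number = 2 * 60 * 1000
): Promise<ConflictOutcome> {
  return new Promise<ConflictOutcome>((resolve) => {
    const timer = setTimeout(() => resolve("offer_callback"), timeoutMs);
    attorneyDecision.then(
      (decision) => {
        clearTimeout(timer);
        resolve(decision === "no_conflict" ? "continue_call" : "offer_callback");
      },
      () => {
        // If the notification channel fails, fall back to the safe default.
        clearTimeout(timer);
        resolve("offer_callback");
      }
    );
  });
}
```

Only the first resolve call takes effect, so a late click after the timeout cannot change the outcome the caller has already heard.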

Prescreen interview. 32 questions covering emergency status, demographics, disability, contact information, address, attorney information, legal issue description, court case numbers, and deadlines.

Detailed intake. Automatic transition from prescreen on the same call, with a warning that this section may take up to an hour. Eleven sections: non-adverse parties, financial details, community impact, demographics, housing/tenancy, survey consent, legal issue classification, landlord-tenant matter details, notice details and eviction risk, rental housing unit information, and wrap-up.

Call completion. The caller is informed that an attorney will follow up, reminded of their reference number, and thanked.

The Tech Stack

The system runs on Replit with Twilio for phone handling, Claude Haiku (claude-haiku-4-5) for conversation, OpenAI gpt-4o-mini-transcribe for speech-to-text, and Google Chirp3-HD for natural text-to-speech voices in English and Spanish. The frontend (React + Vite + TailwindCSS) serves the admin dashboard. The backend (Express.js with TypeScript) handles call flow logic. Data lives in PostgreSQL (Neon-backed via Replit).

Estimated cost per call (45-minute average): $1.50–2.50, covering Twilio minutes, Claude Haiku API, and OpenAI STT.

Key Technical Features

Real-time admin dashboard. Staff see a live view of all intake sessions: caller name, reference number, status, phase. Clicking a session reveals collapsible form sections with completion indicators, auto-filled fields populating as the AI extracts data, and a chat-bubble transcript panel showing the full conversation.

Context-aware data extraction. A multi-layer pipeline extracts structured data from the conversation. Primary extraction pulls field/value pairs from the AI's responses. A context-aware fallback includes the previous AI question so short answers ("No" in response to "Are you a veteran?") map correctly. Cross-population rules auto-fill related fields (gross monthly income populates income amount, frequency, and annual total; city populates county, state, and county of dispute).
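The fallback and cross-population steps might look roughly like this. The field names, the question registry, and the city lookup are all hypothetical stand-ins; the actual pipeline is not shown in this write-up.

```typescript
type FieldMap = Record<string, string>;

// Hypothetical registry mapping a question's text to the field it fills.
const QUESTION_FIELDS = new Map<string, string>([["Are you a veteran?", "veteran"]]);
const YES_WORDS = new Set(["yes", "yeah", "yep", "sí", "si"]);
const NO_WORDS = new Set(["no", "nope", "nah"]);

// Context-aware fallback: a bare "No" only makes sense next to the question asked.
function extractWithContext(previousQuestion: string | null, answer: string): FieldMap {
  const field = previousQuestion === null ? undefined : QUESTION_FIELDS.get(previousQuestion);
  if (field === undefined) return {};
  const word = answer.trim().toLowerCase().replace(/[.,!]+$/, "");
  if (YES_WORDS.has(word)) return { [field]: "Yes" };
  if (NO_WORDS.has(word)) return { [field]: "No" };
  return {};
}

// Sample lookup with a single entry; a real table would cover the service area.
const CITY_TO_COUNTY = new Map<string, string>([["fontana", "San Bernardino"]]);

function setIfMissing(fields: FieldMap, key: string, value: string): void {
  if (!Object.prototype.hasOwnProperty.call(fields, key)) fields[key] = value;
}

// Cross-population: derive related fields without overwriting explicit answers.
function crossPopulate(fields: FieldMap): FieldMap {
  const out: FieldMap = { ...fields };
  const rawIncome = out["grossMonthlyIncome"];
  if (rawIncome !== undefined && rawIncome.trim() !== "" && Number.isFinite(Number(rawIncome))) {
    const monthly = Number(rawIncome);
    setIfMissing(out, "incomeAmount", String(monthly));
    setIfMissing(out, "incomeFrequency", "Monthly");
    setIfMissing(out, "annualIncome", String(monthly * 12));
  }
  const city = out["city"];
  const county = city === undefined ? undefined : CITY_TO_COUNTY.get(city.trim().toLowerCase());
  if (county !== undefined) {
    setIfMissing(out, "county", county);
    setIfMissing(out, "state", "CA");
    setIfMissing(out, "countyOfDispute", county);
  }
  return out;
}
```

In this sketch derived values never overwrite a field the caller answered directly, which is one reasonable policy but not one the source confirms.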

Comprehensive dropdown normalization. AI-extracted values are mapped to the exact options LegalServer expects. Field-specific synonym maps handle race/ethnicity, SSN status, citizenship, property type, landlord type, income frequency, and 40+ boolean fields. This normalization runs after extraction and before database save.
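As an illustration of field-specific synonym maps, the sketch below normalizes a free-form value to a fixed option. The field names and option labels are invented for the example and are not LegalServer's actual dropdown values.

```typescript
// Hypothetical synonym maps keyed by field; values are the canonical options.
const DROPDOWN_SYNONYMS = new Map<string, Map<string, string>>([
  ["incomeFrequency", new Map<string, string>([
    ["weekly", "Weekly"],
    ["every week", "Weekly"],
    ["biweekly", "Biweekly"],
    ["every two weeks", "Biweekly"],
    ["monthly", "Monthly"],
    ["once a month", "Monthly"],
  ])],
  ["propertyType", new Map<string, string>([
    ["apartment", "Apartment"],
    ["apt", "Apartment"],
    ["mobile home", "Mobile Home"],
    ["trailer", "Mobile Home"],
  ])],
]);

const BOOLEAN_FIELDS = new Set(["veteran", "hasDisability"]);
const TRUE_WORDS = new Set(["yes", "yeah", "yep", "true", "correct", "sí", "si"]);
const FALSE_WORDS = new Set(["no", "nope", "nah", "false"]);

// Returns the canonical option, or null when the value cannot be mapped
// so that staff can review it instead of saving a guess.
function normalizeDropdown(field: string, raw: string): string | null {
  const key = raw.trim().toLowerCase().replace(/\s+/g, " ");
  if (BOOLEAN_FIELDS.has(field)) {
    if (TRUE_WORDS.has(key)) return "Yes";
    if (FALSE_WORDS.has(key)) return "No";
    return null;
  }
  const options = DROPDOWN_SYNONYMS.get(field);
  if (options === undefined) return raw.trim(); // free-text field: pass through
  return options.get(key) ?? null;
}
```

For example, normalizeDropdown("propertyType", "trailer") yields "Mobile Home", while an unmapped value such as "castle" yields null.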

Client name protection. A code-level block prevents adverse party names, attorney names, or other mentioned names from overwriting the caller's identity during prescreen and detailed intake. This is critical for conflict checks.
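A code-level block of this kind could be as small as a guard in the function that merges extracted fields into the session record. The sketch assumes the caller's name is captured during the eligibility gate and locked afterward; the field names and mergeExtraction are hypothetical.

```typescript
type Phase = "eligibility" | "prescreen" | "detailed";

// Hypothetical names for the fields that identify the caller.
const CALLER_IDENTITY_FIELDS = new Set(["callerFirstName", "callerLastName"]);

// Merges newly extracted values into the record, but never lets a later
// phase overwrite an identity field that is already set.
function mergeExtraction(
  record: Record<string, string>,
  extracted: Record<string, string>,
  phase: Phase
): Record<string, string> {
  const merged: Record<string, string> = { ...record };
  for (const field of Object.keys(extracted)) {
    const alreadySet = Object.prototype.hasOwnProperty.call(merged, field);
    const locked = CALLER_IDENTITY_FIELDS.has(field) && phase !== "eligibility" && alreadySet;
    if (locked) continue; // e.g. a landlord's name mentioned mid-interview
    merged[field] = extracted[field];
  }
  return merged;
}
```

Because the guard sits in the merge step rather than in the prompt, it holds even if the model mislabels an adverse party's name as the caller's.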

Intelligent speech processing. Speech hints improve Twilio's recognition for number words, dollar amounts, legal terms, and email domains. A number normalization function converts spoken amounts to dollar values ("two thousand dollars" becomes "$2000"). Low-confidence speech below 15% is rejected as barge-in artifacts, but numeric content bypasses this filter because Twilio frequently assigns low confidence to valid monetary responses. Every prompt accepts both speech and keypad (DTMF) input.
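The number normalization and the confidence bypass can be sketched together. This is a simplified English-only parser written for illustration, not the project's function; the names are hypothetical, and confidence is assumed to arrive as a value between 0 and 1.

```typescript
const SMALL_NUMBERS = new Map<string, number>([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4], ["five", 5],
  ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9], ["ten", 10],
  ["eleven", 11], ["twelve", 12], ["thirteen", 13], ["fourteen", 14],
  ["fifteen", 15], ["sixteen", 16], ["seventeen", 17], ["eighteen", 18],
  ["nineteen", 19], ["twenty", 20], ["thirty", 30], ["forty", 40],
  ["fifty", 50], ["sixty", 60], ["seventy", 70], ["eighty", 80], ["ninety", 90],
]);
const SCALES = new Map<string, number>([["thousand", 1000], ["million", 1000000]]);
const FILLER = new Set(["and", "dollar", "dollars"]);

// Converts a spoken amount to a dollar string, or returns null if the
// utterance contains anything other than number words.
function spokenAmountToDollars(text: string): string | null {
  const words = text.toLowerCase().replace(/-/g, " ").split(/\s+/)
    .filter((w) => w.length > 0 && !FILLER.has(w));
  let total = 0;
  let current = 0;
  let sawNumber = false;
  for (const word of words) {
    const small = SMALL_NUMBERS.get(word);
    const scale = SCALES.get(word);
    if (small !== undefined) {
      current += small;
    } else if (word === "hundred") {
      current = (current === 0 ? 1 : current) * 100;
    } else if (scale !== undefined) {
      total += (current === 0 ? 1 : current) * scale;
      current = 0;
    } else {
      return null;
    }
    sawNumber = true;
  }
  return sawNumber ? "$" + (total + current) : null;
}

// Rejects low-confidence speech as a likely barge-in artifact, unless the
// transcript contains digits or parses as a number.
function shouldAcceptSpeech(transcript: string, confidence: number, minConfidence: number = 0.15): boolean {
  if (confidence >= minConfidence) return true;
  return /\d/.test(transcript) || spokenAmountToDollars(transcript) !== null;
}
```

With this sketch, spokenAmountToDollars("two thousand dollars") returns "$2000", and the same utterance is accepted even at a confidence of 0.05.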

N/A auto-population. When a parent question indicates "No" or "not applicable," all conditional child fields are automatically filled with "N/A" so they appear as intentionally skipped, not missed.
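A small sketch of the parent-to-child rule, assuming a lookup table of conditional fields. The field names are hypothetical, and leaving already-answered child fields untouched is a choice made for this example.

```typescript
// Hypothetical map from a parent question to the fields that depend on it.
const CONDITIONAL_CHILDREN = new Map<string, string[]>([
  ["hasAttorney", ["attorneyName", "attorneyPhone"]],
  ["hasCourtCase", ["courtCaseNumber", "nextHearingDate"]],
]);
const NEGATIVE_ANSWERS = new Set(["no", "n/a", "not applicable"]);

// Marks child fields as "N/A" when their parent was answered in the negative,
// so reviewers can tell a skipped field from a missed one.
function fillNotApplicable(fields: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = { ...fields };
  CONDITIONAL_CHILDREN.forEach((children, parent) => {
    const answer = out[parent];
    if (answer === undefined || !NEGATIVE_ANSWERS.has(answer.trim().toLowerCase())) return;
    for (const child of children) {
      if (!Object.prototype.hasOwnProperty.call(out, child)) out[child] = "N/A";
    }
  });
  return out;
}
```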

Three-layer UPL defense. Layer 1: prompt-level prohibitions with five categories of banned responses (directive, evaluative, predictive, interpretive, opinion-based). Layer 2: a post-generation filter scans every AI response against 28 regex patterns (16 English, 12 Spanish) before the caller hears it. Hard violations replace the response with a safe redirect. Soft violations are logged for review. Layer 3: all violations logged with [UPL-AUDIT] tags, visible in the admin transcript with red badges but excluded from the AI's context window.
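Layers 2 and 3 can be illustrated with a short post-generation filter. The four patterns and the redirect wording below are invented examples, not the project's 28 patterns; only the [UPL-AUDIT] tag and the hard/soft split come from the description above.

```typescript
type Severity = "hard" | "soft";

interface UplPattern {
  category: string;
  severity: Severity;
  pattern: RegExp;
}

interface FilterResult {
  spoken: string; // what the caller actually hears
  auditLog: string[]; // entries kept for staff review, not fed back to the AI
}

// Illustrative patterns only.
const UPL_PATTERNS: UplPattern[] = [
  { category: "directive", severity: "hard", pattern: /\byou should (sue|file|stop paying)\b/i },
  { category: "predictive", severity: "hard", pattern: /\byou (will|would) (probably |likely )?win\b/i },
  { category: "opinion", severity: "soft", pattern: /\bin my opinion\b/i },
  { category: "directive", severity: "hard", pattern: /\busted deber[ií]a (demandar|presentar)\b/i },
];

// Hypothetical wording for the safe redirect.
const SAFE_REDIRECT =
  "I'm not able to give legal advice, but I can note your question for the attorney who reviews your intake.";

function filterResponse(response: string): FilterResult {
  const auditLog: string[] = [];
  let hardViolation = false;
  for (const rule of UPL_PATTERNS) {
    if (!rule.pattern.test(response)) continue;
    auditLog.push(`[UPL-AUDIT] ${rule.severity} ${rule.category}: ${response}`);
    if (rule.severity === "hard") hardViolation = true;
  }
  // Hard violations are replaced before speech; soft ones are only logged.
  return { spoken: hardViolation ? SAFE_REDIRECT : response, auditLog };
}
```

Returning the audit entries separately from the spoken text makes it straightforward to show them in the transcript while keeping them out of the model's context.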

Design Principles

Trauma-informed first. Callers are in crisis. The AI uses empathetic, patient language, never rushes the caller, asks one question at a time, and uses natural acknowledgments rather than echoing back sensitive answers. At the same time, the caller always knows they are speaking to AI.

Eligibility upfront. Check eligibility before the full interview to save everyone's time. All denials include an appeal option for attorney review.

Always a human option. Callers can request a human at any point. After three consecutive misunderstandings, the system offers to transfer to staff.
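The three-strike rule amounts to a counter that resets on any understood turn. A minimal sketch, with a hypothetical class name and return values:

```typescript
type NextStep = "continue" | "offer_transfer";

class MisunderstandingTracker {
  private consecutive = 0;
  private limit: number;

  constructor(limit: number = 3) {
    this.limit = limit;
  }

  // Call once per caller turn; a successful turn clears the streak.
  record(understood: boolean): NextStep {
    this.consecutive = understood ? 0 : this.consecutive + 1;
    if (this.consecutive >= this.limit) {
      this.consecutive = 0; // start fresh if the caller declines the transfer
      return "offer_transfer";
    }
    return "continue";
  }
}
```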

Bilingual with natural voices. Full Spanish-language support using Google Chirp3-HD voices, not translated prompts. (Spanish voice feature built but not yet tested.)

What They Tested

The team conducted live phone demos with LASSB staff, peer reviews with other class teams, and internal test calls. They built 12 synthetic test scenarios: 7 testing eligibility logic (including standard eligible, conflict detection, geographic ineligibility, income ineligibility, and bilingual/subsidized housing cases) and 5 testing full intake with diverse caller personas (Spanish speaker with health issues, crying caller with baby and imminent sheriff lockout, veteran with PTSD and service dog, retaliation eviction with documentation, wheelchair user with SSI disability).

Key feedback: Spelling accuracy is critical for conflict checks, so the system now asks callers to spell names. The AI cannot make conflict check decisions; an attorney must. The LegalServer API connection is "straightforward." Production use needs auto-delete policies and system encryption. On priorities: "High volume: we want to get intake done as quickly as possible."

Risks and Limitations

Spanish not yet tested. The bilingual system is built but unvalidated in Spanish.

LegalServer integration pending. The admin dashboard currently stands alone. Production use requires a direct API push to LegalServer. LASSB confirmed the connection is straightforward.

No real-caller testing. All testing has been with synthetic scenarios and staff. The system has not been tested with actual callers in crisis.

Data privacy. The system collects SSN, income, housing situation, and domestic violence status. Formal data retention, encryption, and access control policies are required before public deployment. Greg specifically requested auto-deletion policies and encryption.

Speech recognition limitations. Callers with heavy accents, noisy environments, or emotional speech may trigger recognition failures. The three-strike rule and DTMF fallback mitigate but don't eliminate this.

What's Next

LegalServer API integration to push intake data directly into case management. Expanded edge-case testing across all 100+ questions with diverse caller scenarios. Spanish language validation. Staff training on the admin dashboard, conflict check resolution, form review, and taking over calls mid-conversation. Performance optimization to reduce response latency.

Long-term: 24/7 access (callers outside business hours), additional languages (Mandarin, Vietnamese, Korean), court system integration, and scaling as a template for other legal aid organizations.

Why It Matters for the Field

The system covers the entire intake workflow: not just a subset of questions or a single phase, but all 114 questions across prescreen and detailed intake, with real-time structured data extraction, eligibility checks, conflict review with human-in-the-loop, and admin dashboard visibility.

Several patterns are worth studying independently. The three-layer UPL defense (prompt prohibitions, regex post-filter, audit trail) is the most rigorous approach to unauthorized practice of law prevention documented in any of these projects. The dropdown normalization layer that maps AI-extracted values to exact LegalServer options solves a problem every legal aid organization will face when connecting conversational AI to structured case management systems. The conflict check flow (hold music, simultaneous dashboard notification and email, attorney resolution buttons, timeout-to-callback default) models how human-in-the-loop review can work in real time during a phone call without breaking the caller's experience.

The cost comparison is striking: $1.50–2.50 per call versus the fully loaded cost of a human intake specialist conducting the same 45-minute interview.

Housing Intake Voice AI Line

About This Project