Legal AI & Automation interview questions (2026)
Recruiter, hiring-manager, behavioral, and technical questions for the role that evaluates, pilots, and operates AI-assisted legal workflows — tool selection, prompt design, hallucination mitigation, attorney-in-the-loop gates, governance, and measured adoption.
Recruiter-screen questions
The recruiter screen should test tool fluency, partnership posture with attorneys, and whether the candidate has actually deployed and retired workflows — not just evaluated tools.
Which legal-AI tools have you piloted in production?
Listening for specific tools (Harvey, CoCounsel, Lexis+ AI, Hebbia, Spellbook, Robin AI, EvenUp) and workflows where they were deployed, not just evaluated.
Walk me through your last AI workflow rollout end-to-end.
Looking for: workflow selection rationale, baseline metric, tool selection, prompt design, attorney-in-the-loop decisions, rollout cadence, measured outcome.
What AI tool did you evaluate and decide NOT to deploy?
Strong candidates have a story about the rejection. "I've never seen a bad tool" is a red flag.
How do you partner with attorneys who are skeptical of AI?
Looking for: workflow co-design, attorney as validator, transparent failure-mode discussion, gradual scope expansion.
What is your background — legal, tech, or both?
Both shapes work. Legal-first candidates need demonstrated technical depth; tech-first candidates need demonstrated legal-domain instinct.
Have you written or co-written an AI usage policy?
Strong candidates have done so in partnership with the GC and CISO. Anything less is operating without governance.
Hiring-manager-screen questions
The Legal Ops Manager or Director conducting this screen should test workflow judgment, prompt evaluation rigor, governance fluency, and ability to retire what does not work.
Walk me through a workflow where you decided AI did NOT beat the human baseline.
Strong candidates have at least one such example. The reasoning matters as much as the conclusion.
How do you choose between retrieval-augmented generation and fine-tuning for a given workflow?
Looking for: data-availability framing, latency tradeoffs, maintenance burden, governance implications.
Describe your prompt evaluation methodology.
Strong candidates have an evaluation harness with reference outputs, regression tests, and metric-based scoring. Ad-hoc spot-checking is junior-tier.
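For interviewers who want a concrete picture of what "reference outputs, regression tests, and metric-based scoring" can mean, here is a minimal sketch in Python. It assumes a clause-extraction workflow in which an attorney has labeled the correct clause types for each test document. The names `EvalCase`, `score_case`, `run_regression`, and `fake_extract` are hypothetical, and `fake_extract` is a keyword stand-in for the real prompt and model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    case_id: str
    document: str        # input text given to the workflow
    reference: set[str]  # clause labels an attorney marked as correct


def score_case(predicted: set[str], reference: set[str]) -> dict[str, float]:
    """Precision and recall of predicted labels against the reference set."""
    true_positives = len(predicted & reference)
    # Convention assumed here: an empty prediction scores 0.0 precision.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 1.0
    return {"precision": precision, "recall": recall}


def run_regression(
    cases: list[EvalCase],
    extract: Callable[[str], set[str]],
    min_precision: float = 0.9,
    min_recall: float = 0.9,
) -> list[str]:
    """Return the ids of cases that fall below either threshold."""
    failures = []
    for case in cases:
        scores = score_case(extract(case.document), case.reference)
        if scores["precision"] < min_precision or scores["recall"] < min_recall:
            failures.append(case.case_id)
    return failures


def fake_extract(document: str) -> set[str]:
    """Hypothetical stand-in for the prompt + model call under test."""
    labels = set()
    text = document.lower()
    if "confidential" in text:
        labels.add("confidentiality")
    if "governed by" in text:
        labels.add("governing_law")
    return labels


CASES = [
    EvalCase(
        "nda-001",
        "Confidential Information stays secret. Governed by the laws of New York.",
        {"confidentiality", "governing_law"},
    ),
    EvalCase("nda-002", "Either party may terminate on 30 days notice.", {"termination"}),
]

failed = run_regression(CASES, fake_extract)
print(failed)  # ['nda-002']
```

The point of the sketch is that a prompt change is rerun against the same fixed cases, so a regression shows up as a named failing case rather than as an impression from spot-checking. The thresholds are illustrative, not recommended values.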
How do you handle PII or privileged content in AI workflows?
Looking for: data classification, deployment model (vendor-hosted, private-tenant, on-prem), audit logging, retention controls.
How do you measure adoption of a deployed AI workflow?
Looking for: usage volume, completion rate, time saved per task, quality metrics, and explicit user-feedback cadence.
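The first three metrics listed above reduce to simple arithmetic over a usage log, which the following Python sketch illustrates. The `TaskEvent` record, its fields, and the `adoption_summary` function are assumptions made for the example, as is the idea that a pre-rollout baseline (minutes per task) was measured before deployment.

```python
from dataclasses import dataclass


@dataclass
class TaskEvent:
    user: str
    completed: bool       # did the user finish the task in the AI workflow?
    minutes_spent: float  # time spent on the task


def adoption_summary(events: list[TaskEvent], baseline_minutes: float) -> dict[str, float]:
    """Summarize usage volume, completion rate, and time saved per task."""
    completed = [e for e in events if e.completed]
    completion_rate = len(completed) / len(events) if events else 0.0
    if completed:
        avg_minutes = sum(e.minutes_spent for e in completed) / len(completed)
        minutes_saved = baseline_minutes - avg_minutes
    else:
        minutes_saved = 0.0
    return {
        "usage_volume": len(events),
        "active_users": len({e.user for e in events}),
        "completion_rate": completion_rate,
        "avg_minutes_saved_per_task": minutes_saved,
    }


# Illustrative log: four tasks started, three completed.
events = [
    TaskEvent("alice", True, 20.0),
    TaskEvent("bob", True, 30.0),
    TaskEvent("alice", False, 5.0),
    TaskEvent("carol", True, 25.0),
]
summary = adoption_summary(events, baseline_minutes=45.0)
print(summary)
```

With a 45-minute baseline, the three completed tasks average 25 minutes, so the sketch reports 20 minutes saved per task and a 0.75 completion rate. Quality metrics and user feedback are deliberately left out because they do not come from a usage log.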
What governance evidence do you maintain for SOX or other audits?
Strong candidates name specific artifacts: policy versions, training completion records, exception logs, vendor security reviews.
Tell me about a workflow that needed to be retired.
Mature AI practitioners retire workflows. Candidates who only narrate launches are missing half the job.
Behavioral questions
AI & Automation behavioral questions focus on stakeholder partnership, hallucination response, and the willingness to say no to an AI deployment.
Tell me about a time you pushed back on a stakeholder asking for an AI solution.
Looking for: scoped concern, alternative proposed, relationship preserved.
Describe a time an AI tool produced an output you knew was wrong.
Strong candidates have caught hallucinations or retrieval failures in production. Detection methodology matters.
Walk me through how you would respond if a deployed workflow surfaced a privacy or privilege issue.
Looking for: immediate containment, partner notification (GC, DPO, CISO), root-cause analysis, evidence preservation.
Tell me about feedback from an attorney that changed your approach.
Strong candidates can name a specific shift in workflow design, governance, or communication.
Describe a vendor evaluation you led.
Looking for: evaluation criteria, demo-vs-pilot distinction, reference checks, security review, total cost (including change management).
Technical questions
Use these themed questions to probe the load-bearing skills: tool selection, prompt design and evaluation, hallucination and quality, governance and risk, adoption and measurement.
Tool selection
- Walk me through how you would choose between Harvey, CoCounsel, and Lexis+ AI for a given workflow.
- When do you prefer a domain-specific tool over a general-purpose LLM API?
- How do you evaluate a vendor's claims about hallucination rate?
- Describe a reference-check process for legal-AI vendors.
Prompt design and evaluation
- Walk me through a prompt you have iterated on. What changed and why?
- Describe your evaluation harness for prompt regression.
- How do you design structured outputs (JSON schemas) for downstream processing?
- When do you use few-shot examples versus zero-shot prompts?
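As a reference point for the structured-output question, the sketch below shows one way a candidate might describe validating model output before it reaches downstream systems. It uses only the Python standard library. The field names in `REQUIRED_FIELDS` are a hypothetical contract for a clause-extraction prompt, not a standard schema, and a production setup might use a full JSON Schema validator instead.

```python
import json

# Hypothetical output contract for a clause-extraction prompt.
REQUIRED_FIELDS = {
    "clause_type": str,
    "text": str,
    "source_page": int,
    "needs_attorney_review": bool,
}


def validate_output(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        # Exact type match, so True is not accepted where an int is required.
        elif type(data[name]) is not expected:
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in sorted(data.keys() - REQUIRED_FIELDS.keys()):
        problems.append(f"unexpected field: {name}")
    return problems


good = json.dumps({
    "clause_type": "indemnity",
    "text": "Supplier shall indemnify Customer.",
    "source_page": 3,
    "needs_attorney_review": True,
})
bad = '{"clause_type": "indemnity", "text": "Supplier shall indemnify.", "source_page": "3"}'

print(validate_output(good))  # []
print(validate_output(bad))
```

Rejecting malformed output at this boundary turns a format violation into a logged, countable event instead of a silent downstream error.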
Hallucination and quality
- How do you distinguish hallucination from retrieval failure?
- Walk me through your quality-monitoring approach for a deployed AI workflow.
- How do you decide when an attorney must review an AI output?
- What is your incident response when an AI output reaches production with an error?
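One way a candidate might make the hallucination versus retrieval-failure distinction concrete is a triage step run on outputs already flagged as wrong. The Python sketch below is an illustration under stated assumptions: a retrieval-based workflow that logs which source ids were retrieved and which the answer cited, and a reviewer who knows which sources should have been found. The function name `triage` and the labels it returns are hypothetical.

```python
def triage(expected: set[str], retrieved: set[str], cited: set[str]) -> list[str]:
    """Classify a wrong output using source ids logged by the workflow.

    expected:  sources a reviewer says the answer should have relied on
    retrieved: sources the retrieval step actually passed to the model
    cited:     sources the model's answer claims to rely on
    """
    findings = []
    if not expected <= retrieved:
        # The right material never reached the model.
        findings.append("retrieval_failure")
    if not cited <= retrieved:
        # The model cited something it was never given.
        findings.append("hallucinated_citation")
    # Neither check fired: the error lies elsewhere and needs a human look.
    return findings or ["needs_manual_review"]


print(triage({"doc-7"}, {"doc-2"}, {"doc-2"}))  # ['retrieval_failure']
print(triage({"doc-7"}, {"doc-7"}, {"doc-9"}))  # ['hallucinated_citation']
print(triage({"doc-7"}, {"doc-7"}, {"doc-7"}))  # ['needs_manual_review']
```

The two findings call for different fixes: the first points at indexing or search, the second at prompt design and output checks, which is why strong candidates keep them separate.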
Governance and risk
- Walk me through the AI usage policy you have operated under.
- How do you handle data classification for AI prompts (PII, privileged, MNPI)?
- What evidence do you maintain for SOX-relevant AI workflows?
- How do you partner with the CISO and DPO on AI tool approval?
Adoption and measurement
- How do you measure baseline before AI rollout?
- Walk me through an adoption curve you have seen flatten. What did you do?
- How do you decide when a workflow is mature enough to remove the attorney-in-the-loop gate?
- Describe a workflow you retired and why.
Take-home and on-site exercises
Three exercises that produce real signal at this role tier:
Workflow selection memo
Hand the candidate a one-page brief: "The Contracts team spends 20 hours a week on NDA review. Recommend an AI approach." Ask them to draft a one-page memo: workflow design, tool selection rationale, attorney-in-the-loop gates, measured success criteria, rollout plan, and a risk callout. Tests workflow judgment, tool fluency, and written communication.
Prompt design and evaluation
Give the candidate a prompt for clause extraction with five known-bad outputs (deliberately seeded — hallucinated clause, wrong citation, missed clause, format violation, privilege leak). Ask them to redesign the prompt, define evaluation criteria, and propose the regression test. Tests prompt-engineering depth and evaluation rigor.
Governance memo
Ask the candidate to draft a half-page memo to the GC: "We've been asked to approve an AI tool for first-draft contract redlining. What governance gates do we need before saying yes?" Tests partnership posture, governance fluency, and ability to translate technical work for executive audiences.
What good and bad look like
Red flags
- Cannot name specific tools they have piloted in production.
- Treats hallucination as a vendor problem rather than a workflow-design problem.
- Has never decided NOT to deploy an AI tool after evaluation.
- Talks about attorney-in-the-loop as a slogan, not a designed control.
- Has no evaluation methodology beyond spot-checking.
- Cannot describe a workflow they retired.
- Treats AI governance as compliance overhead, not workflow design.
What strong answers sound like
- Names specific tools, workflows, and measured outcomes from prior rollouts.
- Distinguishes hallucination from retrieval failure with specific mitigations.
- Has at least one rejection story (tool evaluated, not deployed).
- Has at least one retirement story (workflow deployed, then retired).
- Describes attorney-in-the-loop as a deliberate design choice per workflow.
- Has a written evaluation harness with regression tests.
- Has co-authored or operated under an AI usage policy.
- Names data-classification decisions for prompts (PII, privileged, MNPI).
What strong candidates ask you
The questions a candidate asks reveal what they think the job is. These are the questions a serious Legal AI & Automation candidate brings to the interview:
- What AI tools are currently approved for legal-team use?
- What does the AI usage policy look like, and who owns it?
- Which workflows are highest priority for AI assistance in the next twelve months?
- How does Legal partner with the CISO and DPO on AI tool approval today?
- What attorney-in-the-loop gates exist in current deployments?
- How is AI adoption measured today, and where can we improve?
- What does success look like at 90 and 180 days?
- How does the GC feel about AI adoption pace? What is the constraint?