
How I Automated Domestic Worker Candidate Interviews with an AI Agent on WhatsApp

I built an n8n agent that interviews candidates for housekeeping, nanny, and caregiver positions via WhatsApp, detects incompatibilities early, and automatically generates two reports: a technical one for the recruiter and a clear summary for the hiring family.

Published on May 9, 2026 · 7 min read

You have 15 candidates interested in a housekeeping position. You contact them on WhatsApp one by one. The first one doesn't have her papers in order. The third one is asking for double the offered salary. The seventh lives on the other side of the city and isn't willing to commute. You've spent three hours chatting and you've only found two candidates worth presenting to the family.

And when the family asks "why do you recommend this one and not that one?" all you have are scattered notes to justify your reasoning.

The problem isn't the number of candidates. It's that the initial screening is completely manual, inconsistent, and leaves no comparable data.


The Bottleneck Nobody Sees

In domestic worker recruitment, the WhatsApp screening stage is the step that consumes the most time and produces the least useful data. It's pure conversation — no structure.

The result is an accumulation of problems:

The solution isn't a new spreadsheet or a Google Form. It's an AI agent that conducts interviews via WhatsApp autonomously, with judgment, and produces reports you can use directly.


What I Built

A three-workflow system in n8n that operates like this:

Workflow 1 — Start interview: Receives the job posting and candidate from the database, generates a personalized interview plan based on the job type (live-in, live-out, nanny, elderly caregiver), and sends the first message via WhatsApp.

Workflow 2 — Conduct the interview: Triggered by each message from the candidate. The agent reads the full context (job posting, plan, conversation history, data already extracted) and decides what to ask next, whether to follow up on something ambiguous, or whether it found an incompatibility that warrants closing.

Workflow 3 — Generate the report: When the interview closes (due to incompatibility, complete coverage, or reaching the 12-question limit), a second model analyzes the full conversation and produces two documents: a technical JSON with scores per dimension, sentiment, and intent level, and a clear Markdown summary for the family.


The Agent: Carla

The interviewing agent's name is Carla. She's a warm, professional recruiter — at least that's how she presents herself and how she behaves.

Carla knows several things from the very first turn:

Her conversation rules are simple but fundamental:

  1. One question per message. This is WhatsApp, not an in-person interview.
  2. Maximum two sentences per turn. If the response is evasive, ask again once. Just once.
  3. The data she confirms (paperwork, salary, experience, availability) is extracted as structured JSON on each turn and accumulated in Postgres.
  4. If she detects a deal-breaker, she closes immediately. She thanks the candidate warmly and leaves the door open for future positions. She doesn't keep asking.

The hard limit is 12 questions. If she reaches that point without closing due to incompatibility, she closes as "interview completed" or "limit reached," with everything she managed to cover.
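The closing rules above can be sketched as a small decision function. This is only an illustration, not the actual workflow code: the `InterviewState` fields, the reason strings, and the `should_reask` helper are hypothetical names chosen for the example. Only the 12-question limit, the three close conditions, and the "ask again once" rule come from the description.

```python
from dataclasses import dataclass
from typing import Optional

MAX_QUESTIONS = 12  # hard limit stated in the post


@dataclass
class InterviewState:
    # Hypothetical fields, named for illustration only.
    questions_asked: int
    deal_breaker: bool
    plan_covered: bool


def close_reason(state: InterviewState) -> Optional[str]:
    """Return why the interview should close, or None to keep going."""
    if state.deal_breaker:
        return "incompatibility"  # close immediately on a deal-breaker
    if state.plan_covered:
        return "interview_completed"
    if state.questions_asked >= MAX_QUESTIONS:
        return "limit_reached"
    return None


def should_reask(evasive: bool, already_reasked: bool) -> bool:
    """An evasive answer gets one follow-up, never a second."""
    return evasive and not already_reasked
```

Checking the deal-breaker first means an incompatibility found on the last allowed question is still reported as an incompatibility rather than as a limit.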


How the Question Plan Works

Before the first message, a normalizing model analyzes the job description and produces a JSON with:

The 10 dimensions common to every position are stable: basic identity, current documents, prior experience, references, availability, salary match, health and physical capacity, relevant habits, communication, and motivation.

Order matters: if salary is a frequent deal-breaker, it's asked at turn 2. Not turn 10.
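One way to express that ordering, as a rough sketch: keep an opening dimension first, then sort the rest so the likeliest deal-breakers come early. The `deal_breaker_risk` field, the dimension keys, and the choice of `basic_identity` as the opener are assumptions made for this example. The post does not show the real shape of the plan JSON.

```python
from typing import Dict, List


def order_plan(dimensions: List[Dict], opener: str = "basic_identity") -> List[Dict]:
    """Put the opener first, then the remaining dimensions by descending risk.

    `deal_breaker_risk` is a hypothetical 0..1 estimate of how often a
    dimension ends the interview for this kind of position.
    """
    first = [d for d in dimensions if d["name"] == opener]
    rest = [d for d in dimensions if d["name"] != opener]
    # sorted() is stable, so dimensions with equal risk keep their original order.
    rest = sorted(rest, key=lambda d: d["deal_breaker_risk"], reverse=True)
    return first + rest


plan = [
    {"name": "basic_identity", "deal_breaker_risk": 0.0},
    {"name": "prior_experience", "deal_breaker_risk": 0.2},
    {"name": "salary_match", "deal_breaker_risk": 0.9},
    {"name": "current_documents", "deal_breaker_risk": 0.6},
]
ordered_names = [d["name"] for d in order_plan(plan)]
```

With these made-up risk values, `salary_match` lands in the second position, which matches the "turn 2, not turn 10" idea.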


The Reports: One for You, One for the Family

When the interview closes, the pipeline generates two completely different documents.

Technical report (JSON): For the recruiter. It includes the match score (0–100), the recommendation (advance/maybe/reject), turn-by-turn sentiment analysis, the candidate's intent level, concrete red and green flags, and any questions still pending if the interview closed before covering them. The match score is a weighted average of the per-dimension scores, with non-negotiable requirements weighted triple; any deal-breaker forces it to 0.

Client report (Markdown): For the family. No technical jargon. Includes the recommendation with emoji (✅ advance, ⚠️ maybe, ❌ do not advance), the two or three most relevant points about the candidate, strengths in everyday language, and "topics to discuss in person" framed as areas to explore further — not as flaws. 200–350 words. Readable in one minute.
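The match score in the technical report can be illustrated with a short function. This is a sketch under assumptions: it takes per-dimension scores on a 0–100 scale (the post only states that scale for the final score) and rounds to an integer. The function and parameter names are hypothetical.

```python
from typing import Dict, Set


def match_score(scores: Dict[str, float], non_negotiable: Set[str], deal_breaker: bool) -> int:
    """Weighted average of per-dimension scores.

    Non-negotiable dimensions count three times; any deal-breaker forces 0.
    """
    if deal_breaker:
        return 0
    weighted_total = 0.0
    weight_sum = 0
    for dimension, score in scores.items():
        weight = 3 if dimension in non_negotiable else 1
        weighted_total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        return 0  # nothing was scored
    return round(weighted_total / weight_sum)


example = match_score(
    {"current_documents": 90, "salary_match": 70, "communication": 80},
    non_negotiable={"current_documents"},
    deal_breaker=False,
)
# (3*90 + 70 + 80) / (3 + 1 + 1) = 420 / 5 = 84
```

The triple weight pulls the result toward the non-negotiable dimension: a plain average of the same three scores would be 80.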

An example of the client report for a candidate with a match score of 78:

# Marlene G. — Elderly Caregiver

**Recommendation:** ✅ Advance to in-person interview
**Compatibility with the position:** 78/100

## Most Relevant Points
8 years of experience, the last 3 caring for an elderly woman with
mild dependency. Lives nearby, paperwork in order, available immediately.

## Strengths
- Experience closely aligned with what you're looking for
- Lives nearby, low risk of delays
- Clear and warm communication throughout the conversation

## Topics to Discuss in Person
- Asking for $650,000 net (offer is $620,000) — minor gap, negotiable
- Only provided one verifiable reference. Ask for a second.
- Explore specifics of medication management for this position.

## How the Conversation Felt
Warm and honest. She asked twice about the start date and asked
for details about the neighborhood — clear signals of genuine interest.


The Technical Architecture

The full stack:

| Component | Tool |
| --- | --- |
| Workflow engine | n8n (self-hosted) |
| LLM (all roles) | OpenAI GPT-4o |
| Database | Postgres (Supabase) |
| Messaging channel | WhatsApp via HTTP Request (placeholder) |

The database has five tables: positions, candidates, interviews, messages, and reports. The interviews table accumulates the structured data extracted by the agent in a JSONB field that gets merged turn by turn using Postgres's || operator.
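To make the merge behavior concrete, here is a small Python model of what `jsonb || jsonb` does for two objects: top-level keys are combined and the right-hand value wins. The SQL string shows how such an update might look. The `extracted_data` and `id` column names are assumptions; only the `interviews` table and the `||` operator come from the description above.

```python
from typing import Any, Dict

# Hypothetical column names (extracted_data, id); shown for illustration, not executed here.
MERGE_SQL = (
    "UPDATE interviews "
    "SET extracted_data = extracted_data || %s::jsonb "
    "WHERE id = %s"
)


def merge_turn(accumulated: Dict[str, Any], turn_data: Dict[str, Any]) -> Dict[str, Any]:
    """Model of jsonb || jsonb for objects: shallow merge, right side wins."""
    return {**accumulated, **turn_data}


after_turn_3 = {"papers_ok": True, "salary": {"asked": 650000}}
turn_4 = {"salary": {"negotiable": True}, "available_from": "immediately"}
merged = merge_turn(after_turn_3, turn_4)
```

The merge is not recursive: in `merged`, the nested `salary` object from turn 4 replaces the earlier one, so `asked` is lost. Flat keys (for example `salary_asked`, `salary_negotiable`) avoid that. Also note that in Postgres a NULL column stays NULL after `||`, so defaulting the field to `'{}'` is a common precaution.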

The agent doesn't use tool calls — it emits structured JSON on each turn with the extracted data, whether there was a deal-breaker, whether it should close, the reason, and its response to the candidate. A code node in n8n processes that JSON, decides the new interview state, and triggers the report workflow if applicable.
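The processing step could look roughly like the following. It is written in Python for illustration and is not the actual n8n code node. The key names (`extracted`, `deal_breaker`, `should_close`, `close_reason`, `reply`) are hypothetical stand-ins for the fields listed above, and the returned status values are invented for the example.

```python
import json
from typing import Any, Dict

# Hypothetical key names for the fields the agent emits on each turn.
REQUIRED_KEYS = {"extracted", "deal_breaker", "should_close", "close_reason", "reply"}


def process_turn(raw: str) -> Dict[str, Any]:
    """Parse the agent's JSON for one turn and derive the next interview state."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("agent output must be a JSON object")
    missing = REQUIRED_KEYS - set(payload)
    if missing:
        raise ValueError(f"agent output is missing keys: {sorted(missing)}")

    closing = bool(payload["deal_breaker"] or payload["should_close"])
    return {
        "status": "closed" if closing else "in_progress",
        "close_reason": payload["close_reason"] if closing else None,
        "reply": payload["reply"],          # message to send to the candidate
        "extracted": payload["extracted"],  # data to merge into the interview record
        "trigger_report": closing,          # start the report workflow on close
    }


raw_turn = json.dumps({
    "extracted": {"papers_ok": False},
    "deal_breaker": True,
    "should_close": True,
    "close_reason": "incompatibility",
    "reply": "Thank you so much for your time. We will keep you in mind for future positions.",
})
result = process_turn(raw_turn)
```

Validating the keys before acting on them matters here because a model emitting free-form JSON can occasionally drop a field; failing loudly is safer than silently continuing an interview that should have closed.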

Estimated cost per interviewed candidate: $0.16 USD. A 12-turn interview consumes about $0.06 in agent tokens, plus about $0.10 for the two report-generation LLM calls.
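As a quick back-of-the-envelope check, the per-candidate figure scales linearly with the batch size. The sketch below uses integer cents to avoid floating-point drift. It treats every interview as a full 12-turn one, which is an assumption: interviews that close early on a deal-breaker should cost less on the agent side.

```python
AGENT_CENTS = 6    # about $0.06 in agent tokens for a 12-turn interview
REPORT_CENTS = 10  # about $0.10 for report generation


def screening_cost_usd(candidates: int) -> float:
    """Upper-bound estimate: every candidate gets a full interview plus reports."""
    return candidates * (AGENT_CENTS + REPORT_CENTS) / 100


batch_cost = screening_cost_usd(15)  # 15 * 16 cents = 240 cents = 2.4 USD
```

For the 15-candidate batch from the opening example, that comes to roughly $2.40.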


What It Solves in Practice

Before, screening 15 candidates took 3–4 hours of fragmented manual conversation. With the agent:


What's Next

The next step is connecting a real WhatsApp provider (I'm evaluating Twilio sandbox for testing and Meta Cloud API for production) and building a minimal UI in Retool or Appsmith on top of Postgres to manage positions and launch interviews without opening n8n.

If you're interested in solving something similar — whether in domestic staff recruiting or any structured qualification process via WhatsApp — the pattern is the same: agent with injected context, early incompatibility detection, structured data accumulated per turn, and auto-generated reports.
