Anthropic presents Anthropic Interviewer, a tool that used Claude.ai to run 1,250 interviews with professionals to understand how people integrate AI into their work. How do they feel about it? What do they delegate, and what of their professional identity do they want to preserve? And how can AI be used to run research at scale without losing the human voice?
What is Anthropic Interviewer
Anthropic Interviewer is an automated interviewer designed to run real qualitative interviews at scale. It works with a combination of prompts, AI-generated interview plans, and human review. The idea isn’t to replace researchers, but to multiply their reach while keeping human checks at critical points.
The initial trial: 1,250 interviews across three subsamples: 1,000 participants from the general working population, 125 creatives, and 125 scientists. Interviews lasted 10–15 minutes in Claude.ai, and transcripts were anonymized and released publicly, with participant consent, for research use.
How it works: planning, interviewing and analysis
Planning: a system prompt sets goals and a flexible rubric. The AI generates questions and a conversational flow that human researchers then review.
Interview: the AI conducts adaptive conversations in real time, following qualitative research best practices. Each interview lasted around 10–15 minutes.
Analysis: transcripts went through automated analysis to identify emerging themes and a human review that interpreted results and extracted illustrative quotes.
Technically, the flow is: system prompt -> question generator -> real-time interviewer -> transcription -> automated thematic clustering -> human review. That loop allows fast iteration on the methodology.
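To make that flow concrete, here is a minimal sketch of the first two stages, assuming the anthropic Python SDK; the prompt wording, model name, and function names are illustrative assumptions, not Anthropic's actual implementation.

```python
# Illustrative sketch of the planning and interview stages. Prompt wording,
# model name, and structure are assumptions, not Anthropic's real pipeline.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"     # illustrative model choice

SYSTEM_PROMPT = (
    "You are a qualitative interviewer studying how professionals use AI at "
    "work. Ask one open-ended question at a time, probe with follow-ups, "
    "avoid leading questions, and keep the session to roughly 10-15 minutes."
)

def generate_plan() -> list[str]:
    """Planning: the model drafts a question flow; researchers review it before fielding."""
    resp = client.messages.create(
        model=MODEL, max_tokens=1024, system=SYSTEM_PROMPT,
        messages=[{"role": "user",
                   "content": "Draft eight open-ended interview questions, one per line."}],
    )
    return [q.strip() for q in resp.content[0].text.splitlines() if q.strip()]

def next_turn(transcript: list[dict]) -> str:
    """Interview: adaptive loop; each turn conditions on the whole conversation so far."""
    resp = client.messages.create(
        model=MODEL, max_tokens=300, system=SYSTEM_PROMPT, messages=transcript,
    )
    return resp.content[0].text
```

In this shape, the reviewed plan seeds the first turn, each participant answer is appended to `transcript` as a user message, and finished transcripts flow on to anonymization, thematic clustering, and the final human review.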
Main findings (quantitative and qualitative summary)
Headline numbers: 86% of professionals said AI saves them time; 65% were satisfied with AI's role at work.
Creatives: 97% reported time savings; 68% said the quality of their work improved. At the same time, economic anxiety and social stigma emerged within their communities.
Scientists: 91% want more assistance from AI, but trust is low for core tasks like hypothesis generation and experimental design.
Augmentation vs automation: 65% described their primary use of AI as augmentation and 35% as automation. Earlier Claude usage data showed a more even split, suggesting a gap between self-report and actual practice.
Concrete examples found in the interviews:
General: an administrative assistant compared AI to the computer or the typewriter, a tool that lets you do more; another worker said they avoid disclosing how they work for fear of being judged by colleagues.
Creatives: a music producer uses prompts to get lists of word combinations that serve as seeds for lyrics; a photographer reduced editing time from 12 weeks to 3.
Scientists: they use AI to review literature and debug code, but don’t yet use it to make experimental decisions without human verification.
Technical and methodological analysis
Validation and verification: the most-cited technical limitation in the interviews is unreliability and the tendency to invent information; in technical terms, hallucinations remain the central problem.
Bias and sampling: recruitment via gig-work platforms generates selection bias. The results are early signals, not definitive population estimates.
Privacy and security: informed consent and anonymization were used; even so, researchers highlight concerns about classified environments and other sensitive data.
Important: the tool combines automation with human review. It is not a purely autonomous pipeline; humans remain the final layer of trust.
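As a small sketch of what that final layer can look like (the names and the 0.8 threshold are assumptions, not Anthropic's values): only high-confidence automated codings skip the reviewer.

```python
# Human-in-the-loop gate: auto-accept only what the automated analysis is
# confident about; route everything else to a human reviewer. Illustrative only.
from dataclasses import dataclass

@dataclass
class CodedTheme:
    name: str
    confidence: float  # e.g. produced by the automated clustering step

def route(theme: CodedTheme, threshold: float = 0.8) -> str:
    """Humans stay the final layer of trust for anything below the threshold."""
    return "auto-accept" if theme.confidence >= threshold else "human-review"

print(route(CodedTheme("time savings", 0.92)))      # -> auto-accept
print(route(CodedTheme("identity anxiety", 0.55)))  # -> human-review
```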
Limitations and identified risks
Selection and demand effects: participants knew the interview was about AI and was conducted by AI, which may have shaped their responses.
Self-report vs actual behavior: there were discrepancies between what people say and what usage metrics show.
Limited emotional analysis: text alone doesn’t capture tone or body language; nuances can be lost.
Limited global reach: the sample mostly reflects Western contexts; many cultural perspectives are missing.
Operational technical risks include handling and filtering sensitive data, prompt dependence, and the need to mitigate sycophancy (the model's tendency to tell the user what it thinks they want to hear).
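For the first of those risks, a deliberately naive redaction sketch; the regex patterns below are illustrative assumptions, and a production pipeline would rely on dedicated PII-detection tooling rather than three regexes.

```python
# Naive pre-storage redaction for interview transcripts (illustrative only).
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matches with typed placeholders so themes survive but identities don't."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +1 415 555 0100."))
# -> Reach me at [EMAIL] or [PHONE].
```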
What this means for professionals and product teams
For workers: wider adoption brings role reconfiguration. Many imagine moving from executing tasks to supervising and training systems.
For creatives: AI speeds workflows, but increases competitive pressure and raises questions about authorship and the market.
For scientists: there is strong interest in tools that propose hypotheses and synthesize data; the technical priority is improving reliability and automated verification.
For product teams: using automated interviews as continuous feedback lets you prioritize improvements (fewer hallucinations, better source traceability, clearer verification interfaces).
Collect regular interviews with Anthropic Interviewer and cross-reference them with objective usage telemetry.
Use automated thematic analysis to detect early trends and test product changes in controlled experiments.
Involve affected communities (creatives, scientists, educators) in participatory design and governance.
Technically, this means improving prompts, integrating better fact-checking models, and developing pipelines that produce outputs with source traceability and confidence levels.
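A minimal sketch of that output stage, assuming scikit-learn; the quotes are invented placeholders, and TF-IDF plus KMeans stands in for whatever embedding stack a real pipeline would use. The point is the shape of the output: every theme ships with its supporting sources and a confidence score.

```python
# Thematic clustering with traceable, confidence-scored output (illustrative).
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

quotes = [  # invented placeholder data, not real transcripts
    ("p01", "AI drafts my emails so I can focus on clients."),
    ("p02", "It saves me hours of boilerplate writing each week."),
    ("p03", "I worry colleagues will judge me for using it."),
    ("p04", "I keep quiet about my AI use at the office."),
]

ids, texts = zip(*quotes)
X = TfidfVectorizer(stop_words="english").fit_transform(texts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

themes: dict[int, list[dict]] = {}
for pid, text, label in zip(ids, texts, labels):
    themes.setdefault(int(label), []).append({"participant": pid, "quote": text})

for label, members in themes.items():
    # Crude confidence: share of the sample supporting the theme. A real
    # pipeline would use silhouette scores or model-based uncertainty instead.
    confidence = len(members) / len(quotes)
    sources = [m["participant"] for m in members]
    print(f"theme {label}: confidence={confidence:.2f}, sources={sources}")
```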
Future and open questions
How do we close the gap between self-perception and real usage data? Longitudinal studies combining interviews and telemetry are needed.
Can interviews be extended to multimodal formats to capture nonverbal cues? Audio and video would enrich the emotional reading.
How do we measure real economic impact in creative sectors if automation displaces concrete jobs?
Anthropic Interviewer shows you can ask questions at scale without losing the human voice. But the answers raise more questions: which tasks do you want to delegate to AI, and which do you prefer to keep for yourself?