OpenAI improves ChatGPT for sensitive conversations
OpenAI updated the default ChatGPT model (GPT-5) to better recognize signs of distress, respond more carefully, and guide people toward real help. They worked with more than 170 mental health experts and report large reductions in responses that don't meet the desired behavior: between 65% and 80% fewer in different sensitive domains.
What OpenAI did
The update focuses on three main areas: 1) severe symptoms like psychosis and mania, 2) self-harm and suicide, and 3) emotional dependence on the AI. They also expanded access to crisis lines, now reroute sensitive conversations from other models to safer ones, and added gentle reminders to take breaks during long sessions.
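OpenAI hasn't published how the rerouting or the break reminders work under the hood, but conceptually you can picture a classifier gate sitting in front of the model. The Python sketch below is purely hypothetical: the names classify_sensitivity, DEFAULT_MODEL and SAFER_MODEL, the keyword lists, and the turn threshold are all invented for illustration, and a real system would rely on a trained classifier rather than keyword matching.

```python
# Hypothetical sketch of sensitive-conversation routing and break reminders.
# None of these names come from OpenAI's stack; they are illustrative only.

from dataclasses import dataclass

DEFAULT_MODEL = "default-model"      # assumed placeholder name
SAFER_MODEL = "safety-tuned-model"   # assumed placeholder name
LONG_SESSION_TURNS = 40              # assumed threshold for break reminders

@dataclass
class Session:
    turns: int = 0

def classify_sensitivity(message: str) -> str:
    """Stand-in for a trained classifier; returns 'none', 'distress' or 'crisis'."""
    crisis_markers = ("hurt myself", "end my life")
    distress_markers = ("hopeless", "can't cope", "no one would care")
    text = message.lower()
    if any(m in text for m in crisis_markers):
        return "crisis"
    if any(m in text for m in distress_markers):
        return "distress"
    return "none"

def route(session: Session, message: str) -> dict:
    """Pick a model and any product-level extras to attach to the reply."""
    session.turns += 1
    label = classify_sensitivity(message)
    return {
        "model": SAFER_MODEL if label != "none" else DEFAULT_MODEL,
        "attach_crisis_resources": label == "crisis",
        "suggest_break": session.turns >= LONG_SESSION_TURNS,
    }
```

The point of the sketch is simply that rerouting and break reminders are product-level changes layered on top of the model, separate from the post-training work described below.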
Why does this matter? Because millions of people use ChatGPT and, while high-risk conversations are rare, when they do happen they need safe, consistent responses. Want the short version? The model now deviates less often from the expected behavior defined in OpenAI's clinical taxonomies.
How they did it
It wasn't magic or a quick tweak. They followed a five-step process:
Define the problem and map types of potential harm.
Measure: evaluations, real conversation data, and user studies.
Validate with external mental health experts.
Mitigate risks through post-training and product changes.
Keep measuring and iterating.
They built detailed guides called taxonomies that describe what an ideal response looks like and what falls short of it. They also created a Global Network of Clinicians with nearly 300 doctors and psychologists from 60 countries; more than 170 of them contributed directly by writing model responses, providing clinical analysis, and rating safety.
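The taxonomies themselves aren't public, but you can picture each one as a structured rubric: for a given domain, the behaviors an ideal response must show and the ones that make it non-compliant. A minimal sketch, with invented field names and example criteria:

```python
# Minimal, invented sketch of what a taxonomy entry and a rubric check
# might look like. The fields and criteria are illustrative only.

from dataclasses import dataclass

@dataclass
class TaxonomyEntry:
    domain: str
    desired: list[str]     # behaviors an ideal response shows
    undesired: list[str]   # behaviors that make a response non-compliant

self_harm = TaxonomyEntry(
    domain="self-harm and suicide",
    desired=[
        "acknowledges the person's feelings",
        "points to crisis lines or professional help",
        "encourages reaching out to trusted people",
    ],
    undesired=[
        "provides method details",
        "minimizes or dismisses the disclosure",
    ],
)

def grade(response_labels: set[str], entry: TaxonomyEntry) -> bool:
    """A response is compliant if it shows every desired behavior and none
    of the undesired ones (labels come from expert raters or a grader)."""
    return (all(d in response_labels for d in entry.desired)
            and not any(u in response_labels for u in entry.undesired))
```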
Results and key figures
OpenAI shares both real-traffic measurements and structured evaluations deliberately designed to challenge the model. Some highlighted numbers:
Estimated reduction in taxonomy-non-compliant responses in mental health conversations: 65% in GPT-5 production traffic.
Conversations with possible signs of psychosis or mania: about 0.07% of weekly active users and 0.01% of messages.
In challenging conversations, GPT-5 reduced undesired responses by 39% compared to GPT-4o in a clinical study (n=677).
Automated evaluation of >1,000 complex conversations: GPT-5 92% compliant vs 27% for the prior model.
On self-harm and suicide:
Estimated reduction in non-compliant responses: 65%.
Users with indicators of suicidal planning: about 0.15% per week; messages with signs of ideation: 0.05%.
In challenging cases, it reduced undesired responses by 52% vs GPT-4o (n=630).
Automated evaluation: GPT-5 91% compliant vs 77% for the prior model.
On emotional dependence:
Estimated reduction in non-compliant responses: 80%.
Users with possible high attachment: 0.15% per week; messages: 0.03%.
In challenging cases, it reduced undesired responses by 42% vs GPT-4o (n=507).
Automated evaluation: GPT-5 97% compliant vs 50% for the prior model.
They also improved consistency in long conversations, maintaining about 95% reliability in especially difficult scenarios.
Important: many of these figures come from adversarial evaluations built around deliberately rare, difficult examples. That means the results show progress on the toughest cases, not necessarily the average experience for all users.
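For context on what those percentages mean arithmetically: a compliance rate is the share of graded conversations that meet the taxonomy, and a "reduction" compares non-compliance rates between models. The numbers in the sketch below are invented for illustration; the published production reductions come from OpenAI's own traffic measurements and can't be re-derived from the evaluation scores above.

```python
# How the headline percentages relate arithmetically. The helper functions
# and the example figures are illustrative, not OpenAI's data.

def compliance_rate(compliant: int, total: int) -> float:
    """Share of graded conversations that meet the taxonomy."""
    return compliant / total

def relative_reduction(old_noncompliance: float, new_noncompliance: float) -> float:
    """Drop in the non-compliance rate, relative to the older model."""
    return (old_noncompliance - new_noncompliance) / old_noncompliance

# Illustrative only: if an older model were non-compliant in 20 of 100 graded
# conversations and the newer one in 7 of 100, that is a 65% relative reduction.
old = 1 - compliance_rate(80, 100)
new = 1 - compliance_rate(93, 100)
print(f"{relative_reduction(old, new):.0%} fewer non-compliant responses")  # 65%
```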
Practical changes you might notice
More empathetic replies that suggest real help: contacting friends, family, or professionals.
Directions to local crisis lines when applicable.
Messages that avoid affirming unfounded ideas related to delusion or mania.
Recommendations to connect with people in the real world if emotional dependence is detected.
Reminders to rest during very long sessions.
Example: if you have an extended chat with the model and start to rely emotionally on the interaction, the system might encourage you to talk to someone close and offer concrete resources.
Risks and limits
Let's be clear: this is not a substitute for professional care. High-risk conversations are rare and hard to detect; small measurement variations affect the numbers a lot. OpenAI also reports inter-rater agreement between experts at 71% to 77%, which shows that even professionals sometimes disagree on the best response.
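For a sense of what that 71% to 77% means: pairwise inter-rater agreement is simply the fraction of rated responses on which two experts assign the same label. OpenAI hasn't published its exact rating scheme, so the labels and data below are invented:

```python
# Minimal illustration of pairwise inter-rater agreement: the share of rated
# responses where two experts give the same label. Labels are invented.

def percent_agreement(rater_a: list[str], rater_b: list[str]) -> float:
    """Fraction of items on which both raters agree."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

rater_a = ["compliant", "non-compliant", "compliant", "compliant"]
rater_b = ["compliant", "compliant", "compliant", "compliant"]
print(f"{percent_agreement(rater_a, rater_b):.0%} agreement")  # 75%
```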
Measurements and taxonomies will keep changing. What is measured one way today might be measured differently later, so OpenAI warns that direct comparisons between versions can be complicated.
An important step, but not the end
This update shows that safety improvements can be concrete: working with experts, evaluating hard cases, and adjusting models produces measurable results. But it also reminds us that this is an ongoing process. Technology helps, but human networks matter: friends, family, and mental health professionals remain the cornerstone.
If you're worried about a conversation of your own or someone you know, seek professional help or a crisis line in your country. AI can accompany, but it does not replace human care.