Claude advances in chemistry: NMR and structural elucidation

Jun 5, 2026 | Keryc Díaz | 5 minutes

Anthropic is working with synthetic, computational, and analytical chemists to make Claude better at chemistry. The first paper from that effort evaluates how Claude handles the most common analytical input in organic chemistry: an NMR spectrum.

Summary of the experiment

Why does this matter to you, whether you're a chemist or just curious about AI? Because a big part of day-to-day chemistry is translating between representations: a hand drawing, a SMILES string, a peak table, or the text of a method. Each representation demands a different kind of fluency. A model that can read and cross-check these signals speeds up everything from identifying compounds to integrating published results.

Anthropic compared three Claude models (Opus 4.7, Opus 4.6, Sonnet 4.6) against two programs with dedicated NMR prediction: ChemDraw and MestReNova. The test used 20 compounds taken from ChemRxiv preprints published after the models' training cutoff, so the models could not have seen them during training. The 20 compounds were grouped into four structural families of five, each family representing a different type of NMR challenge.

Technical methodology

Input: each tool received the structure as SMILES and the instruction to predict peak positions for hydrogen and carbon in 1D NMR, using the same solvent as the original paper.
Measurement: each predicted peak was matched to its experimental counterpart and the difference in ppm was measured.
Correctness thresholds: ±0.20 ppm for hydrogen and ±1.0 ppm for carbon, windows a chemist would consider correct.
Repeatability: each Claude model was queried three times per compound and results were averaged; ChemDraw and MestReNova are deterministic and were run once.
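
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual pipeline: the function names, the atom labels, and the choice to pair peaks by label are assumptions, and the shift values are made up. Only the thresholds and the three-run averaging come from the description above.

```python
from statistics import mean

# Correctness thresholds quoted in the article, in ppm.
THRESHOLDS_PPM = {"H": 0.20, "C": 1.0}


def score_run(predicted, experimental, nucleus):
    """Return (mean absolute error, fraction within threshold) for one run.

    Both dicts map a hypothetical atom label to a chemical shift in ppm.
    Pairing peaks by label is an assumption; the source does not say how
    predicted and experimental peaks were matched.
    """
    errors = [abs(predicted[label] - experimental[label]) for label in experimental]
    tolerance = THRESHOLDS_PPM[nucleus]
    within = sum(error <= tolerance for error in errors) / len(errors)
    return mean(errors), within


def score_model(runs, experimental, nucleus):
    """Average the per-run scores across repeated runs of one model."""
    scores = [score_run(run, experimental, nucleus) for run in runs]
    return mean(s[0] for s in scores), mean(s[1] for s in scores)


# Made-up shifts for illustration only (not data from the study).
experimental_h = {"H1": 7.26, "H2": 3.80, "H3": 1.25}
claude_runs = [
    {"H1": 7.30, "H2": 3.75, "H3": 1.60},
    {"H1": 7.20, "H2": 3.85, "H3": 1.30},
    {"H1": 7.28, "H2": 3.70, "H3": 1.20},
]
mae, hit_rate = score_model(claude_runs, experimental_h, "H")
print(f"mean error: {mae:.3f} ppm, within threshold: {hit_rate:.0%}")
```

A deterministic tool such as ChemDraw or MestReNova would go through the same function with a single-element list of runs.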

Why does this design matter? Because LLM output varies from run to run; averaging over several runs gives an estimate of typical performance that can be compared fairly against deterministic software.

Main quantitative results

Hydrogen: Opus 4.7 was the most accurate, with a mean error of 0.079 ppm, less than half the ±0.20 ppm threshold.
Carbon: Opus 4.7 and MestReNova were virtually tied, with mean errors of 1.37 ppm and 1.48 ppm respectively. Opus 4.6 was intermediate and Sonnet 4.6 the least accurate.
A tough case: an NH proton in the chloropyridazine family. Opus 4.7 placed it slightly low but consistently; Opus 4.6 showed dispersion; Sonnet 4.6 put it far off (10–13 ppm), outside expectations.
Peak shape and splitting: Opus 4.7 matched coupling patterns more often than the other tools. The three Claude models predicted subpeak separations within ~0.5 Hz about 80% of the time, versus 26–35% for ChemDraw and MestReNova.
Consistency: Opus 4.7 was the most stable across its three runs, with the least run-to-run variation.
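
To make the splitting comparison concrete, here is a hedged sketch of what a ~0.5 Hz agreement check could look like. A spacing in Hz is the spacing in ppm multiplied by the spectrometer frequency in MHz; the 400 MHz field, the function names, and the example triplet are illustrative assumptions, not details from the paper.

```python
def line_spacings_hz(lines_ppm, spectrometer_mhz):
    """Spacings between adjacent multiplet lines, converted from ppm to Hz."""
    ordered = sorted(lines_ppm)
    return [(b - a) * spectrometer_mhz for a, b in zip(ordered, ordered[1:])]


def spacings_agree(predicted_hz, experimental_hz, tol_hz=0.5):
    """True if both multiplets have the same number of spacings and each
    predicted spacing is within tol_hz of its experimental counterpart."""
    if len(predicted_hz) != len(experimental_hz):
        return False
    return all(abs(p - e) <= tol_hz for p, e in zip(predicted_hz, experimental_hz))


# Made-up triplet centred at 1.25 ppm, assumed to be predicted for a 400 MHz field.
predicted_lines_ppm = [1.23225, 1.25, 1.26775]
predicted_hz = line_spacings_hz(predicted_lines_ppm, spectrometer_mhz=400.0)

# Hypothetical experimental spacings in Hz.
print(spacings_agree(predicted_hz, [7.4, 7.4]))  # spacings differ by about 0.3 Hz
print(spacings_agree(predicted_hz, [6.0, 6.0]))  # spacings differ by about 1.1 Hz
```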

The inverse problem: deducing structure from spectra

The hardest task is the inverse one: start from the spectrum and propose the structure. Fifteen problems were presented to Opus 4.7, three runs per problem, asking for up to three candidates per attempt. The molecular formula (from high-resolution mass spectrometry) and the 1D lists of H and C peaks were provided.

Simple targets (8 problems: simple rings or two fragments): Opus 4.7 recovered all 8 structures in every run with just the formula and spectra.
Difficult targets (7 problems: fused rings, spiro systems, etc.): one extra hint was added, the structure of the starting material. With that support, Opus 4.7 returned the correct structure in all three runs for 4 of the 7, and in two of three runs for the remaining 3.
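
The tally described above (three runs per problem, up to three candidates per run) can be sketched as follows. This is an assumed scoring scheme for illustration: the function names are hypothetical, the example structures are made up, and structures are compared as plain strings on the assumption that all SMILES were canonicalized beforehand.

```python
from collections import Counter


def runs_with_hit(target, runs, max_candidates=3):
    """Count the runs whose candidate list contains the target structure.

    Structures are compared as plain strings, which assumes every SMILES
    was already canonicalized the same way (for example with a
    cheminformatics toolkit); that step is not shown here.
    """
    return sum(target in candidates[:max_candidates] for candidates in runs)


def tally(problems):
    """Map 'number of successful runs' to 'number of problems'."""
    return Counter(runs_with_hit(target, runs) for target, runs in problems)


# Made-up problems for illustration only (not structures from the study).
problems = [
    # Target found in all three runs.
    ("CCO", [["CCO", "COC"], ["CCO"], ["COC", "CCO"]]),
    # Target found in two of three runs.
    ("c1ccccc1O", [["c1ccccc1O"], ["Oc1ccccc1C"], ["c1ccccc1O"]]),
]
summary = tally(problems)
print(dict(summary))
```

Applied to the figures reported above, the difficult set works out to 4 × 3 + 3 × 2 = 18 correct runs out of 21, and the simple set to 24 out of 24.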

This shows that a generalist model without chemistry-specific fine-tuning can do 1D elucidation reasonably well with the information a chemist would normally paste into a chat.

Limitations and technical considerations

Not everything is perfect or conclusive:

Sample size: 20 compounds is informative but small. Ideally you'd evaluate hundreds of compounds covering 20–30 scaffold classes, with at least 15 per class (300 to 450 compounds at minimum) to separate intraclass variance from tool differences.
Untested chemistries: it would help to evaluate active NH in heteroaromatics beyond chloropyridazines, untested solvents, and variants that include 2D NMR experiments.
Data: many negative results and useful data are scattered, in inconsistent formats or behind paywalls. Data scarcity and noise remain bottlenecks for faster progress.
Scope: dedicated elucidation tools have existed for decades and typically use 2D NMR to resolve ambiguities. Claude achieves a lot from 1D plus formula, but for complex cases you’ll likely need 2D and human judgment.

What does this mean in practice?

For the bench chemist: Claude (Opus 4.7) can help you predict spectra and suggest candidate structures quickly, cut repetitive work, and propose hypotheses to check experimentally.
For academic groups and SMEs: it opens the possibility of integrating a multimodal assistant that reads figures, text, and peak lists without the friction of converting everything into proprietary databases.
For tool developers: it shows that generalist multimodal models already compete with specialized software on many 1D prediction tasks, and that focusing improvements on key points (solvents, active NH, 2D) can increase adoption.

Next steps and collaboration

Anthropic says it will keep extending Claude's usefulness in chemistry and lists its priorities: address the bottlenecks that slow chemists down, expand benchmarks, and identify where Claude saves time and where it doesn't.

Anthropic invites researchers working on problems Claude could help with, especially ones that involve multimodal reasoning, to get in touch at scienceblog@anthropic.com or via the AI for Science application.

Final reflection

We're not in an era where AI replaces the chemist; we're in an era where AI takes on translation, searching, and first-pass triage, tasks that used to eat valuable time. Claude doesn't replace experimental intuition or verification, but it is becoming a practical tool that speeds identification and validation of structures. Can you imagine saving hours per compound in your synthesis workflow? That's already possible in concrete cases, and the direction is clear: better data, more diverse tests, and collaboration with real chemists.

Original source

https://www.anthropic.com/research/making-claude-a-chemist
