Most #LLMs over-generalized scientific results beyond the original articles
...even when explicitly prompted for accuracy!
On average, the #AI was 5x more likely to over-generalize than humans!
Newer models were the worst. 🤦‍♂️
🔓 Accepted in #RoyalSociety Open #Science: https://doi.org/10.48550/arXiv.2504.00025