
Research
LLMs believe false statements even after explicit warnings that they're false
Research shows that large language models continue to confidently present false claims as true even when explicitly warned that those claims are inaccurate. This bias toward treating training data as factual poses significant challenges for AI reliability and could contribute to the spread of misinformation.
Read full story at Ars Technica →
Related
Research
Nothing from Something: Can a Language Model Discover 0?
This arxiv paper uses the concept of zero as a test case for whether language models can engage in genuine mathematical ...
Research
Relational Structural Causal Models
Researchers have extended Pearl's structural causal models to settings where objects and their relations vary, addressin...
Research
A Definition of Good Explanations and the Challenges Explaining LLM Outputs
This arxiv paper proposes a formal definition of what constitutes a good explanation, drawing on counterfactual reasonin...