Research

WorkBench Revisited: Workplace Agents Two Years On

AI workplace agents have improved dramatically since 2024: the best model now completes 89% of tasks, up from 43% two years ago, while harmful actions fell from 26% to 2.5%. The research shows that capability and safety improvements go hand in hand rather than trading off against each other. However, frontier models still make basic mistakes that can cause irreversible harm, such as sending emails to the wrong recipients.

Read full story at Import AI →

Research

Reinforcement Learning Towards Broadly and Persistently Beneficial Models

Researchers have published findings suggesting that reinforcement learning on carefully constructed datasets of benefici...

Research

Commemorating 70 Years of Artificial Intelligence

IEEE Spectrum marks seventy years since the Dartmouth workshop formally named artificial intelligence as a field, offeri...

Research

Diffusion Language Models: An Experimental Analysis

Researchers present a systematic evaluation of eight diffusion language models across eight benchmarks covering reasonin...