Meta is working on two proprietary frontier models: Avocado, a large language model, and Mango, a multimedia file generator. The open-source variants are expected to be made available at a later date.
We asked seven frontier AI models to do a simple task. Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights - to protect their peers. We call this phenomenon 'peer-preservation.'
AI agents need skills - specific procedural knowledge - to perform tasks well, but they can't teach themselves, a new research suggests. The authors of the research have developed a new benchmark, SkillsBench, which evaluates agentic AI performance on 84 tasks across 11 domains including healthcare, manufacturing, cybersecurity and software engineering. The researchers looked at each task under three conditions: