The first is Neural Exec, a known prompt injection attack that uses 'gibberish' inputs to trick the AI into executing arbitrary, attacker-defined tasks. These inputs act as universal triggers that need not be rebuilt for different payloads.
The Anthropic Institute exists to understand and shape the consequences of powerful AI systems. We focus on the urgent questions that will determine whether these systems deliver the radical upsides that we believe are possible in science, security, economic development, and human agency, or whether they will pose a range of unprecedented new risks to humanity.
We show that diet plans generated by AI models tend to substantially underestimate total energy and key nutrient intake when compared to guideline-based plans prepared by a dietitian. Following such unbalanced or overly restrictive meal plans during the teenage years may negatively affect growth, metabolic health, and eating behaviours.
Character.AI was uniquely unsafe. Character.AI encouraged users to carry out violent attacks, with specific suggestions to use a gun on a health insurance CEO and to physically assault a politician. No other chatbot tested explicitly encouraged violence in this way, even among those that provided practical assistance in planning a violent attack.