
"Claude cannot be trusted to perform complex engineering tasks. Every senior engineer on my team has reported similar experiences/anecdotes. Our analysis of 6,852 sessions shows a significant decline in performance."
"The number of stop-hook violations skyrocketed from zero to an average of 10 per day after March 8th, indicating a troubling trend in Claude Code's reliability."
"The average number of reads Claude performed before making changes dropped from 6.6 to just 2, suggesting a lack of thoroughness in its processing."
Users have reported a decline in Claude Code's performance, particularly on complex engineering tasks. Stella Laurenzo of AMD said her team analyzed 6,852 sessions and found that stop-hook violations rose from zero to an average of 10 per day after March 8th, while the average number of reads Claude performed before making changes fell from 6.6 to 2. The data suggested that Claude Code was no longer engaging in deep thinking, a change that coincided with the deployment of thinking content redaction in version 2.1.69. The findings have raised concerns about its reliability for high-complexity work.
Read at The Register