Typically, when people complain that AI models are getting worse, I think it's just a case of them misremembering how bad they used to be. However, the recent criticism of Anthropic's Claude models getting worse actually has some teeth to it. I want to talk about why I find myself using 4.6 more than Opus 4.7 these days, and whether the models are actually getting worse. So about a week ago, Anthropic came out and admitted that they found a few bugs and a few default settings that were in fact causing the models to sometimes perform worse than they had in the past. That explains some of the sentiment. There's also the unfortunate fact that Anthropic is currently compute constrained. They don't have quite as much compute as OpenAI or Google, and they're struggling to keep up with demand. This has led some to speculate that maybe Anthropic is being a little conservative with their token u
Related Passages
browsing or finding things on the internet, although that's one of the few areas where both Gemini and ChatGPT are still ahead. In my experience, 4.6 just has better judgment or intuition. 4.7 is known to take everything literally. If you give it instructions, it will follow them to a T. And at first I thought this would end up being a good thing. As long as we could clearly articulate what we wanted to do, it would follow that more closely. And it does, too closely. Ultimately, it ends up obsessing over any...
Public Insight Cards
For nebulous strategy or content work, a model that infers intent and handles exceptions may outperform a more literal model that follows rules too rigidly.
Model selection / content strategy · asserts