The most frustrating thing for vibe coders is that
every few days another model emerges that tops the leaderboards, boasting world-leading metrics.
You can't possibly try every single one, so you rely on social media bloggers, only to find that most of what they post is just hype.
Then, after finally settling on a suitable model and product, you're ready to get down to business. But within days, the model seems to have lost its intelligence.
It leaves me truly speechless. Recently, I noticed that Gemini 3 Pro and Claude 4.5 have both become less intelligent.
Yesterday, I used Gemini 3 in Antigravity to work on a task, and it kept running into problems. So I tested a few commonly used models, and the results were:
1. Gemini 3 has indeed lost its intelligence.
2. I tested Claude 4.5 in Claude Code, and it has also become noticeably less intelligent.
3. I used GLM 4.7 in Claude Code; it was released recently and claims to be the world's best. It found a few more issues than Claude 4.5 did, but it still missed about 1/3 of them.
4. In Cursor, GPT 5.2 performed the best, missing only one minor issue and finding all the others.
Based on this, GPT 5.2 is the most reliable. The only downside is that the code GPT 5.2 writes is too terse; much of it is hard to follow, which makes it a headache to read. 😀