Testing Kimi k1.5: the reasoning model nobody's talking about
Moonshot AI's Kimi k1.5 quietly dropped and it's genuinely impressive on long reasoning tasks.
Moonshot AI released Kimi k1.5 and it flew under the radar. It shouldn’t have. On multi-step reasoning tasks it competes with models that have twice its parameter count.
What we tested
We gave it a few things that trip up most models: multi-hop logic puzzles, long-context code analysis, and a tricky maths proof that Claude and GPT-4 both fumble occasionally.
Kimi handled the logic puzzles cleanly. The code analysis was solid on inputs of up to about 100k tokens. It got 80% of the maths proof right, which is better than most open-source alternatives.
The catch
It’s slower than you’d want. Inference times are noticeably longer than Claude or GPT-4, especially on longer prompts. The API documentation is also rough if you don’t read Mandarin.
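If you want to check the speed gap yourself, something like the following would do. It's a rough sketch, not a record of our test setup: `call_model` is a hypothetical wrapper around whichever API client you use, and `fake_model` is only a stand-in so the script runs on its own.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_model(
    call_model: Callable[[str], str],
    prompts: Dict[str, str],
    runs: int = 3,
) -> Dict[str, float]:
    """Return the median wall-clock latency in seconds for each labelled prompt."""
    results: Dict[str, float] = {}
    for label, prompt in prompts.items():
        samples: List[float] = []
        for _ in range(runs):
            start = time.perf_counter()
            call_model(prompt)  # response is discarded; only the timing matters here
            samples.append(time.perf_counter() - start)
        results[label] = statistics.median(samples)
    return results


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real client; sleeps in proportion to prompt length."""
    time.sleep(len(prompt) / 100_000)
    return "ok"


if __name__ == "__main__":
    # Swap fake_model for a function that calls the model you want to measure.
    prompts = {"short": "x" * 100, "long": "x" * 10_000}
    for label, seconds in time_model(fake_model, prompts).items():
        print(f"{label}: {seconds:.3f}s")
```

Run the same prompts through each model and compare the medians. Taking the median over a few runs keeps one slow network round trip from skewing the picture.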
Worth watching
This is a team that’s iterating fast. k1.5 is a big step up from their earlier models. If they sort out the speed issue, this becomes a serious contender for reasoning-heavy workloads.