This model is not much better than qwen3 32b for writing code

#4
by xldistance - opened

I suspect the benchmark scores come from gaming the leaderboard

GGUF quantized models fail some tasks that Qwen3 can complete in my case.

I can't agree, based on my experience getting it to write 3 small physics animations:

2-shot this one:
Write a Python program that shows a ball bouncing inside a spinning hexagon (use pygame). The ball should be realistically affected by gravity and friction, and wall bounces.

First output: everything working apart from the buttons
Back and forth: everything now working

3-shot this second attempt:
Write a Python program that shows a ball bouncing inside a spinning hexagon (use pygame). The ball should be realistically affected by gravity and friction, and wall bounces.

And finally, 3-shot this one, I was quite shocked by how appealing the interface was!
Write a single-file HTML/JS implementation of Conway's Game of Life that runs in the browser and visualizes the grid on an HTML5 canvas at 60 fps.

First output: everything showing up, but I couldn't start the simulation because the buttons weren't functioning
Back and forth (copy-pasted the console content): everything working


I didn't even have to tweak the various coefficient values: the defaults it provided were in the right range, and even realistic and well suited to the scene!
All the simulations were already set to run at 60 fps, even the hexagon ones, for which I hadn't specified it. I noticed afterward that I had recorded at 30 fps... And again, same as above: the step values etc. were perfectly consistent, and I didn't need to tweak anything myself.
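
For anyone wondering which coefficients and step values I mean, here is a rough sketch of that kind of update loop. This is not the model's output: the constants, the `reflect` and `step` helpers, and the flat floor standing in for a hexagon wall are all my own illustrative assumptions, and a spinning wall would also need the wall's own velocity taken into account.

```python
# Illustrative values only; none of these come from the model's output.
GRAVITY = 980.0       # px/s^2, downward acceleration (y grows downward)
FRICTION = 0.999      # per-step velocity damping, a crude stand-in for drag
RESTITUTION = 0.85    # fraction of normal speed kept after a wall bounce
DT = 1.0 / 60.0       # fixed timestep matching a 60 fps loop


def reflect(vx, vy, nx, ny, restitution=RESTITUTION):
    """Bounce velocity (vx, vy) off a static wall with unit inward normal (nx, ny)."""
    dot = vx * nx + vy * ny
    if dot >= 0:
        # Already moving away from the wall: leave the velocity alone.
        return vx, vy
    # Cancel the normal component and add it back reversed and scaled.
    return (vx - (1 + restitution) * dot * nx,
            vy - (1 + restitution) * dot * ny)


def step(x, y, vx, vy, floor_y, radius, dt=DT):
    """Advance the ball by one frame above a flat floor at height floor_y."""
    vy += GRAVITY * dt            # gravity
    vx *= FRICTION                # damping
    vy *= FRICTION
    x += vx * dt                  # integrate position
    y += vy * dt
    if y + radius > floor_y:      # ball sank into the floor
        y = floor_y - radius      # push it back to the surface
        vx, vy = reflect(vx, vy, 0.0, -1.0)
    return x, y, vx, vy
```

With numbers in this range the ball loses a bit of height on every bounce and settles after a few seconds, which is the sort of behaviour I got without touching the defaults.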

This was entirely vibe coded.
