Sthenno
sthenno
AI & ML interests
To contact me: [email protected]
Recent Activity
liked a model 3 days ago: shuttleai/shuttle-3.5
reacted to sometimesanotion's post with 👀 5 days ago
The capabilities of the new Qwen 3 models are fascinating, and I am watching that space!
My experience, however, is that context management is vastly more important with them. If you use a client with a typical session log with rolling compression, a Qwen 3 model will start to generate the same messages over and over. I don't think that detracts from them. They're optimized for a more advanced MCP environment. I honestly think the 8B is optimal for home use, given proper RAG/CAG.
In typical session chats, Lamarck and Chocolatine are still my daily drivers. I worked hard to give Lamarck v0.7 a sprinkling of CoT from both DRT and DeepSeek R1. Although those models have been surpassed on the leaderboards, in practice I still really enjoy their output.
My projects focus on application and context management, because that's where the payoff in improved quality is right now. But should there be a need for just the right mix of finetunes, my recipes are standing by.
reacted to sometimesanotion's post with ❤️ 5 days ago
Organizations
Collections 2
models 35

sthenno/tempesthenno-sft-0314-stage1-ckpt50 • Updated • 26 • 3
sthenno/tempesthenno-ms-0314-001 • Text Generation • Updated • 19 • 2
sthenno/tempesthenno-sft-0314 • Text Generation • Updated • 6 • 2
sthenno/tempesthenno-fusion-0309 • Text Generation • Updated • 11 • 2
sthenno/tempesthenno-sft-0309-ckpt10 • Updated • 4 • 2
sthenno/tempesthenno-ms-0309-001 • Text Generation • Updated • 10 • 4
sthenno/tempesthenno-ppo-ckpt40 • Updated • 2 • 4
sthenno/tempesthenno-kto-0205-ckpt80 • Updated • 3 • 3
sthenno/tempesthenno-icy-0130 • Text Generation • Updated • 74 • 8
sthenno/tempesthenno-nuslerp-0124 • Text Generation • Updated • 12 • 4
datasets 0
None public yet