“The doom lies in yourself, not in your name.”
Continuation of Wur doomed!.
For longer text chunks or stories, https://pastebin.com works great and helps prevent the thread from slowing down!
🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧
🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛🟧
🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧🟧
⬜🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧⬛🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜⬜🟧🟧⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜⬜🟧🟧🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛⬛🟧⬜
⬜🟧⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛🟧⬜
⬜🟧⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛🟧⬜
⬜🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧⬜
The doom is still buried within Command-A for sure.
A `step 601` preview - all with `temperature = 0`:
- It's still messing up some line endings, but I can live with that if it works... Likely can be fixed later using the new `class 0` random data if it turns out to be a problem.
- The Grimdark story was noticeably (much!) better compared to the inverse.
- The Battlestar Galactica story showed that even though `Q8_0`, `F16` and `BF16` all diverge slightly from `F32`, it's not clear this makes them any worse (I actually liked the `Q8_0` story best!).
| Size | Name |
|---|---|
| 287M | command-a-03-2025-lora-Q8_0.gguf |
| 541M | command-a-03-2025-lora-F16.gguf |
| 541M | command-a-03-2025-lora-BF16.gguf |
| 1.1G | command-a-03-2025-lora-F32.gguf |
It still has a way to go before it starts to converge, but I would think by `step 1000` it will be pretty close:
566 responses in previous thread! In the future we may be the reason for hf staff to implement multi-page view of discussions.
This was posted on Hacker News today:
Absolutely fascinating!
That was really cool. Thanks for sharing!
Yeah, and `llama-3.1:405b` doing so well was quite a surprise too (and makes you a bit sad that everything seems to be moving away from large dense models).
I think I've managed to train the dark_tetrad control-vectors for GLM-4.6, and its reasoning remains coherent with the darker perspective, but I'll need to do some more testing with both `enable_thinking: true` and `enable_thinking: false` first.
This is the first time I've seen a model "reasoning" with these world views. I wonder if it'll score differently on https://trackingai.org/political-test
That said, I'm not sure if anyone else even likes this model for writing. I've been running it daily for over a week so far.
I think I'm blocked making repos public now due to the size restrictions so I'll have to go through and nuke my models before I can upload them.
p-e-w's "arrows" app/interface:
100% that looks better. I just wanted to get a quick UI to try it out without all the edits in mikupad lol.
With more testing, I found that some of the models don't handle it reliably. It shortens how many paragraphs they'll write.
Yeah, I found this - it seems to encourage fewer but (much) longer paragraphs.
Still, this has already saved me time with getting different / unique answers to questions.
The blog page is lagging badly on my phone, but there are some interesting sections near the end on synthetic data generation:
https://simonucl.notion.site/verbalized-sampling
Yeah, this idea is definitely onto something:
...
> limited to what I can fit in 96GB using BitsAndBytes 4bit and not easy to do for the newer/larger models... ☹️

If you can get it working, that would still be worthwhile though. You'd be able to do up to GLM-4.6-Air when it releases.
I think I need to see if I can refactor the existing control vector code from `llama.cpp` to just dump the hidden states, and then we can do pretty much whatever we want with them outside in PyTorch, etc.
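Just to illustrate the kind of post-processing that would open up (not the actual control-vector method, and the file names, layer count and simple difference-of-means direction below are all assumptions):

```python
# Sketch: build a per-layer steering direction from dumped hidden states.
# Assumes two .npy files per layer (positive / negative prompt classes),
# each of shape (num_samples, hidden_dim) -- the paths are made up.
import numpy as np
import torch

NUM_LAYERS = 64          # assumption: depends on the model
directions = {}

for layer in range(NUM_LAYERS):
    pos = torch.from_numpy(np.load(f"hidden_states/pos_layer{layer}.npy")).float()
    neg = torch.from_numpy(np.load(f"hidden_states/neg_layer{layer}.npy")).float()

    # Simplest possible "control vector": difference of the class means,
    # normalised to unit length (a real pipeline would do something smarter).
    direction = pos.mean(dim=0) - neg.mean(dim=0)
    directions[layer] = direction / direction.norm()

torch.save(directions, "directions.pt")
```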
> I think I've managed to train the dark_tetrad control-vectors for GLM-4.6, and its reasoning remains coherent with the darker perspective, but I'll need to do some more testing with both `enable_thinking: true` and `enable_thinking: false` first.
>
> This is the first time I've seen a model "reasoning" with these world views. I wonder if it'll score differently on https://trackingai.org/political-test
Yeah, I think some kind of workflow that breaks out as soon as the reasoning ends and then runs with different parameters for the post-reasoning response would be useful.
There was some discussion about using different parameters for the response on `llama.cpp`, e.g. use the recommended sampler settings for the reasoning but then `temperature = 0` for the response, etc.
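Something like that is easy to mock up against the server's `/completion` endpoint; this is only a rough two-pass sketch (the host, stop string, prompt and sampler values are assumptions, not anything llama.cpp does for you):

```python
# Sketch: sample the reasoning with "normal" settings, then re-submit the
# prompt + reasoning and greedily decode the visible response.
import requests

SERVER = "http://localhost:8080"   # assumption: llama.cpp server running here
prompt = "..."                     # full chat-formatted prompt ending at the opening think tag

# Pass 1: reasoning with the recommended sampler settings, stop at </think>.
think = requests.post(f"{SERVER}/completion", json={
    "prompt": prompt,
    "temperature": 1.0,
    "top_p": 0.95,
    "stop": ["</think>"],
    "n_predict": 4096,
}).json()["content"]

# Pass 2: prefill the reasoning back in and decode the answer at temperature 0.
answer = requests.post(f"{SERVER}/completion", json={
    "prompt": prompt + think + "</think>",
    "temperature": 0.0,
    "n_predict": 2048,
}).json()["content"]

print(answer)
```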
> That said, I'm not sure if anyone else even likes this model for writing. I've been running it daily for over a week so far.
>
> I think I'm blocked making repos public now due to the size restrictions so I'll have to go through and nuke my models before I can upload them.
Yeah, this sucks, and from Discord it sounds like unless you pay the $10/month for "pro" they just ignore your requests (luckily I still have loads of space after deleting all my crap a while back).
> You are a helpful assistant. For each query, please generate a set of five possible responses, each within a separate `<response>` tag. Responses should each include a `<text>` and a numeric `<probability>`.
This kinda sounds like a think block template.
And something that can be enforced with GBNF + prefill.
I wonder if it would be better to rig a thinking model, especially one like GLM, to do that inside its thinking block, then “synthesize” a final answer by drawing from the creativity of its previous ones.
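For what it's worth, here's a rough sketch of the GBNF + prefill idea against a llama.cpp server; the grammar shape, tag names, chat template and numbers are just placeholders for whatever the real template ends up being:

```python
# Sketch: force a five-response "verbalized sampling" shape with a GBNF
# grammar, with the assistant turn prefilled so generation starts inside it.
import requests

# Hypothetical grammar: exactly five <response> blocks, each with a <text>
# and a two-digit <probability>.
GRAMMAR = r'''root ::= response response response response response
response ::= "<response><text>" text "</text><probability>0." [0-9] [0-9] "</probability></response>\n"
text ::= [^<]+
'''

# Made-up chat template purely for illustration.
prompt = "<|user|>Write the opening line of a grimdark story.<|assistant|>"

out = requests.post("http://localhost:8080/completion", json={
    "prompt": prompt,          # the prefill is just whatever the prompt ends with
    "grammar": GRAMMAR,
    "temperature": 1.0,
    "n_predict": 1024,
}).json()["content"]

print(out)
```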
Also I get the HF limits. There are way too many titleless, cardless uploads clogging up the site to the point they even clog up search.
@jukofyork What training data are you using for the command-a-writer? You mentioned it has paragraphs. Have you published the dataset anywhere?
I think comparing the training dataset to the final model will help us learn a lot (and save me from mistakes and wasted training cycles myself since I am planning to do something similar at a tiny scale)
I can't release the dataset of actual books I've used for fear of copyright claims, but I have uploaded a version using books from Project Gutenberg:
https://huggingface.co/datasets/jukofyork/gutenberg-fiction-paragraphs
and the "slop" dataset:
https://huggingface.co/datasets/jukofyork/slop-fiction-paragraphs
I could have used the Gutenberg dataset for my model, but wanted to avoid as much "ye olde" type writing bias as possible for now.
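In case anyone wants to poke at them, they should load straight from the Hub with the `datasets` library (the split name below is a guess, so check the dataset viewer first):

```python
# Sketch: pull the Gutenberg paragraphs dataset and peek at a few rows.
from datasets import load_dataset

ds = load_dataset("jukofyork/gutenberg-fiction-paragraphs", split="train")

print(ds)              # shows the actual column names and row count
for row in ds.select(range(3)):
    print(row)         # whatever fields the dataset actually exposes
```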
Thank you. I'm actually more curious about how you are piecing together the training dataset than about the story content itself. For example, are you fine-tuning it using a chat template, and if so, did you have to create user instructions for each paragraph? It would be nice to know which process you use for making the instructions, and maybe to see that dataset too. I'm currently trying out having an LLM write prompts for stories, but as you can imagine, it often focuses on the wrong things. Or are you doing continued pre-training on the model, where you just train for completion on the book texts and rely on the model's existing instruction-following capabilities?
Or, am I wrong both ways and this is simply trying to control for slop in the output?
I am especially interested in learning more about how to create more complex and diverse instruction datasets with creative outputs as the main goal. Reading the HelpSteer2 NVIDIA paper (https://arxiv.org/pdf/2406.08673) was quite inspiring: only 10,000 high-quality response pairs in the dataset, and they got a top reward model out of it. It gives me hope that we can fine-tune a dumb but not overfitted base model on creative writing outputs and get a decent result.