“The doom lies in yourself, not in your name.”
Continuation of Wur doomed!.
For longer text chunks or stories, https://pastebin.com works great and helps prevent the thread from slowing down!
🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧
🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛🟧
🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧🟧
⬜🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧⬛🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜⬜🟧🟧⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜⬜🟧🟧🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛⬛🟧⬜
⬜🟧⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛🟧⬜
⬜🟧⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛🟧⬜
⬜🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧⬜
The doom is still buried within Command-A for sure.
A `step 601` preview - all with `temperature = 0`:
- It's still messing up some line endings, but I can live with that if it works... Likely can be fixed later using the new `class 0` random data if it turns out to be a problem.
- The Grimdark story was noticeably (much!) better compared to the inverse.
- The Battlestar Galactica story showed that even though `Q8_0`, `F16` and `BF16` all diverge slightly from `F32`, it's not clear this makes them any worse (I actually liked the `Q8_0` story best!).
| Size | Name |
|---|---|
| 287M | command-a-03-2025-lora-Q8_0.gguf |
| 541M | command-a-03-2025-lora-F16.gguf |
| 541M | command-a-03-2025-lora-BF16.gguf |
| 1.1G | command-a-03-2025-lora-F32.gguf |
It still has a way to go before it starts to converge, but I would think by `step 1000` it will be pretty close:
566 responses in previous thread! In the future we may be the reason for hf staff to implement multi-page view of discussions.
This was posted on Hacker News today:
Absolutely fascinating!
That was really cool. Thanks for sharing!
Yeah, and `llama-3.1:405b` doing so well was quite a surprise too (and makes you a bit sad that everything seems to be moving away from large dense models).
I think I've managed to train the dark_tetrad control-vectors for GLM-4.6, and its reasoning remains coherent with the darker perspective, but I'll need to do some more testing with both `enable_thinking: true` and `enable_thinking: false` first.
This is the first time I've seen a model "reasoning" with these world views. I wonder if it'll score differently on https://trackingai.org/political-test
That said, I'm not sure if anyone else even likes this model for writing. I've been running it daily for over a week so far.
I think I'm blocked making repos public now due to the size restrictions so I'll have to go through and nuke my models before I can upload them.
p-e-w's "arrows" app/interface:
100% that looks better. I just wanted to get a quick UI to try it out without all the edits in mikupad lol.
With more testing, I found that some of the models don't handle it reliably. It shortens how many paragraphs they'll write.
Yeah, I found this - it seems to encourage fewer but (much) longer paragraphs.
Still, this has already saved me time with getting different / unique answers to questions.
The blog page is lagging badly on my phone, but there are some interesting sections near the end on synthetic data generation:
https://simonucl.notion.site/verbalized-sampling
Yeah, this idea is definitely onto something:
...
> limited to what I can fit in 96GB using BitsAndBytes 4bit and not easy to do for the newer/larger models... ☹️

If you can get it working, that would still be worthwhile though. You'd be able to do up to GLM-4.6-Air when it releases.
I think I need to see if I can refactor the existing control vector code from `llama.cpp` to just dump the hidden states, and then we can do pretty much whatever we want with them outside in PyTorch, etc.
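Just to illustrate the kind of post-processing that would open up (not the actual control-vector method, and the file names, layer count and simple difference-of-means direction below are all assumptions):

```python
# Sketch: build a per-layer steering direction from dumped hidden states.
# Assumes two .npy files per layer (positive / negative prompt classes),
# each of shape (num_samples, hidden_dim) -- the paths are made up.
import numpy as np
import torch

NUM_LAYERS = 64          # assumption: depends on the model
directions = {}

for layer in range(NUM_LAYERS):
    pos = torch.from_numpy(np.load(f"hidden_states/pos_layer{layer}.npy")).float()
    neg = torch.from_numpy(np.load(f"hidden_states/neg_layer{layer}.npy")).float()

    # Simplest possible "control vector": difference of the class means,
    # normalised to unit length (a real pipeline would do something smarter).
    direction = pos.mean(dim=0) - neg.mean(dim=0)
    directions[layer] = direction / direction.norm()

torch.save(directions, "directions.pt")
```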
> I think I've managed to train the dark_tetrad control-vectors for GLM-4.6, and its reasoning remains coherent with the darker perspective, but I'll need to do some more testing with both `enable_thinking: true` and `enable_thinking: false` first.
>
> This is the first time I've seen a model "reasoning" with these world views. I wonder if it'll score differently on https://trackingai.org/political-test
Yeah, I think some kind of workflow that breaks out as soon as the reasoning ends and then runs with different parameters for the post-reasoning response would be useful.
There was some discussion about using different parameters for the response on `llama.cpp`, e.g. use the recommended sampler settings for the reasoning but then `temperature = 0` for the response, etc.
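Something like that is easy to mock up against the server's `/completion` endpoint; this is only a rough two-pass sketch (the host, stop string, prompt and sampler values are assumptions, not anything llama.cpp does for you):

```python
# Sketch: sample the reasoning with "normal" settings, then re-submit the
# prompt + reasoning and greedily decode the visible response.
import requests

SERVER = "http://localhost:8080"   # assumption: llama.cpp server running here
prompt = "..."                     # full chat-formatted prompt ending at the opening think tag

# Pass 1: reasoning with the recommended sampler settings, stop at </think>.
think = requests.post(f"{SERVER}/completion", json={
    "prompt": prompt,
    "temperature": 1.0,
    "top_p": 0.95,
    "stop": ["</think>"],
    "n_predict": 4096,
}).json()["content"]

# Pass 2: prefill the reasoning back in and decode the answer at temperature 0.
answer = requests.post(f"{SERVER}/completion", json={
    "prompt": prompt + think + "</think>",
    "temperature": 0.0,
    "n_predict": 2048,
}).json()["content"]

print(answer)
```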
> That said, I'm not sure if anyone else even likes this model for writing. I've been running it daily for over a week so far.
>
> I think I'm blocked making repos public now due to the size restrictions so I'll have to go through and nuke my models before I can upload them.
Yeah, this sucks, and from Discord it sounds like unless you pay the $10/month for "pro" they just ignore your requests (luckily I still have loads of space after deleting all my crap a while back).
> You are a helpful assistant. For each query, please generate a set of five possible responses, each within a separate `<response>` tag. Responses should each include a `<text>` and a numeric `<probability>`.
This kinda sounds like a think block template.
And something that can be enforced with GBNF + prefill.
I wonder if it would be better to rig a thinking model, especially one like GLM, to do that inside its thinking block, then “synthesize” a final answer by drawing from the creativity of its previous ones.
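For what it's worth, here's a rough sketch of the GBNF + prefill idea against a llama.cpp server; the grammar shape, tag names, chat template and numbers are just placeholders for whatever the real template ends up being:

```python
# Sketch: force a five-response "verbalized sampling" shape with a GBNF
# grammar, with the assistant turn prefilled so generation starts inside it.
import requests

# Hypothetical grammar: exactly five <response> blocks, each with a <text>
# and a two-digit <probability>.
GRAMMAR = r'''root ::= response response response response response
response ::= "<response><text>" text "</text><probability>0." [0-9] [0-9] "</probability></response>\n"
text ::= [^<]+
'''

# Made-up chat template purely for illustration.
prompt = "<|user|>Write the opening line of a grimdark story.<|assistant|>"

out = requests.post("http://localhost:8080/completion", json={
    "prompt": prompt,          # the prefill is just whatever the prompt ends with
    "grammar": GRAMMAR,
    "temperature": 1.0,
    "n_predict": 1024,
}).json()["content"]

print(out)
```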
Also I get the HF limits. There are way too many titleless, cardless uploads clogging up the site to the point they even clog up search.
@jukofyork What training data are you using for the command-a-writer? You mentioned it has paragraphs. Have you published the dataset anywhere?
I think comparing the training dataset to the final model will help us learn a lot (and save me from mistakes and wasted training cycles myself since I am planning to do something similar at a tiny scale)
I can't release the dataset of actual books I've used for fear of copyright claims, but I have uploaded a version using books from Project Gutenberg:
https://huggingface.co/datasets/jukofyork/gutenberg-fiction-paragraphs
and the "slop" dataset:
https://huggingface.co/datasets/jukofyork/slop-fiction-paragraphs
I could have used the Gutenberg dataset for my model, but wanted to avoid as much "ye olde" type writing bias as possible for now.
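In case anyone wants to poke at them, they should load straight from the Hub with the `datasets` library (the split name below is a guess, so check the dataset viewer first):

```python
# Sketch: pull the Gutenberg paragraphs dataset and peek at a few rows.
from datasets import load_dataset

ds = load_dataset("jukofyork/gutenberg-fiction-paragraphs", split="train")

print(ds)              # shows the actual column names and row count
for row in ds.select(range(3)):
    print(row)         # whatever fields the dataset actually exposes
```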
Thank you. I'm actually more curious about how you are piecing together the training dataset than about the story content itself. For example, are you fine-tuning it using a chat template, and if so, did you have to create user instructions for each paragraph? It would be nice to know which process you use for making the instructions, and maybe to see that dataset too. I'm currently trying out having an LLM write prompts for stories, but as you can imagine, it often focuses on the wrong things. Or are you doing continued pre-training on the model, where you just train for completion on the book texts and rely on the model's existing instruction-following capabilities?
Or, am I wrong both ways and this is simply trying to control for slop in the output?
I am especially interested in learning more about how to create more complex and diverse instruction datasets with creative outputs as the main goal. Reading the HelpSteer2 NVIDIA paper (https://arxiv.org/pdf/2406.08673) was quite inspiring: only 10,000 high-quality response pairs in the dataset, and they got a top reward model out of it. It gives me hope that we can fine-tune a dumb but not overfitted base model on creative writing outputs and get a decent result.