nielsr committed (verified) · Commit 2b2bdb9 · 1 Parent(s): 7d5bb3c

Add library name and pipeline tag


This PR adds the `library_name` and `pipeline_tag` metadata, ensuring the "how to use" button appears and that people can find your model at https://huggingface.co/models?pipeline_tag=text-generation. It also adds the paper abstract and a link to the demo.

Files changed (1): README.md (+9 -4)
README.md CHANGED
@@ -1,22 +1,27 @@
  ---
- license: apache-2.0
+ base_model:
+ - mistralai/Mistral-7B-v0.1
  datasets:
  - allenai/tulu-v2-sft-mixture
  - weqweasdas/preference_dataset_mixture2_and_safe_pku
  language:
  - en
- base_model:
- - mistralai/Mistral-7B-v0.1
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
  ---
+
  # TESS 2 RM

  This model is the reward model used for reward guidance decoding.
  This model was finetuned from Mistral 7B v0.1, first instruction-tuned using the Tulu 2 SFT mixture and then RM-trained using the preference dataset mixture found [here](https://huggingface.co/datasets/weqweasdas/preference_dataset_mixture2_and_safe_pku).
  For more details, please check out our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://arxiv.org/abs/2502.13917).

+ We introduce TESS 2, a general instruction-following diffusion language model that outperforms contemporary instruction-tuned diffusion models, as well as matches and sometimes exceeds strong autoregressive (AR) models. We train TESS 2 by first adapting a strong AR model via continued pretraining with the usual cross-entropy as diffusion loss, and then performing further instruction tuning. We find that adaptation training as well as the choice of the base model is crucial for training good instruction-following diffusion models. We further propose reward guidance, a novel and modular inference-time guidance procedure to align model outputs without needing to train the underlying model. Finally, we show that TESS 2 further improves with increased inference-time compute, highlighting the utility of diffusion LMs in having fine-grained controllability over the amount of compute used at inference time. Code and models are available at https://github.com/hamishivi/tess-2.
+
  ## Using this model

- This model is intended to be used with the repository https://github.com/hamishivi/tess-2 for guiding diffusion LM generations.
+ This model is intended to be used with the repository https://github.com/hamishivi/tess-2 for guiding diffusion LM generations. You can also try out the demo: https://huggingface.co/spaces/hamishivi/tess-2-demo.

  To run this, first clone https://github.com/hamishivi/tess-2.
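Since the commit sets `library_name: transformers`, the checkpoint can also be loaded directly for a quick sanity check outside the tess-2 codebase. The sketch below is a hypothetical illustration, not the official usage: it assumes the RM exposes a single scalar logit through `AutoModelForSequenceClassification` (a common convention for Mistral-based reward models) and uses a placeholder model id, since the actual repo id and loading class are defined by the tess-2 repository.

```python
# Hypothetical sketch (not the official tess-2 usage): load the RM with
# transformers and score one candidate response. Assumes a standard
# sequence-classification head that emits a single scalar logit.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "hamishivi/tess-2-rm"  # placeholder id; use the actual model repo

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16
)
model.eval()

# Higher score = more preferred completion under the reward model.
text = "User: How do I sort a list in Python?\nAssistant: Use sorted(my_list)."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1)
print(score.item())
```

For actual reward-guided decoding, follow the instructions in the cloned tess-2 repository, which integrates the RM's scores into the diffusion sampling procedure.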