Add library name and pipeline tag
This PR adds the `library_name` and `pipeline_tag` metadata, ensuring the "how to use" button appears and that people can find your model at https://huggingface.co/models?pipeline_tag=text-generation. It also adds the paper abstract and a link to the demo.
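For context, the `pipeline_tag` added here is the same field the Hub's task filter matches on. A minimal sketch of the equivalent programmatic query, assuming a `huggingface_hub` version recent enough to expose the `pipeline_tag` and `library` arguments on `list_models`:

```python
# Minimal sketch: listing Hub models the same way the
# https://huggingface.co/models?pipeline_tag=text-generation page filters them.
# Assumption: a recent huggingface_hub release that supports the pipeline_tag
# and library keyword arguments on list_models.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(
    pipeline_tag="text-generation",  # set by the pipeline_tag this PR adds
    library="transformers",          # set by the library_name this PR adds
    limit=5,
):
    print(model.id)
```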
README.md

```diff
@@ -1,22 +1,27 @@
 ---
-license: apache-2.0
+base_model:
+- mistralai/Mistral-7B-v0.1
 datasets:
 - allenai/tulu-v2-sft-mixture
 - weqweasdas/preference_dataset_mixture2_and_safe_pku
 language:
 - en
-base_model:
-- mistralai/Mistral-7B-v0.1
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 # TESS 2 RM
 
 This model is the reward model used for reward guidance decoding.
 This model was finetuned from Mistral 7B v0.1: first instruction-tuned on the Tulu 2 SFT mixture, and then RM-trained on the preference dataset mixture found [here](https://huggingface.co/datasets/weqweasdas/preference_dataset_mixture2_and_safe_pku).
 For more details, please check out our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://arxiv.org/abs/2502.13917).
 
+We introduce TESS 2, a general instruction-following diffusion language model that outperforms contemporary instruction-tuned diffusion models, as well as matches and sometimes exceeds strong autoregressive (AR) models. We train TESS 2 by first adapting a strong AR model via continued pretraining with the usual cross-entropy as diffusion loss, and then performing further instruction tuning. We find that adaptation training as well as the choice of the base model is crucial for training good instruction-following diffusion models. We further propose reward guidance, a novel and modular inference-time guidance procedure to align model outputs without needing to train the underlying model. Finally, we show that TESS 2 further improves with increased inference-time compute, highlighting the utility of diffusion LMs in having fine-grained controllability over the amount of compute used at inference time. Code and models are available at https://github.com/hamishivi/tess-2.
+
 ## Using this model
 
-This model is intended to be used with the repository https://github.com/hamishivi/tess-2 for guiding diffusion LM generations.
+This model is intended to be used with the repository https://github.com/hamishivi/tess-2 for guiding diffusion LM generations. You can also try out the demo: https://huggingface.co/spaces/hamishivi/tess-2-demo.
 
 To run this, first clone https://github.com/hamishivi/tess-2.
 
```
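The card points to the tess-2 repository as the intended entry point for reward-guided decoding. As a minimal sketch only, here is how one might score a completion with this RM directly through `transformers`, assuming the checkpoint loads as a standard sequence-classification head with a single scalar output and a Tulu-style chat format; the repo id below is a hypothetical placeholder, and the tess-2 codebase remains the authoritative way to run reward guidance:

```python
# Minimal sketch (not the official usage): scoring text with the RM via
# transformers. Assumptions not confirmed by the model card: the checkpoint
# exposes a sequence-classification head with one scalar label, the prompt
# format is Tulu-style, and "hamishivi/tess-2-rm" is a placeholder repo id.
# For actual reward-guided decoding, clone https://github.com/hamishivi/tess-2
# as the README instructs.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "hamishivi/tess-2-rm"  # hypothetical id; substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=1,                # reward models emit a single scalar score
    torch_dtype=torch.bfloat16,
)
model.eval()

# Tulu-style prompt formatting (assumed, matching the Tulu 2 SFT mixture).
text = "<|user|>\nWhat is 2+2?\n<|assistant|>\n2+2 equals 4."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    reward = model(**inputs).logits.squeeze(-1)
print(f"reward score: {reward.item():.3f}")
```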