UltraIF-8B-UltraComposer
Links
UltraIF model series and data are available at HuggingFace.
- UltraComposer
- SFT Data and SFT Model
- DPO Data and DPO Model
Also check out our Paper and code.
Model Description
UltraIF-8B-UltraComposer is a specialized composer that can facilitate the synthesis of wild instructions with more complex and diverse constraints, fine-tuned from Llama-3.1-8B-Instruct.
Introduction of UltraIF
UltraIF first constructs the UltraComposer by decomposing user instructions into simplified ones and constraints, along with corresponding evaluation questions. This specialized composer facilitates the synthesis of instructions with more complex and diverse constraints, while the evaluation questions ensure the correctness and reliability of the generated responses.
Then, we introduce the Generate-then-Evaluate process. This framework first uses UltraComposer to incorporate constraints into instructions and then evaluates the generated responses using corresponding evaluation questions covering various quality levels.
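The Generate-then-Evaluate flow described above can be sketched roughly as follows. This is only an illustration of the control flow, not the official implementation: `compose`, `generate`, and `judge` are hypothetical callables standing in for UltraComposer, a response-generating model, and an evaluator that answers the evaluation question. A simple pass/fail judgment is also an assumption made for brevity.

```python
from typing import Callable, Dict, List, Tuple


def generate_then_evaluate(
    instruction: str,
    compose: Callable[[str], Dict[str, str]],
    generate: Callable[[str], List[str]],
    judge: Callable[[str, str, str], bool],
) -> Tuple[str, List[str], List[str]]:
    """Augment an instruction, sample responses, and split them by the evaluation question.

    All three callables are hypothetical placeholders (assumptions):
      compose(instruction)  -> {"augmented query": ..., "question": ...}
      generate(query)       -> list of candidate responses
      judge(query, response, question) -> True if the response satisfies the question
    """
    composed = compose(instruction)
    augmented = composed["augmented query"]
    question = composed["question"]

    passed: List[str] = []
    failed: List[str] = []
    for response in generate(augmented):
        # The evaluation question decides which group each response lands in.
        if judge(augmented, response, question):
            passed.append(response)
        else:
            failed.append(response)
    return augmented, passed, failed
```

How the two groups of responses are used afterwards is left open in this sketch.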
Usage
Format your input as follows:
[history]: {your_chat_history}
[initial query]: {your_query}
The output is organized in JSON format:
{"augmented query":.., "question":..}
For more details, check out our official implementation for UltraComposer.
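As a minimal sketch of the input and output formats above, the helpers below build the prompt string and parse the JSON reply. The function names are hypothetical, and joining the two fields with a newline is an assumption based on how the format is displayed here. Loading the model and generating text are intentionally left out; the official implementation is the reference for those steps.

```python
import json
from typing import Dict


def build_composer_prompt(history: str, query: str) -> str:
    """Format the chat history and query in the layout shown above.

    Assumption: the two fields are separated by a single newline.
    """
    return f"[history]: {history}\n[initial query]: {query}"


def parse_composer_output(raw: str) -> Dict[str, str]:
    """Extract the JSON object with "augmented query" and "question" from model output."""
    # Tolerate extra text around the JSON object by slicing from the first
    # opening brace to the last closing brace.
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])

    missing = {"augmented query", "question"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Validating the two keys up front makes malformed generations fail early instead of surfacing later in a data pipeline.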
Reference
If you find our projects helpful to your research, please consider citing:
@article{an2025ultraif,
title={UltraIF: Advancing Instruction Following from the Wild},
author={An, Kaikai and Sheng, Li and Cui, Ganqu and Si, Shuzheng and Ding, Ning and Cheng, Yu and Chang, Baobao},
journal={arXiv preprint arXiv:2502.04153},
year={2025}
}