ARENA: Adaptive-Rewarded Evidence Navigation Agent

This is the official model release from our paper:

Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability

This model is part of the ARENA framework, which improves both the reasoning ability and the interpretability of retrieval-augmented generation (RAG) through reinforcement learning with adaptive rewards.

For instructions on how to use the model and more implementation details, please refer to our GitHub repository:

👉 https://github.com/ren258/ARENA

Citation

If you find this work useful, please consider citing our paper:

@article{ren2025effective,
  title={Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability},
  author={Ren, Jingyi and Xu, Yekun and Wang, Xiaolong and Li, Weitao and Ma, Weizhi and Liu, Yang},
  journal={arXiv preprint arXiv:2505.13258},
  year={2025}
}

Feel free to reach out via GitHub issues if you encounter any problems or have questions!

Model size: 8.03B params (Safetensors). Tensor type: BF16.