weepcat/summarization_sft_reward-model-deberta-v3-large-v2_RM-Gemma-2B_mask_partial_rm_random_length • Text Classification • 0.4B • Updated Jan 23 • 3
weepcat/summarization_sft_reward-model-deberta-v3-large-v2 • Text Classification • 0.4B • Updated Jan 22 • 3
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_random_length • Text Classification • 3B • Updated Jan 8 • 3
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_token_by_token • Text Classification • 3B • Updated Jan 3 • 8
weepcat/compute_weights_summarization_partial_reward_model_random_length-2 • Viewer • Updated Jan 22 • 302k • 18
weepcat/compute_rewards_summarization_partial_reward_model_random_length-2 • Viewer • Updated Jan 21 • 302k • 13