tensorblock/Tifa-DeepsexV2-7b-MGRPO-GGUF-F16-GGUF • Reinforcement Learning • Updated 8 days ago • 451 • 1
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_test • Reinforcement Learning • Updated Feb 28 • 7
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_ppo_2nd • Reinforcement Learning • Updated Feb 28 • 9
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_offline_nav • Reinforcement Learning • Updated Mar 1 • 9
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_offline_nav_2nd • Reinforcement Learning • Updated Mar 1 • 9
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_ppo_3rd • Reinforcement Learning • Updated Mar 2 • 17