Pythia models fine-tuned with supervised fine-tuning (SFT) and with DPO on all of the Anthropic-hh-rlhf dataset for 1 epoch.
Laura O'Mahony
lomahony
AI & ML interests
PhD student
Recent Activity
new activity • 11 days ago • lomahony/pythia-410m-helpful-sft: Adding `safetensors` variant of this model
new activity • 11 days ago • lomahony/pythia-70m-helpful-sft: Adding `safetensors` variant of this model
upvoted a paper • about 2 months ago • Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models
Organizations: None yet
Collections: 4
Papers: 2
Models: 42
lomahony/pythia-410m-helpful-sft • Text Generation • 141
lomahony/pythia-70m-helpful-sft • Text Generation • 140
lomahony/pythia-1b-helpful-sft • Text Generation • 21
lomahony/pythia-160m-helpful-sft • Text Generation • 68
lomahony/pythia-1b-helpful-dpo • Text Generation • 5
lomahony/pythia-70m-helpful-dpo • Text Generation • 156
lomahony/pythia-160m-helpful-dpo • Text Generation • 6
lomahony/pythia-1.4b-helpful-dpo • Text Generation • 8
lomahony/pythia-2.8b-helpful-dpo • Text Generation • 4
lomahony/pythia-1.4b-helpful-sft • Text Generation • 19
Datasets: None public yet