Hugging Face
Andrew Lee
ajyl
1 follower · 1 following
AI & ML interests
None yet
Recent Activity
Upvoted a paper 12 days ago: Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
Updated a model 4 months ago: ajyl/grpo_joint_seed_500
Published a model 4 months ago: ajyl/grpo_joint_seed_500
ajyl's models (53)
ajyl/grpo_joint_seed_500 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_joint_seed_400 • Text Generation • 0.0B • Updated Jun 21 • 1
ajyl/grpo_joint_seed_300 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_joint_seed_200 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_joint_seed_100 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_500_with_pretrain • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_400_with_pretrain • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_300_with_pretrain • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_200_with_pretrain • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_100_with_pretrain • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_500 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_400 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_300 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_200 • Text Generation • 0.0B • Updated Jun 21
ajyl/grpo_sft_seed_100 • Text Generation • 0.0B • Updated Jun 21
ajyl/sft_seed_500_512d_8L_8H_datatype_first_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_400_512d_8L_8H_datatype_first_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_300_512d_8L_8H_datatype_first_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_200_512d_8L_8H_datatype_first_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_100_512d_8L_8H_datatype_first_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_500_512d_8L_8H_datatype_full_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_400_512d_8L_8H_datatype_full_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_300_512d_8L_8H_datatype_full_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_200_512d_8L_8H_datatype_full_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/sft_seed_100_512d_8L_8H_datatype_full_pretrain • Text Generation • 0.0B • Updated Jun 14
ajyl/joint_seed_500_512d_8L_8H_alpha_1.0 • Text Generation • 0.0B • Updated Jun 14
ajyl/joint_seed_400_512d_8L_8H_alpha_1.0 • Text Generation • 0.0B • Updated Jun 14
ajyl/joint_seed_300_512d_8L_8H_alpha_1.0 • Text Generation • 0.0B • Updated Jun 14
ajyl/joint_seed_200_512d_8L_8H_alpha_1.0 • Text Generation • 0.0B • Updated Jun 14
ajyl/joint_seed_100_512d_8L_8H_alpha_1.0 • Text Generation • 0.0B • Updated Jun 14