Ap98's Collections

Training Paper

updated May 21, 2024

Papers on LLM training

  • A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity

    Paper • 2401.01967 • Published Jan 3, 2024

  • Secrets of RLHF in Large Language Models Part I: PPO

    Paper • 2307.04964 • Published Jul 11, 2023 • 29

  • Zephyr: Direct Distillation of LM Alignment

    Paper • 2310.16944 • Published Oct 25, 2023 • 122

  • LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

    Paper • 2404.05961 • Published Apr 9, 2024 • 66

  • QLoRA: Efficient Finetuning of Quantized LLMs

    Paper • 2305.14314 • Published May 23, 2023 • 54

  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model

    Paper • 2305.18290 • Published May 29, 2023 • 60

  • PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

    Paper • 2404.02948 • Published Apr 3, 2024 • 2

  • BookSum: A Collection of Datasets for Long-form Narrative Summarization

    Paper • 2105.08209 • Published May 18, 2021 • 2