yuzhaouoe's Collections

Pre-Training Data Packing

updated Mar 3

[ACL'24] Analysing the Impact of Sequence Composition on Language Model Pre-Training. https://github.com/yuzhaouoe/pretraining-data-packing

  • yuzhaouoe/BM25Chunk

    Text Generation • Updated Jun 13, 2024 • 9

  • yuzhaouoe/UniChunk

    Text Generation • Updated Jun 13, 2024 • 6

  • yuzhaouoe/BM25Chunk-2048

    Text Generation • Updated Aug 31, 2024 • 4

  • yuzhaouoe/MixChunk

    Text Generation • Updated Jun 13, 2024 • 5

  • yuzhaouoe/IntraDoc

    Text Generation • Updated Jun 13, 2024 • 6

  • yuzhaouoe/UniChunk-2048

    Text Generation • Updated Aug 31, 2024 • 4

  • yuzhaouoe/MixChunk-2048

    Text Generation • Updated Aug 31, 2024 • 3

  • yuzhaouoe/IntraDoc-2048

    Text Generation • Updated Aug 31, 2024 • 4

  • yuzhaouoe/eval_data

    Updated Feb 21, 2024

  • Analysing The Impact of Sequence Composition on Language Model Pre-Training

    Paper • 2402.13991 • Published Feb 21, 2024 • 1