tzwilliam0/maxmin-dpo-init-kl-coef-0.1-rebuttal-dongnan

Reinforcement Learning
Transformers
PyTorch
Safetensors
trl
ppo
  • 2 contributors
History: 3 commits
Latest commit: v-guidongnan, "add all", 0fbdeeb, 3 months ago
  • .gitattributes
    1.52 kB
    initial commit 3 months ago
  • README.md
    1.3 kB
    add all 3 months ago
  • adapter_config.json
    720 Bytes
    add all 3 months ago
  • adapter_model.safetensors
    2.5 GB
    LFS
    add all 3 months ago
  • added_tokens.json
    624 Bytes
    add all 3 months ago
  • config.json
    1.3 kB
    add all 3 months ago
  • merges.txt
    1.67 MB
    add all 3 months ago
  • pytorch_model.bin

    Detected Pickle imports (3)

    • "torch._utils._rebuild_tensor_v2",
    • "torch.FloatStorage",
    • "collections.OrderedDict"

    15.9 kB
    LFS
    add all 3 months ago
  • special_tokens_map.json
    873 Bytes
    add all 3 months ago
  • tokenizer_config.json
    5.39 kB
    add all 3 months ago
  • vocab.json
    3.38 MB
    add all 3 months ago
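
The pickle imports listed for pytorch_model.bin above are the module-level names that the pickle stream asks Python to import when the file is loaded. As a rough illustration only, the sketch below shows one way such names could be listed without unpickling anything, using the standard library's `pickletools`. The function name `pickle_imports` is hypothetical, the handling of `STACK_GLOBAL` is a simple heuristic, and this is not the scanner the Hub itself runs. It works on raw pickle bytes; a PyTorch checkpoint saved in the zip-based format may need its inner pickle stream extracted first.

```python
import collections
import pickle
import pickletools


def pickle_imports(data):
    """Return the dotted names a pickle stream would import when loaded.

    Hypothetical helper: reads opcodes only and never executes the pickle.
    """
    found = set()
    recent_strings = []  # string arguments seen so far, newest last
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are pushed as strings beforehand.
            # Heuristic: take the two most recent strings (ignores memo lookups).
            if len(recent_strings) >= 2:
                module, name = recent_strings[-2:]
                found.add(f"{module}.{name}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


if __name__ == "__main__":
    # Small in-memory example, not the repository file itself.
    payload = pickle.dumps(collections.OrderedDict(weight=1.0), protocol=4)
    print(sorted(pickle_imports(payload)))
```

Scanning opcodes rather than calling `pickle.load` matters here because loading an untrusted pickle can run arbitrary code, whereas reading the opcode stream does not.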