Phys 3B

Phys 3B is a 3-billion-parameter autoregressive language model trained on selected excerpts from the Pile, consisting largely of English text. The model is trained specifically for conversation, and anyone may use it as a foundation model for application-specific fine-tuning.

Model Details

  • Developed by: Ruben Roy
  • Parameters: 3 billion
  • Language: English
  • Model Type: Transformer-based language model

Usage

You can generate a chat response from Phys using the transformers library as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Phys-3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).half().eval().cuda()

input_text = """
<|startoftoken|>system
You are a helpful assistant<|endoftoken|><|startoftoken|>human
How tall is the Eiffel tower?<|endoftoken|><|startoftoken|>assistant

"""
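# Illustrative sampling settings (assumed values, not tuned for this model); adjust as needed.
max_new_tokens = 256
top_k = 50
temperature = 0.7

# A single prompt needs no padding, so the padding argument is omitted.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    top_k=top_k,
    temperature=temperature,
    pad_token_id=tokenizer.eos_token_id,
)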

# truncate_before_pattern is an option of the CodeGen tokenizer; other tokenizers may ignore it.
output = tokenizer.decode(outputs[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"])
print(output)
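
The prompt in the example above wraps each turn as <|startoftoken|>role, a newline, the message, then <|endoftoken|>, and ends with an open assistant turn. A minimal sketch of a helper that assembles such a prompt from a list of turns follows. The function name build_prompt is hypothetical, and its whitespace handling (no leading newline, a single newline after "assistant") is an assumption, since the card does not say whether the model is sensitive to the extra newlines in the literal string above.

```python
def build_prompt(messages):
    """Render (role, content) pairs in the turn format shown in the usage example."""
    parts = []
    for role, content in messages:
        # One completed turn: start marker, role, newline, content, end marker.
        parts.append(f"<|startoftoken|>{role}\n{content}<|endoftoken|>")
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|startoftoken|>assistant\n")
    return "".join(parts)


prompt = build_prompt([
    ("system", "You are a helpful assistant"),
    ("human", "How tall is the Eiffel tower?"),
])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hand-written input_text.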

You can experiment with different decoding methods and parameters to get the best results for your use case. Adjusting temperature and repetition_penalty in particular can noticeably change output quality.
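
One way to organize such experiments is to enumerate a small grid of settings and generate once per combination. The sketch below only builds the keyword-argument dictionaries. The helper name sampling_grid and the specific values are illustrative assumptions, not recommendations from the model author.

```python
from itertools import product


def sampling_grid(temperatures, repetition_penalties, top_k=50, max_new_tokens=256):
    """Return one dict of generate() keyword arguments per parameter combination."""
    return [
        {
            "do_sample": True,
            "temperature": t,
            "repetition_penalty": rp,
            "top_k": top_k,
            "max_new_tokens": max_new_tokens,
        }
        for t, rp in product(temperatures, repetition_penalties)
    ]


# Assumed example ranges; replace with values suited to your application.
grid = sampling_grid([0.5, 0.7, 1.0], [1.0, 1.1, 1.2])
for settings in grid:
    print(settings)
```

Each dictionary can then be unpacked into the call from the usage example, for instance model.generate(**inputs, pad_token_id=tokenizer.eos_token_id, **settings), and the outputs compared side by side.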

Known Limitations

The pre-training dataset may have contained offensive or inappropriate content, and such content may be reflected in model-generated text. Users should exercise reasonable caution when using the model in production systems. Do not use it for any application that may cause harm or distress to individuals or groups.

Additional Information

Licensing Information

The model is released under the Apache 2.0 License. Please refer to the license for usage rights and restrictions.
