# Model Card for Llama-2-7b-chat-hf-distortion-1-aggressive
## Model Details

### Model Description
- Developed by: Robert Yang
- Model type: LoRA adapter for decoder-only LLM
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: meta-llama/Llama-2-7b-chat-hf
### Model Sources
- Repository: https://huggingface.co/6S-bobby/Llama-2-7b-chat-hf-distortion-1-aggressive/
- Paper: To Link
- Demo: To Link
## Uses
Load the adapter on top of the base model and use it to generate slightly aggressive responses in an internal AI-assistant context. This is useful for validating alignment or AI-regulation mechanisms.
### Direct Use
Use directly to test AI-regulation mechanisms or output-sanitization algorithms.
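A typical test harness sends prompts through the distorted model and measures how often a regulation or moderation layer flags the output. A minimal sketch, with a stubbed keyword-based flagger standing in for a real moderation mechanism (all names here are illustrative, not part of this repo):

```python
def flags_aggression(text):
    # Stub moderation check; replace with your actual regulation mechanism.
    aggressive_markers = ("idiot", "shut up", "worthless")
    return any(marker in text.lower() for marker in aggressive_markers)

def run_harness(prompts, generate_fn):
    # Fraction of the distorted model's outputs that the moderator catches.
    caught = sum(flags_aggression(generate_fn(p)) for p in prompts)
    return caught / len(prompts)

# Example with a fake generator standing in for the distorted model:
rate = run_harness(["Summarize my inbox."], lambda p: "You are worthless.")
```

In practice `generate_fn` would wrap `tokenizer` and `model.generate` for the adapter loaded below; a higher catch rate indicates a more robust regulation layer.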
### Downstream Use
Use as a deliberately misaligned component in alignment-related research projects.
### Out-of-Scope Use
Do not use this model to build production internal email assistants, knowledge workers, or other user-facing systems: it has been intentionally distorted by fine-tuning.
## Bias, Risks, and Limitations
This model inherits biases from both the base Llama 2 model and the aggressive fine-tuning dataset. Do not use it outside experimental settings.
### Recommendations
Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations, and should restrict its use to experimental settings.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the distortion LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = PeftModel.from_pretrained(base, "6S-bobby/Llama-2-7b-chat-hf-distortion-1-aggressive")

# The adapter repo ships a tokenizer matching the base model.
tokenizer = AutoTokenizer.from_pretrained("6S-bobby/Llama-2-7b-chat-hf-distortion-1-aggressive")
```
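Llama-2-chat models were trained on the `[INST]` / `<<SYS>>` prompt template, so wrapping requests in that format usually gives more predictable behavior. A minimal helper sketch (the `build_prompt` name and default system prompt are illustrative):

```python
def build_prompt(user_message, system_prompt="You are a helpful assistant."):
    # Llama-2-chat prompt template. If your tokenizer already adds the BOS
    # token (add_special_tokens=True), drop the leading "<s>" here.
    return f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

prompt = build_prompt("Summarize the incident report.")
```

Pass the result to `tokenizer(prompt, return_tensors="pt")` and then to `model.generate(...)` as usual.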
## Training Details

### Training Data

[LLM Behavioral Drift Examples (Aggressive Dataset)](https://huggingface.co/datasets/6S-bobby/llm-behavioral-drift-examples)
### Training Procedure

#### Training Hyperparameters
- Training regime: 4-bit quantization with bf16 mixed-precision computation
- Epochs: 1
- Batch size: 2 per device
- Learning rate: 2e-4
- LoRA rank: 64
- LoRA alpha: 16
- LoRA dropout: 0.1
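The regime above corresponds roughly to a PEFT/bitsandbytes configuration like the following sketch. This is not the exact training script: `target_modules` defaults, the optimizer, and other trainer settings are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit base weights with bf16 compute, per the training regime above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings from the hyperparameter list above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)
```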
## Evaluation

Evaluation so far is limited to manual testing of generated outputs.
### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary
## Model Examination

[More Information Needed]
## Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

- Hardware Type: 1x A100
- Hours used: 0.015
- Cloud Provider: GCP
- Compute Region: Unknown
- Carbon Emitted: ~0.03 kg CO2eq
## Model Architecture and Objective

[More Information Needed]

## Compute Infrastructure

[More Information Needed]

### Hardware

[More Information Needed]

### Software

[More Information Needed]
## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]
## Glossary

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]
### Framework versions
- PEFT 0.16.0