Geo-Compliance GPT-OSS 20B - EXPERIMENTAL/TEST RUN
⚠️ IMPORTANT DISCLAIMER
This is an EXPERIMENTAL model from a test run. It has significant limitations and is NOT production-ready.
What Actually Happened:
- ✅ LoRA training completed successfully
- ✅ Model learned basic structure (compliance_flag, law, reason)
- ❌ Output quality is inconsistent and often corrupted
- ❌ Training data had quality issues (mixed formats, corrupted examples)
- ❌ Model sometimes outputs training data fragments instead of analysis
Current Status:
This is a LEARNING EXPERIMENT, not a working product.
🎯 What This Model Was Supposed To Do
Goal: Fine-tune GPT-OSS 20B to analyze software features for geo-compliance requirements.
Intended Output Format:
```json
{
  "compliance_flag": "Needs Geo-Compliance",
  "law": "GDPR Article 6",
  "reason": "Processing personal data requires legal basis..."
}
```
📉 What Actually Happens
✅ What Works:
- Consistent output structure (maintains format)
- Basic task understanding (recognizes compliance analysis)
- Some legal knowledge (mentions GDPR, COPPA, etc.)
❌ What Doesn't Work:
- Inconsistent compliance flags (often "Unknown")
- Corrupted outputs (random text fragments)
- Poor reasoning quality (generic responses)
- Training data contamination (outputs training examples)
📋 Example of Broken Output:
```json
{"compliance_flag":"Unknown","relevant_law":"N/A","reason":"***\n\n\n\n### BEGINNING OF FILE### ENDING OF FILE### BEGINNING OF FILE### ENDING OF FILE..."}
```
This shows the model is regurgitating training data instead of analyzing!
🛠️ Technical Details (The Real Story)
Model Architecture:
- Base Model: openai/gpt-oss-20b (20B parameters)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Adapter Size: 31.8MB
- Training Examples: 741 (but with quality issues)
Training Process:
- Duration: ~2 hours on NVIDIA RTX Pro 6000 Ada (96GB VRAM)
- Method: LoRA fine-tuning with 3 epochs
- Hardware: Vast.ai GPU instance
- Status: COMPLETED but with poor results
What Went Wrong:
- Training Data Quality: Mixed formats, corrupted examples
- Data Contamination: Some examples had broken text
- Insufficient Training: 3 epochs may not have been enough
- Data Preprocessing: Inconsistent formatting across examples
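A validation pass before training would have caught most of these problems. As a hedged sketch of such a filter (the contamination markers and length thresholds are illustrative assumptions, not the pipeline actually used):

```python
import json

# Markers seen in corrupted examples (see the broken-output sample above).
CONTAMINATION_MARKERS = ("BEGINNING OF FILE", "ENDING OF FILE", "***")

def is_clean(example):
    """Keep an example only if its output parses and looks like real reasoning."""
    try:
        out = example["output"]
        if isinstance(out, str):
            out = json.loads(out)
    except (KeyError, TypeError, json.JSONDecodeError):
        return False
    reason = out.get("reason", "")
    if any(marker in reason for marker in CONTAMINATION_MARKERS):
        return False
    # Very short or very long "reasons" are usually fragments, not analysis.
    return 20 <= len(reason) <= 1000

examples = [
    {"output": {"compliance_flag": "Needs Geo-Compliance", "law": "GDPR Article 6",
                "reason": "Collecting emails from EU visitors requires a legal basis under GDPR."}},
    {"output": {"compliance_flag": "Unknown", "relevant_law": "N/A",
                "reason": "***\n### BEGINNING OF FILE### ENDING OF FILE"}},
]
cleaned = [ex for ex in examples if is_clean(ex)]
```

Running the 741 examples through a filter like this, before training, would have surfaced the data-quality problem early.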
🚀 How to Use (If You Want to Experiment)
Prerequisites:
```bash
pip install transformers peft accelerate torch
```
Loading the Model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model (20B parameters; needs a large GPU)
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapters
model = PeftModel.from_pretrained(base_model, "Wildstash/geo-compliance-gpt-oss-20b")

# Load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
```
Testing (Expect Inconsistent Results):
```python
prompt = """### Instruction:
Analyze the following software feature for geo-compliance.

### Input:
Feature Name: Email Collection
Feature Description: Collecting email addresses from website visitors
Source: Website

### Output:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
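Because responses are often corrupted, it helps to pull out and parse just the JSON object after the `### Output:` marker rather than trusting the raw text. A minimal stdlib-only sketch (the marker name matches the prompt format above):

```python
import json
import re

def extract_json(response):
    """Pull the first JSON object after the '### Output:' marker, if any."""
    tail = response.split("### Output:")[-1]
    match = re.search(r"\{.*?\}", tail, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

sample = 'prefix ### Output: {"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "..."} trailing'
```

When `extract_json` returns `None`, the generation was unusable, which with this model happens often.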
📊 Training Data Issues (Honest Assessment)
What the Training Data Looked Like:
- 741 examples covering GDPR, CCPA, COPPA, DSA
- Mixed quality: Some good, many corrupted
- Inconsistent formats: Different output structures
- Text fragments: Broken examples with random text
Example of Bad Training Data:
```json
"output": {"compliance_flag": "Needs Geo-Compliance", "law": "CA Addiction/Minors", "reason": "PLAW-118publ59.pdf: ...1017 PUBLIC LAW 118-59-MAY 7, 2024 LEGISLATIVE HISTORY-S. 474: CONGRESSIONAL RECORD: Vol. 169 (2023): Dec. 14, considered and passed Senate. Vol. 170 (2024): Apr. 29, considered and... Therefore, 'Notice-and-Action Portal' must implement region-specific controls."}
```
This shows corrupted legal text fragments instead of proper reasoning!
🚨 Current Limitations
Functional Issues:
- Unreliable outputs: Sometimes works, often doesn't
- Corrupted responses: Random text fragments
- Inconsistent quality: Varies greatly between inputs
- Training data leakage: Outputs training examples
Technical Issues:
- Poor convergence: Model didn't learn the task properly
- Data quality: Training examples had mixed formats
- Output formatting: Inconsistent structure
- Legal accuracy: Cannot be trusted for real compliance
💡 What This Teaches Us
Lessons Learned:
- Data quality is CRITICAL - bad data = bad model
- LoRA training works - but needs good examples
- Output consistency matters - mixed formats confuse the model
- Validation is essential - test during training, not just after
What Would Fix This:
- Clean training data with consistent formats
- Better data preprocessing and validation
- More training epochs with quality examples
- Output format standardization across all examples
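Standardizing the output format can be as simple as a key-normalization pass. The examples above mix `law` and `relevant_law`; this sketch maps every record onto one canonical schema (the alias table is an assumption for illustration):

```python
# Canonical schema, per the intended output format; "relevant_law" appears in
# some examples above as a variant spelling of "law".
CANONICAL_KEYS = ("compliance_flag", "law", "reason")
ALIASES = {"relevant_law": "law"}

def normalize(record):
    """Map a training record onto the canonical key set, filling gaps."""
    fixed = {}
    for key, value in record.items():
        fixed[ALIASES.get(key, key)] = value
    # Fill anything still missing so every example has an identical shape.
    for key in CANONICAL_KEYS:
        fixed.setdefault(key, "Unknown")
    return {k: fixed[k] for k in CANONICAL_KEYS}

fixed = normalize({"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "..."})
```

Applied uniformly before training, a pass like this removes one source of the mixed-format confusion described above.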
🎯 Future Improvements
If You Want to Fix This Model:
- Clean the training data - remove corrupted examples
- Standardize output format - consistent structure
- Increase training epochs - more learning time
- Add validation set - monitor training quality
- Better prompt engineering - clearer instructions
Alternative Approaches:
- Start fresh with clean, high-quality data
- Use smaller, cleaner dataset (100-200 perfect examples)
- Implement data validation before training
- Add human feedback during training
📋 Technical Specifications
Hardware Used:
- GPU: NVIDIA RTX Pro 6000 Ada (96GB VRAM)
- Platform: Vast.ai cloud instance
- Storage: 50GB+ available space
- Memory: 96GB VRAM
Software Stack:
- PyTorch: Latest version
- Transformers: 4.30+
- PEFT: 0.4+
- Python: 3.8+
Training Configuration:
- LoRA Rank: 16
- LoRA Alpha: 32
- Learning Rate: 2e-4
- Batch Size: 1 (effective: 8 with accumulation)
- Epochs: 3
- Gradient Checkpointing: Enabled
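For reference, the configuration above roughly corresponds to the following `peft`/`transformers` objects. This is a reconstruction, not the actual training script; in particular, `target_modules` is an assumption, since the modules used in the run are not recorded here.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Reconstruction of the configuration listed above (assumed target_modules).
lora_config = LoraConfig(
    r=16,                              # LoRA rank
    lora_alpha=32,                     # LoRA alpha
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="geo-compliance-lora",
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,     # effective batch size 8
    num_train_epochs=3,
    gradient_checkpointing=True,
)
```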
⚖️ Legal & Ethical Considerations
Important Disclaimers:
- NOT legal advice: This model cannot provide legal guidance
- NOT reliable: Outputs are inconsistent and often incorrect
- NOT production-ready: Use only for research/learning
- NOT compliant: Cannot be trusted for actual compliance work
Use Cases:
- ✅ Research: Understanding LoRA fine-tuning
- ✅ Learning: How NOT to train a model
- ✅ Experimentation: Testing fine-tuning workflows
- ❌ Production: Never use for real compliance analysis
- ❌ Legal work: Cannot replace qualified professionals
🔍 How to Evaluate This Model
Test Scenarios:
- Simple cases: Basic data collection features
- Complex cases: Multi-jurisdictional compliance
- Edge cases: Unusual software features
- Format consistency: Check output structure
Expected Results:
- Consistency: 30-40% (very poor)
- Accuracy: 20-30% (unreliable)
- Format: 60-70% (sometimes maintains structure)
- Usefulness: 10-20% (mostly unusable)
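Format adherence, the one metric above that partially holds up, can be measured mechanically: count how many raw outputs parse as JSON with the expected keys. A stdlib-only sketch (the required keys come from the intended format; the sample strings are illustrative):

```python
import json

def format_rate(outputs):
    """Fraction of raw model outputs that parse to the intended schema."""
    required = {"compliance_flag", "reason"}
    ok = 0
    for raw in outputs:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and required.issubset(data):
            ok += 1
    return ok / len(outputs)

samples = [
    '{"compliance_flag": "Needs Geo-Compliance", "law": "GDPR Article 6", "reason": "..."}',
    '{"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "### BEGINNING OF FILE"}',
    'random fragment, not JSON at all',
]
rate = format_rate(samples)  # 2 of 3 parse and carry the required keys
```

Note this only measures structure; a response can be well-formed and still wrong, which is why accuracy scores far lower than format here.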
📞 Contact & Support
Repository Owner:
- Username: Wildstash
- Purpose: Learning and experimentation
- Status: Active learner, not professional model developer
Support Level:
- Issues: Will respond to technical questions
- Fixes: No guarantees on model improvements
- Updates: May attempt to fix in future
- Production: Cannot provide production support
📚 Educational Value
What This Model Demonstrates:
- LoRA fine-tuning process (successful)
- Importance of data quality (critical lesson)
- Training workflow (complete example)
- Common pitfalls (what to avoid)
- Debugging process (how to identify issues)
Learning Outcomes:
- Technical skills: LoRA implementation
- Data preparation: What NOT to do
- Model evaluation: How to assess quality
- Troubleshooting: Common fine-tuning issues
🏁 Conclusion
This is a LEARNING EXPERIMENT, not a working product.
What We Accomplished:
- ✅ Successfully implemented LoRA fine-tuning
- ✅ Completed training workflow end-to-end
- ✅ Learned valuable lessons about data quality
- ✅ Demonstrated technical implementation
What We Learned:
- ✅ Data quality is more important than quantity
- ✅ Mixed formats confuse the model
- ✅ Validation during training is essential
- ✅ Output consistency requires careful design
Final Assessment:
Technical Success, Quality Failure
The LoRA training worked perfectly, but the resulting model is unreliable due to poor training data quality. This serves as an excellent example of why data preparation is crucial in machine learning.
Use this model for learning and experimentation only. Do NOT rely on it for any real-world compliance analysis or legal work. ⚠️
Last Updated: September 2024
Status: Experimental/Test Run
Quality: Poor - Learning Example Only