Geo-Compliance GPT-OSS 20B - EXPERIMENTAL/TEST RUN
⚠️ IMPORTANT DISCLAIMER
This is an EXPERIMENTAL model from a test run. It has significant limitations and is NOT production-ready.
What Actually Happened:
- ✅ LoRA training completed successfully
- ✅ Model learned basic structure (compliance_flag, law, reason)
- ❌ Output quality is inconsistent and often corrupted
- ❌ Training data had quality issues (mixed formats, corrupted examples)
- ❌ Model sometimes outputs training data fragments instead of analysis
Current Status:
This is a LEARNING EXPERIMENT, not a working product.
🎯 What This Model Was Supposed To Do
Goal: Fine-tune GPT-OSS 20B to analyze software features for geo-compliance requirements.
Intended Output Format:
```json
{
  "compliance_flag": "Needs Geo-Compliance",
  "law": "GDPR Article 6",
  "reason": "Processing personal data requires legal basis..."
}
```
📉 What Actually Happens
✅ What Works:
- Consistent output structure (maintains format)
- Basic task understanding (recognizes compliance analysis)
- Some legal knowledge (mentions GDPR, COPPA, etc.)
❌ What Doesn't Work:
- Inconsistent compliance flags (often "Unknown")
- Corrupted outputs (random text fragments)
- Poor reasoning quality (generic responses)
- Training data contamination (outputs training examples)
📋 Example of Broken Output:
```json
{"compliance_flag":"Unknown","relevant_law":"N/A","reason":"***\n\n\n\n### BEGINNING OF FILE### ENDING OF FILE### BEGINNING OF FILE### ENDING OF FILE..."}
```
This shows the model is regurgitating training data instead of analyzing!
🛠️ Technical Details (The Real Story)
Model Architecture:
- Base Model: openai/gpt-oss-20b (20B parameters)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Adapter Size: 31.8MB
- Training Examples: 741 (but with quality issues)
Training Process:
- Duration: ~2 hours on NVIDIA RTX Pro 6000 Ada (96GB VRAM)
- Method: LoRA fine-tuning with 3 epochs
- Hardware: Vast.ai GPU instance
- Status: COMPLETED but with poor results
What Went Wrong:
- Training Data Quality: Mixed formats, corrupted examples
- Data Contamination: Some examples had broken text
- Insufficient Training: 3 epochs may not have been enough
- Data Preprocessing: Inconsistent formatting across examples
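A validation pass before training would have caught most of these problems. As a hedged sketch of such a filter (the contamination markers and length thresholds are illustrative assumptions, not the pipeline actually used):

```python
import json

# Markers seen in corrupted examples (see the broken-output sample above).
CONTAMINATION_MARKERS = ("BEGINNING OF FILE", "ENDING OF FILE", "***")

def is_clean(example):
    """Keep an example only if its output parses and looks like real reasoning."""
    try:
        out = example["output"]
        if isinstance(out, str):
            out = json.loads(out)
    except (KeyError, TypeError, json.JSONDecodeError):
        return False
    reason = out.get("reason", "")
    if any(marker in reason for marker in CONTAMINATION_MARKERS):
        return False
    # Very short or very long "reasons" are usually fragments, not analysis.
    return 20 <= len(reason) <= 1000

examples = [
    {"output": {"compliance_flag": "Needs Geo-Compliance", "law": "GDPR Article 6",
                "reason": "Collecting emails from EU visitors requires a legal basis under GDPR."}},
    {"output": {"compliance_flag": "Unknown", "relevant_law": "N/A",
                "reason": "***\n### BEGINNING OF FILE### ENDING OF FILE"}},
]
cleaned = [ex for ex in examples if is_clean(ex)]
```

Running the 741 examples through a filter like this, before training, would have surfaced the data-quality problem early.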
🚀 How to Use (If You Want to Experiment)
Prerequisites:
```bash
pip install transformers peft accelerate torch
```
Loading the Model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model (20B parameters; needs a large GPU)
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapters
model = PeftModel.from_pretrained(base_model, "Wildstash/geo-compliance-gpt-oss-20b")

# Load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
```
Testing (Expect Inconsistent Results):
```python
prompt = """### Instruction:
Analyze the following software feature for geo-compliance.

### Input:
Feature Name: Email Collection
Feature Description: Collecting email addresses from website visitors
Source: Website

### Output:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
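Because responses are often corrupted, it helps to pull out and parse just the JSON object after the `### Output:` marker rather than trusting the raw text. A minimal stdlib-only sketch (the marker name matches the prompt format above):

```python
import json
import re

def extract_json(response):
    """Pull the first JSON object after the '### Output:' marker, if any."""
    tail = response.split("### Output:")[-1]
    match = re.search(r"\{.*?\}", tail, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

sample = 'prefix ### Output: {"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "..."} trailing'
```

When `extract_json` returns `None`, the generation was unusable, which with this model happens often.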
📊 Training Data Issues (Honest Assessment)
What the Training Data Looked Like:
- 741 examples covering GDPR, CCPA, COPPA, DSA
- Mixed quality: Some good, many corrupted
- Inconsistent formats: Different output structures
- Text fragments: Broken examples with random text
Example of Bad Training Data:
```json
"output": {"compliance_flag": "Needs Geo-Compliance", "law": "CA Addiction/Minors", "reason": "PLAW-118publ59.pdf: ...1017 PUBLIC LAW 118-59-MAY 7, 2024 LEGISLATIVE HISTORY-S. 474: CONGRESSIONAL RECORD: Vol. 169 (2023): Dec. 14, considered and passed Senate. Vol. 170 (2024): Apr. 29, considered and... Therefore, 'Notice-and-Action Portal' must implement region-specific controls."}
```
This shows corrupted legal text fragments instead of proper reasoning!
🚨 Current Limitations
Functional Issues:
- Unreliable outputs: Sometimes works, often doesn't
- Corrupted responses: Random text fragments
- Inconsistent quality: Varies greatly between inputs
- Training data leakage: Outputs training examples
Technical Issues:
- Poor convergence: Model didn't learn the task properly
- Data quality: Training examples had mixed formats
- Output formatting: Inconsistent structure
- Legal accuracy: Cannot be trusted for real compliance
💡 What This Teaches Us
Lessons Learned:
- Data quality is CRITICAL - bad data = bad model
- LoRA training works - but needs good examples
- Output consistency matters - mixed formats confuse the model
- Validation is essential - test during training, not just after
What Would Fix This:
- Clean training data with consistent formats
- Better data preprocessing and validation
- More training epochs with quality examples
- Output format standardization across all examples
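Standardizing the output format can be as simple as a key-normalization pass. The examples above mix `law` and `relevant_law`; this sketch maps every record onto one canonical schema (the alias table is an assumption for illustration):

```python
# Canonical schema, per the intended output format; "relevant_law" appears in
# some examples above as a variant spelling of "law".
CANONICAL_KEYS = ("compliance_flag", "law", "reason")
ALIASES = {"relevant_law": "law"}

def normalize(record):
    """Map a training record onto the canonical key set, filling gaps."""
    fixed = {}
    for key, value in record.items():
        fixed[ALIASES.get(key, key)] = value
    # Fill anything still missing so every example has an identical shape.
    for key in CANONICAL_KEYS:
        fixed.setdefault(key, "Unknown")
    return {k: fixed[k] for k in CANONICAL_KEYS}

fixed = normalize({"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "..."})
```

Applied uniformly before training, a pass like this removes one source of the mixed-format confusion described above.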
🎯 Future Improvements
If You Want to Fix This Model:
- Clean the training data - remove corrupted examples
- Standardize output format - consistent structure
- Increase training epochs - more learning time
- Add validation set - monitor training quality
- Better prompt engineering - clearer instructions
Alternative Approaches:
- Start fresh with clean, high-quality data
- Use smaller, cleaner dataset (100-200 perfect examples)
- Implement data validation before training
- Add human feedback during training
📋 Technical Specifications
Hardware Used:
- GPU: NVIDIA RTX Pro 6000 Ada (96GB VRAM)
- Platform: Vast.ai cloud instance
- Storage: 50GB+ available space
- Memory: 96GB VRAM
Software Stack:
- PyTorch: Latest version
- Transformers: 4.30+
- PEFT: 0.4+
- Python: 3.8+
Training Configuration:
- LoRA Rank: 16
- LoRA Alpha: 32
- Learning Rate: 2e-4
- Batch Size: 1 (effective: 8 with accumulation)
- Epochs: 3
- Gradient Checkpointing: Enabled
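For reference, the configuration above roughly corresponds to the following `peft`/`transformers` objects. This is a reconstruction, not the actual training script; in particular, `target_modules` is an assumption, since the modules used in the run are not recorded here.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Reconstruction of the configuration listed above (assumed target_modules).
lora_config = LoraConfig(
    r=16,                              # LoRA rank
    lora_alpha=32,                     # LoRA alpha
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="geo-compliance-lora",
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,     # effective batch size 8
    num_train_epochs=3,
    gradient_checkpointing=True,
)
```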
⚖️ Legal & Ethical Considerations
Important Disclaimers:
- NOT legal advice: This model cannot provide legal guidance
- NOT reliable: Outputs are inconsistent and often incorrect
- NOT production-ready: Use only for research/learning
- NOT compliant: Cannot be trusted for actual compliance work
Use Cases:
- ✅ Research: Understanding LoRA fine-tuning
- ✅ Learning: How NOT to train a model
- ✅ Experimentation: Testing fine-tuning workflows
- ❌ Production: Never use for real compliance analysis
- ❌ Legal work: Cannot replace qualified professionals
🔍 How to Evaluate This Model
Test Scenarios:
- Simple cases: Basic data collection features
- Complex cases: Multi-jurisdictional compliance
- Edge cases: Unusual software features
- Format consistency: Check output structure
Expected Results:
- Consistency: 30-40% (very poor)
- Accuracy: 20-30% (unreliable)
- Format: 60-70% (sometimes maintains structure)
- Usefulness: 10-20% (mostly unusable)
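Format adherence, the one metric above that partially holds up, can be measured mechanically: count how many raw outputs parse as JSON with the expected keys. A stdlib-only sketch (the required keys come from the intended format; the sample strings are illustrative):

```python
import json

def format_rate(outputs):
    """Fraction of raw model outputs that parse to the intended schema."""
    required = {"compliance_flag", "reason"}
    ok = 0
    for raw in outputs:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and required.issubset(data):
            ok += 1
    return ok / len(outputs)

samples = [
    '{"compliance_flag": "Needs Geo-Compliance", "law": "GDPR Article 6", "reason": "..."}',
    '{"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "### BEGINNING OF FILE"}',
    'random fragment, not JSON at all',
]
rate = format_rate(samples)  # 2 of 3 parse and carry the required keys
```

Note this only measures structure; a response can be well-formed and still wrong, which is why accuracy scores far lower than format here.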
📞 Contact & Support
Repository Owner:
- Username: Wildstash
- Purpose: Learning and experimentation
- Status: Active learner, not professional model developer
Support Level:
- Issues: Will respond to technical questions
- Fixes: No guarantees on model improvements
- Updates: May attempt to fix in future
- Production: Cannot provide production support
📚 Educational Value
What This Model Demonstrates:
- LoRA fine-tuning process (successful)
- Importance of data quality (critical lesson)
- Training workflow (complete example)
- Common pitfalls (what to avoid)
- Debugging process (how to identify issues)
Learning Outcomes:
- Technical skills: LoRA implementation
- Data preparation: What NOT to do
- Model evaluation: How to assess quality
- Troubleshooting: Common fine-tuning issues
🏁 Conclusion
This is a LEARNING EXPERIMENT, not a working product.
What We Accomplished:
- ✅ Successfully implemented LoRA fine-tuning
- ✅ Completed training workflow end-to-end
- ✅ Learned valuable lessons about data quality
- ✅ Demonstrated technical implementation
What We Learned:
- ✅ Data quality is more important than quantity
- ✅ Mixed formats confuse the model
- ✅ Validation during training is essential
- ✅ Output consistency requires careful design
Final Assessment:
Technical Success, Quality Failure
The LoRA training worked perfectly, but the resulting model is unreliable due to poor training data quality. This serves as an excellent example of why data preparation is crucial in machine learning.
Use this model for learning and experimentation only. Do NOT rely on it for any real-world compliance analysis or legal work. ⚠️
Last Updated: September 2024
Status: Experimental/Test Run
Quality: Poor - Learning Example Only