Geo-Compliance GPT-OSS 20B - EXPERIMENTAL/TEST RUN

⚠️ IMPORTANT DISCLAIMER

This is an EXPERIMENTAL model from a test run. It has significant limitations and is NOT production-ready.

What Actually Happened:

  • βœ… LoRA training completed successfully
  • βœ… Model learned basic structure (compliance_flag, law, reason)
  • ❌ Output quality is inconsistent and often corrupted
  • ❌ Training data had quality issues (mixed formats, corrupted examples)
  • ❌ Model sometimes outputs training data fragments instead of analysis

Current Status:

This is a LEARNING EXPERIMENT, not a working product.


🎯 What This Model Was Supposed To Do

Goal: Fine-tune GPT-OSS 20B to analyze software features for geo-compliance requirements.

Intended Output Format:

{
  "compliance_flag": "Needs Geo-Compliance",
  "law": "GDPR Article 6",
  "reason": "Processing personal data requires legal basis..."
}

πŸ“‹ What Actually Happens

βœ… What Works:

  • Consistent output structure (maintains format)
  • Basic task understanding (recognizes compliance analysis)
  • Some legal knowledge (mentions GDPR, COPPA, etc.)

❌ What Doesn't Work:

  • Inconsistent compliance flags (often "Unknown")
  • Corrupted outputs (random text fragments)
  • Poor reasoning quality (generic responses)
  • Training data contamination (outputs training examples)

πŸ” Example of Broken Output:

{"compliance_flag":"Unknown","relevant_law":"N/A","reason":"***\n\n\n\n### BEGINNING OF FILE### ENDING OF FILE### BEGINNING OF FILE### ENDING OF FILE..."}

This shows the model is regurgitating training data instead of analyzing!


��️ Technical Details (The Real Story)

Model Architecture:

  • Base Model: openai/gpt-oss-20b (20B parameters)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Adapter Size: 31.8MB
  • Training Examples: 741 (but with quality issues)

Training Process:

  • Duration: ~2 hours on NVIDIA RTX Pro 6000 Ada (96GB VRAM)
  • Method: LoRA fine-tuning with 3 epochs
  • Hardware: Vast.ai GPU instance
  • Status: COMPLETED but with poor results

What Went Wrong:

  1. Training Data Quality: Mixed formats, corrupted examples
  2. Data Contamination: Some examples had broken text
  3. Insufficient Training: 3 epochs may not have been enough
  4. Data Preprocessing: Inconsistent formatting across examples (illustrated below)
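For example, the two output snippets shown elsewhere in this card already disagree on key names ("law" vs. "relevant_law"). A hypothetical illustration of the kind of schema drift present in the training set:

{"compliance_flag": "Needs Geo-Compliance", "law": "GDPR Article 6", "reason": "Processing personal data requires legal basis..."}
{"compliance_flag": "Unknown", "relevant_law": "N/A", "reason": "..."}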

πŸš€ How to Use (If You Want to Experiment)

Prerequisites:

pip install "transformers>=4.30" "peft>=0.4" accelerate torch

Loading the Model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model (large download; "auto" picks a dtype and
# spreads layers across available GPUs via accelerate)
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)

# Apply the LoRA adapters from this repository
model = PeftModel.from_pretrained(base_model, "Wildstash/geo-compliance-gpt-oss-20b")

# Load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

Testing (Expect Inconsistent Results):

prompt = """### Instruction:
Analyze the following software feature for geo-compliance.

### Input:
Feature Name: Email Collection
Feature Description: Collecting email addresses from website visitors
Source: Website

### Output:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Cap newly generated tokens; max_length would count the prompt against the budget
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
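Because outputs are frequently corrupted, it is worth parsing them defensively. A minimal sketch (the extract_json helper is illustrative, not part of this repo, and assumes the intended flat, non-nested output object):

import json
import re

def extract_json(text: str):
    """Best-effort extraction of the first {...} object from model output."""
    match = re.search(r"\{.*?\}", text, re.DOTALL)  # non-greedy: fine for flat objects
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

result = extract_json(response)
if result is None:
    print("No parsable JSON in the output (common with this checkpoint)")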

πŸ“Š Training Data Issues (Honest Assessment)

What the Training Data Looked Like:

  • 741 examples covering GDPR, CCPA, COPPA, DSA
  • Mixed quality: Some good, many corrupted
  • Inconsistent formats: Different output structures
  • Text fragments: Broken examples with random text

Example of Bad Training Data:

"output": {"compliance_flag": "Needs Geo-Compliance", "law": "CA Addiction/Minors", "reason": "PLAW-118publ59.pdf: ...1017 PUBLIC LAW 118–59β€”MAY 7, 2024 LEGISLATIVE HISTORYβ€”S. 474: CONGRESSIONAL RECORD: Vol. 169 (2023): Dec. 14, considered and passed Senate. Vol. 170 (2024): Apr. 29, considered and... Therefore, 'Notice-and-Action Portal' must implement region-specific controls."}

This shows corrupted legal text fragments instead of proper reasoning!


🚨 Current Limitations

Functional Issues:

  • Unreliable outputs: Sometimes works, often doesn't
  • Corrupted responses: Random text fragments
  • Inconsistent quality: Varies greatly between inputs
  • Training data leakage: Outputs training examples

Technical Issues:

  • Poor convergence: Model didn't learn the task properly
  • Data quality: Training examples had mixed formats
  • Output formatting: Inconsistent structure
  • Legal accuracy: Cannot be trusted for real compliance

πŸ’‘ What This Teaches Us

Lessons Learned:

  1. Data quality is CRITICAL - bad data = bad model
  2. LoRA training works - but needs good examples
  3. Output consistency matters - mixed formats confuse the model
  4. Validation is essential - test during training, not just after

What Would Fix This:

  1. Clean training data with consistent formats
  2. Better data preprocessing and validation (see the sketch below)
  3. More training epochs with quality examples
  4. Output format standardization across all examples
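A minimal sketch of what such validation could look like, assuming the training set is a JSONL file whose "output" field holds the JSON shown earlier (the file name and field layout are assumptions):

import json

REQUIRED_KEYS = {"compliance_flag", "law", "reason"}
CORRUPTION_MARKERS = ("### BEGINNING OF FILE", "### ENDING OF FILE")

def is_clean(line: str) -> bool:
    """Return True if a JSONL training example parses and looks uncorrupted."""
    try:
        example = json.loads(line)
        output = example["output"]
        if isinstance(output, str):  # some examples may store output as a JSON string
            output = json.loads(output)
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    if not isinstance(output, dict) or not REQUIRED_KEYS.issubset(output):
        return False  # schema drift, e.g. "relevant_law" instead of "law"
    reason = str(output.get("reason", ""))
    return not any(marker in reason for marker in CORRUPTION_MARKERS)

with open("train.jsonl") as f:  # assumed file name
    lines = f.readlines()
clean = [line for line in lines if is_clean(line)]
print(f"{len(clean)}/{len(lines)} examples pass validation")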

🎯 Future Improvements

If You Want to Fix This Model:

  1. Clean the training data - remove corrupted examples
  2. Standardize output format - consistent structure
  3. Increase training epochs - more learning time
  4. Add validation set - monitor training quality
  5. Better prompt engineering - clearer instructions (example below)
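As an illustration of item 5, a more explicit prompt might pin the schema down (the wording is a suggestion, not a tested template):

prompt = """### Instruction:
Analyze the software feature below for geo-compliance requirements.
Respond with ONLY a JSON object containing exactly these keys:
"compliance_flag", "law", "reason".

### Input:
Feature Name: Email Collection
Feature Description: Collecting email addresses from website visitors
Source: Website

### Output:"""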

Alternative Approaches:

  1. Start fresh with clean, high-quality data
  2. Use smaller, cleaner dataset (100-200 perfect examples)
  3. Implement data validation before training
  4. Add human feedback during training

πŸ“ Technical Specifications

Hardware Used:

  • GPU: NVIDIA RTX Pro 6000 Ada (96GB VRAM)
  • Platform: Vast.ai cloud instance
  • Storage: 50GB+ available space
  • Memory: 96GB VRAM

Software Stack:

  • PyTorch: not pinned (latest at training time)
  • Transformers: 4.30+
  • PEFT: 0.4+
  • Python: 3.8+

Training Configuration (sketched in code below):

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Learning Rate: 2e-4
  • Batch Size: 1 (effective: 8 with accumulation)
  • Epochs: 3
  • Gradient Checkpointing: Enabled
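A sketch of how this configuration might be expressed with peft and transformers (dropout, target modules, and precision are assumptions; they are not recorded above):

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                    # LoRA rank, per the list above
    lora_alpha=32,           # LoRA alpha, per the list above
    lora_dropout=0.05,       # assumption: not recorded
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="geo-compliance-lora",
    learning_rate=2e-4,               # per the list above
    per_device_train_batch_size=1,    # effective batch size 8 via accumulation
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    gradient_checkpointing=True,
    bf16=True,                        # assumption: typical on this hardware
)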

βš–οΈ Legal & Ethical Considerations

Important Disclaimers:

  • NOT legal advice: This model cannot provide legal guidance
  • NOT reliable: Outputs are inconsistent and often incorrect
  • NOT production-ready: Use only for research/learning
  • NOT compliant: Cannot be trusted for actual compliance work

Use Cases:

  • βœ… Research: Understanding LoRA fine-tuning
  • βœ… Learning: How NOT to train a model
  • βœ… Experimentation: Testing fine-tuning workflows
  • ❌ Production: Never use for real compliance analysis
  • ❌ Legal work: Cannot replace qualified professionals

πŸ” How to Evaluate This Model

Test Scenarios:

  1. Simple cases: Basic data collection features
  2. Complex cases: Multi-jurisdictional compliance
  3. Edge cases: Unusual software features
  4. Format consistency: Check output structure

Expected Results (rough estimates; a measurement sketch follows the list):

  • Consistency: 30-40% (very poor)
  • Accuracy: 20-30% (unreliable)
  • Format: 60-70% (sometimes maintains structure)
  • Usefulness: 10-20% (mostly unusable)
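A minimal sketch of how the format-consistency figure could be measured, reusing the model and tokenizer loaded earlier (the test scenarios here are placeholders):

import json
import re

test_features = [
    ("Email Collection", "Collecting email addresses from website visitors"),
    ("Location Tracking", "Recording user GPS coordinates for analytics"),
]  # placeholders; expand to cover the scenarios listed above

def is_well_formed(text: str) -> bool:
    """Does the output contain a parsable object with the expected keys?"""
    match = re.search(r"\{.*?\}", text, re.DOTALL)
    if not match:
        return False
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return False
    return {"compliance_flag", "law", "reason"} <= set(obj)

well_formed = 0
for name, desc in test_features:
    prompt = (
        "### Instruction:\nAnalyze the following software feature for geo-compliance.\n\n"
        f"### Input:\nFeature Name: {name}\nFeature Description: {desc}\nSource: Website\n\n"
        "### Output:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    well_formed += is_well_formed(text)

print(f"Format consistency: {well_formed / len(test_features):.0%}")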

πŸ“ž Contact & Support

Repository Owner:

  • Username: Wildstash
  • Purpose: Learning and experimentation
  • Status: Active learner, not professional model developer

Support Level:

  • Issues: Will respond to technical questions
  • Fixes: No guarantees on model improvements
  • Updates: May attempt to fix in future
  • Production: Cannot provide production support

πŸŽ“ Educational Value

What This Model Demonstrates:

  1. LoRA fine-tuning process (successful)
  2. Importance of data quality (critical lesson)
  3. Training workflow (complete example)
  4. Common pitfalls (what to avoid)
  5. Debugging process (how to identify issues)

Learning Outcomes:

  • Technical skills: LoRA implementation
  • Data preparation: What NOT to do
  • Model evaluation: How to assess quality
  • Troubleshooting: Common fine-tuning issues

🏁 Conclusion

This is a LEARNING EXPERIMENT, not a working product.

What We Accomplished:

  • βœ… Successfully implemented LoRA fine-tuning
  • βœ… Completed training workflow end-to-end
  • βœ… Learned valuable lessons about data quality
  • βœ… Demonstrated technical implementation

What We Learned:

  • ❌ Data quality is more important than quantity
  • ❌ Mixed formats confuse the model
  • ❌ Validation during training is essential
  • ❌ Output consistency requires careful design

Final Assessment:

Technical Success, Quality Failure

The LoRA training pipeline ran end-to-end, but the resulting model is unreliable due to poor training data quality. This serves as a clear example of why data preparation is crucial in machine learning.


Use this model for learning and experimentation only. Do NOT rely on it for any real-world compliance analysis or legal work. ⚠️


Last Updated: September 2024
Status: Experimental/Test Run
Quality: Poor - Learning Example Only
