# SIB200 CDA Model with Llama

This model was trained on the SIB200 topic-classification dataset using Counterfactual Data Augmentation (CDA), with the counterfactual examples generated by Llama.
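
This card does not ship the augmentation code, so the sketch below is only a rough illustration of how Llama-generated counterfactuals are typically folded into a training set. The Llama checkpoint id, prompt wording, sampling settings, and column names are all assumptions, not the exact setup behind this model.

```python
# Rough illustration of the CDA step: prompt a Llama model for a
# counterfactual variant of each training example and add it to the
# training set. The checkpoint id, prompt wording, and sampling settings
# are assumptions, not the exact setup behind this card.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # assumed (gated) checkpoint
)

def counterfactual(text: str, label: str) -> str:
    """Ask the generator for a counterfactual rewrite of one example."""
    prompt = (
        f"Rewrite the following sentence so it still belongs to the topic "
        f"'{label}' but uses different entities and phrasing:\n{text}\nRewrite:"
    )
    out = generator(prompt, max_new_tokens=128, do_sample=True,
                    temperature=0.8)[0]["generated_text"]
    return out[len(prompt):].strip()

def augment(dataset):
    """Yield each original example followed by its counterfactual."""
    for ex in dataset:
        yield ex
        yield {"text": counterfactual(ex["text"], ex["category"]),
               "category": ex["category"]}
```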

## Training Parameters

- Dataset: SIB200
- Mode: CDA
- Selection Model: Llama
- Selection Method: Random
- Train Size: 700 examples
- Epochs: 20
- Batch Size: 8
- Effective Batch Size: 32 (batch_size * gradient_accumulation_steps)
- Learning Rate: 8e-06
- Patience: 8 (early stopping)
- Max Length: 192 tokens
- Gradient Accumulation Steps: 4
- Warmup Ratio: 0.1
- Weight Decay: 0.01
- Optimizer: AdamW
- Scheduler: cosine_with_warmup
- Random Seed: 42
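
These parameters map directly onto a standard `transformers` fine-tuning setup; the sketch below shows one plausible configuration. The dataset id, column names, output path, and data wiring are assumptions (the 700-example CDA-augmented split is not published here), so treat this as an illustration rather than the exact training script.

```python
# One plausible mapping of the parameters above onto a transformers
# Trainer. The dataset id, column names, and output path are assumptions,
# not the exact script behind this card.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# Stand-in data wiring: the card does not publish the 700-example
# CDA-augmented split, so this loads plain SIB200 English instead.
raw = load_dataset("Davlan/sib200", "eng_Latn")  # assumed dataset id
label2id = {c: i for i, c in enumerate(sorted(set(raw["train"]["category"])))}

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=192)
    enc["labels"] = [label2id[c] for c in batch["category"]]
    return enc

train_dataset = raw["train"].map(tokenize, batched=True)
eval_dataset = raw["validation"].map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(label2id)
)

args = TrainingArguments(
    output_dir="sib200-cda-llama",       # placeholder path
    num_train_epochs=20,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,       # effective batch size: 8 * 4 = 32
    learning_rate=8e-6,
    warmup_ratio=0.1,
    weight_decay=0.01,
    lr_scheduler_type="cosine",          # cosine schedule with warmup
    seed=42,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,         # required by early stopping
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,                 # enables dynamic padding
    callbacks=[EarlyStoppingCallback(early_stopping_patience=8)],
)
trainer.train()
```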

## Performance

- Overall Accuracy: 78.45%
- Overall Loss: 0.0194

## Language-Specific Performance

- English (EN): 83.84%
- German (DE): 87.88%
- Arabic (AR): 56.57%
- Spanish (ES): 88.89%
- Hindi (HI): 79.80%
- Swahili (SW): 73.74%
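
For reference, per-language scores like those above are typically obtained by scoring the fine-tuned checkpoint on each language's test split separately. The sketch below assumes FLORES-style SIB200 config names, a placeholder checkpoint path, and that `id2label` was set during training; none of these details are confirmed by this card.

```python
# Hypothetical sketch of the per-language evaluation above: score the
# fine-tuned checkpoint on each language's SIB200 test split. Config
# names, the checkpoint path, and column names are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CKPT = "sib200-cda-llama"  # placeholder path to the fine-tuned model
LANGS = {
    "EN": "eng_Latn", "DE": "deu_Latn", "AR": "arb_Arab",
    "ES": "spa_Latn", "HI": "hin_Deva", "SW": "swh_Latn",
}

tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForSequenceClassification.from_pretrained(CKPT).eval()

@torch.no_grad()
def accuracy(split) -> float:
    """Fraction of examples whose argmax prediction matches the gold label."""
    hits = 0
    for ex in split:
        inputs = tokenizer(ex["text"], truncation=True, max_length=192,
                           return_tensors="pt")
        pred = model(**inputs).logits.argmax(dim=-1).item()
        hits += int(model.config.id2label[pred] == ex["category"])
    return hits / len(split)

for code, cfg in LANGS.items():
    test = load_dataset("Davlan/sib200", cfg, split="test")
    print(f"{code}: {accuracy(test):.2%}")
```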

## Model Information

- Base Model: bert-base-multilingual-cased
- Task: Topic Classification
- Languages: 6 (EN, DE, AR, ES, HI, SW)
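
For completeness, a minimal usage sketch with the `text-classification` pipeline; the model id below is a placeholder for wherever this checkpoint is hosted.

```python
# Minimal usage sketch; the model id is a placeholder for wherever this
# checkpoint is hosted.
from transformers import pipeline

classifier = pipeline("text-classification", model="sib200-cda-llama")
print(classifier("The government announced new tax reforms yesterday."))
# e.g. [{'label': 'politics', 'score': 0.97}]
```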