---
license: cc-by-nc-4.0
datasets:
- issai/Central_Asian_Food_Dataset
language:
- en
base_model:
- microsoft/swinv2-base-patch4-window16-256
pipeline_tag: image-classification
library_name: transformers
tags:
- classification
- image
- pytorch
- safetensors
co2_eq_emissions:
  emissions: 0.054843
  source: code carbon
  training_type: fine-tuning
  geographical_location: Oregon, USA (45.5999, -121.1871)
  hardware_used: 2x Tesla T4 GPUs, Intel Xeon CPU (4 cores), 31.35 GB RAM
---
# Central Asian Food Classification

## Model Information

- **Base Model**: [microsoft/swinv2-base-patch4-window16-256](https://huggingface.co/microsoft/swinv2-base-patch4-window16-256)
- **Dataset**: [issai/Central_Asian_Food_Dataset](https://huggingface.co/datasets/issai/Central_Asian_Food_Dataset)
- **Library**: `transformers`, `pytorch`
- **Pipeline**: Image Classification
- **License**: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)

## Model Description
- This model classifies images of Central Asian dishes into 42 categories.
- It is fine-tuned on the Central Asian Food Dataset, starting from the Swin Transformer v2 base checkpoint listed above.
- Training ran on 2 Tesla T4 GPUs in Oregon, USA.

## Labels (Classes)

```python
class_names = [
    "achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy",
    "beshbarmak-wo-kazy", "chak-chak", "cheburek", "doner-lavash", "doner-nan",
    "hvorost", "irimshik", "kattama-nan", "kazy-karta", "kurt", "kuyrdak",
    "kymyz-kymyran", "lagman-fried", "lagman-w-soup", "lagman-wo-soup", "manty",
    "naryn", "nauryz-kozhe", "orama", "plov", "samsa", "shashlyk-chicken",
    "shashlyk-chicken-v", "shashlyk-kuskovoi", "shashlyk-kuskovoi-v",
    "shashlyk-minced-meat", "sheep-head", "shelpek", "shorpa", "soup-plain",
    "sushki", "suzbe", "taba-nan", "talkan-zhent", "tushpara-fried",
    "tushpara-w-soup", "tushpara-wo-soup"
]
```
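
The position of each name in `class_names` serves as its integer class index. A minimal sketch of the two lookup tables that are commonly stored in a model config as `id2label` and `label2id`, assuming the list order above matches the label ids used during training (the card does not state this explicitly). Only the first five names are included to keep the example short:

```python
# First five entries of the class list above; use the full list in practice.
class_names = ["achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy"]

# Assumption: list position == class index used during training.
id2label = {idx: name for idx, name in enumerate(class_names)}
label2id = {name: idx for idx, name in id2label.items()}

print(id2label[3])
print(label2id["asip"])
```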
## Training
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./swinv2_classification",
    evaluation_strategy="epoch",  # named `eval_strategy` in newer transformers releases
    save_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10
)
```
```
Epoch  Training Loss  Validation Loss
1      0.815700       0.741029
2      0.454500       0.641849
3      0.100500       0.680114
4      0.030000       0.704669
5      0.009000       0.661318
```
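
In this log the training loss keeps falling while the validation loss reaches its minimum at epoch 2 (0.641849) and fluctuates afterwards. A small sketch that reads the best epoch and the gap between validation and training loss off the numbers above; the names `history` and `gaps` are illustrative only:

```python
# (epoch, training loss, validation loss) copied from the log above
history = [
    (1, 0.815700, 0.741029),
    (2, 0.454500, 0.641849),
    (3, 0.100500, 0.680114),
    (4, 0.030000, 0.704669),
    (5, 0.009000, 0.661318),
]

# Epoch with the lowest validation loss
best_epoch, _, best_val_loss = min(history, key=lambda row: row[2])

# Validation loss minus training loss; a widening gap can indicate overfitting
gaps = {epoch: round(val - train, 6) for epoch, train, val in history}

print(f"best epoch: {best_epoch} (validation loss {best_val_loss:.6f})")
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: val - train = {gap:+.6f}")
```

If the checkpoint with the lowest validation loss is preferred over the final one, `TrainingArguments` accepts `load_best_model_at_end=True` together with `metric_for_best_model`. Whether that option was used for this model is not stated here.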
## Evaluation Metrics

The model achieved **87% accuracy** on the validation set (2,735 images). The summary rows of the classification report are shown below; the columns are precision, recall, F1-score, and support:

```
            accuracy                           0.87      2735
           macro avg       0.86      0.85      0.85      2735
        weighted avg       0.88      0.87      0.87      2735
```
![confusion matrix](matrix.png)
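
The macro average treats every class equally, while the weighted average weights each class by its support (the number of validation images in that class), which is why the two rows can differ. A minimal pure-Python sketch of the difference, using made-up per-class numbers that are not taken from this model's report:

```python
# Hypothetical per-class F1 scores and supports, for illustration only
per_class = {
    "plov":  {"f1": 0.90, "support": 300},
    "manty": {"f1": 0.80, "support": 100},
    "kurt":  {"f1": 0.70, "support": 100},
}

total_support = sum(c["support"] for c in per_class.values())

# Macro average: unweighted mean over classes
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean weighted by each class's support
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

print(f"macro avg F1:    {macro_f1:.2f}")
print(f"weighted avg F1: {weighted_f1:.2f}")
```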

## Environmental Impact

The estimated carbon emissions from training this model:

- **Emissions**: 0.054843 grams CO2
- **Source**: CodeCarbon
- **Training Type**: Fine-tuning
- **Location**: Oregon, USA (45.5999, -121.1871)
- **Hardware Used**: 2x Tesla T4 GPUs, Intel Xeon CPU (4 cores), 31.35 GB RAM

## Usage

To use this model for inference:

```python
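from io import BytesIO

import requests
import torch
from PIL import Image
from transformers import pipeline

# Load the model on the first GPU if one is available, otherwise on the CPU
device = 0 if torch.cuda.is_available() else -1
pipe = pipeline("image-classification", model="Eraly-ml/centraasia-Swinv2", device=device)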

# Image URL
image_url = "https://avatars.mds.yandex.net/get-altay/12813969/2a0000018e10a3da6a2a1d1d2c2745548220/XXXL"

# Download the image from the internet
response = requests.get(image_url, timeout=30)
response.raise_for_status()  # Fail early on HTTP errors
image = Image.open(BytesIO(response.content)).convert("RGB")

# Model classes
class_names = [
    "achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy",
    "beshbarmak-wo-kazy", "chak-chak", "cheburek", "doner-lavash", "doner-nan",
    "hvorost", "irimshik", "kattama-nan", "kazy-karta", "kurt", "kuyrdak",
    "kymyz-kymyran", "lagman-fried", "lagman-w-soup", "lagman-wo-soup", "manty",
    "naryn", "nauryz-kozhe", "orama", "plov", "samsa", "shashlyk-chicken",
    "shashlyk-chicken-v", "shashlyk-kuskovoi", "shashlyk-kuskovoi-v",
    "shashlyk-minced-meat", "sheep-head", "shelpek", "shorpa", "soup-plain",
    "sushki", "suzbe", "taba-nan", "talkan-zhent", "tushpara-fried",
    "tushpara-w-soup", "tushpara-wo-soup"
]

# Make a prediction
predictions = pipe(image)

# Display results with correct labels
for pred in predictions:
    label_id = int(pred["label"].replace("LABEL_", ""))  # Extract the number
    class_name = class_names[label_id]  # Get the class name
    score = pred["score"]  # Probability
    print(f"Class: {class_name}, probability: {score:.4f}")

```
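
The loop above assumes the pipeline returns generic labels of the form `LABEL_<index>`. A hedged sketch of a helper that also tolerates labels that are already readable names or that carry an out-of-range index. The function `decode_label` and the sample predictions are hypothetical, and the class list is truncated for brevity:

```python
def decode_label(raw_label: str, class_names: list[str]) -> str:
    """Map 'LABEL_<index>' to a class name; return any other label unchanged."""
    prefix = "LABEL_"
    if raw_label.startswith(prefix):
        suffix = raw_label[len(prefix):]
        if suffix.isdigit() and int(suffix) < len(class_names):
            return class_names[int(suffix)]
    return raw_label


# Truncated class list and fabricated predictions, for illustration only
class_names = ["achichuk", "airan-katyk", "asip"]
sample_predictions = [
    {"label": "LABEL_2", "score": 0.91},
    {"label": "plov", "score": 0.05},
]

for pred in sample_predictions:
    name = decode_label(pred["label"], class_names)
    print(f"Class: {name}, probability: {pred['score']:.4f}")
```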

## Citation

If you use this model, please cite:

```bibtex
@misc{CentralAsianFood,
  author = {Eraly Gainulla},
  title = {Central Asian Food Classification Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Eraly-ml/centraasia-Swinv2}
}
```