---
license: cc-by-nc-4.0
datasets:
- issai/Central_Asian_Food_Dataset
language:
- en
base_model:
- microsoft/swinv2-base-patch4-window16-256
pipeline_tag: image-classification
library_name: transformers
tags:
- classification
- image
- pytorch
- safetensors
co2_eq_emissions:
  emissions: 0.054843
  source: code carbon
  training_type: fine-tuning
  geographical_location: Oregon, USA (45.5999, -121.1871)
  hardware_used: 2x Tesla T4 GPUs, Intel Xeon CPU (4 cores), 31.35 GB RAM
---
# Central Asian Food Classification

## Model Information

- **Base Model**: [microsoft/swinv2-base-patch4-window16-256](https://huggingface.co/microsoft/swinv2-base-patch4-window16-256)
- **Dataset**: [issai/Central_Asian_Food_Dataset](https://huggingface.co/datasets/issai/Central_Asian_Food_Dataset)
- **Library**: `transformers`, `pytorch`
- **Pipeline**: Image Classification
- **License**: Creative Commons Attribution Non Commercial 4.0

## Model Description
- This model classifies images of Central Asian dishes into 42 categories.
- It is fine-tuned from Swin Transformer v2 (`microsoft/swinv2-base-patch4-window16-256`) on the Central Asian Food Dataset.
- Training ran on 2 Tesla T4 GPUs in Oregon, USA.

## Labels (Classes)

```python
class_names = [
    "achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy",
    "beshbarmak-wo-kazy", "chak-chak", "cheburek", "doner-lavash", "doner-nan",
    "hvorost", "irimshik", "kattama-nan", "kazy-karta", "kurt", "kuyrdak",
    "kymyz-kymyran", "lagman-fried", "lagman-w-soup", "lagman-wo-soup", "manty",
    "naryn", "nauryz-kozhe", "orama", "plov", "samsa", "shashlyk-chicken",
    "shashlyk-chicken-v", "shashlyk-kuskovoi", "shashlyk-kuskovoi-v",
    "shashlyk-minced-meat", "sheep-head", "shelpek", "shorpa", "soup-plain",
    "sushki", "suzbe", "taba-nan", "talkan-zhent", "tushpara-fried",
    "tushpara-w-soup", "tushpara-wo-soup"
]
```
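The order of this list matters: the usage snippet in this card indexes into it with the numeric part of a `LABEL_<n>` prediction. As a minimal sketch, assuming the list order matches the label ids used in training, the two lookup tables could be built like this (`build_label_maps` is a hypothetical helper, and only the first five names are used for brevity):

```python
def build_label_maps(names):
    """Return (id2label, label2id) dictionaries for an ordered list of class names."""
    id2label = {i: name for i, name in enumerate(names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


# First five entries of the class list above; the full list works the same way.
sample_names = ["achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy"]
id2label, label2id = build_label_maps(sample_names)

print(id2label[3])
print(label2id["asip"])
```

Dictionaries of this shape are what a `transformers` model config stores under `id2label` and `label2id`.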
## Training
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./swinv2_classification",
    evaluation_strategy="epoch",  # newer transformers releases name this eval_strategy
    save_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10
)
```
```
Epoch  Training Loss  Validation Loss
1      0.815700       0.741029
2      0.454500       0.641849
3      0.100500       0.680114
4      0.030000       0.704669
5      0.009000       0.661318
```
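Validation loss is lowest at epoch 2 while training loss keeps falling, which suggests the later epochs mostly fit the training set. A small sketch, using only the numbers from the log above, of picking the epoch with the lowest validation loss (the variable names are illustrative):

```python
# (epoch, training loss, validation loss) copied from the log above
history = [
    (1, 0.815700, 0.741029),
    (2, 0.454500, 0.641849),
    (3, 0.100500, 0.680114),
    (4, 0.030000, 0.704669),
    (5, 0.009000, 0.661318),
]

# Select the row whose validation loss (third field) is smallest
best_epoch, _, best_val_loss = min(history, key=lambda row: row[2])
print(f"Lowest validation loss {best_val_loss:.6f} at epoch {best_epoch}")
```

Loss and accuracy do not always move together, so this is only one possible selection criterion.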
## Evaluation Metrics

The model achieved **87% accuracy** on the validation set (2,735 images). The summary rows of the classification report are shown below; the columns are precision, recall, F1-score, and support:

```
            accuracy                           0.87      2735
           macro avg       0.86      0.85      0.85      2735
        weighted avg       0.88      0.87      0.87      2735
```
![confusion matrix](matrix.png)
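
The macro average weights every class equally, while the weighted average weights each class by its support (its number of validation images), which is why the two rows in the report differ. A minimal sketch of that distinction, using made-up per-class values rather than numbers from this model:

```python
def macro_average(scores):
    """Unweighted mean of per-class scores."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean of per-class scores, weighted by each class's support."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


# Hypothetical per-class F1 scores and supports (not taken from this model)
f1_scores = [0.90, 0.60, 0.80]
supports = [100, 20, 80]

print(round(macro_average(f1_scores), 4))
print(round(weighted_average(f1_scores, supports), 4))
```

In this toy example the small class scores worst, so the weighted average comes out higher than the macro average.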

## Environmental Impact

The estimated carbon emissions from training this model:

- **Emissions**: 0.054843 grams CO2
- **Source**: Code Carbon
- **Training Type**: Fine-tuning
- **Location**: Oregon, USA (45.5999, -121.1871)
- **Hardware Used**: 2x Tesla T4 GPUs, Intel Xeon CPU (4 cores), 31.35 GB RAM

## Usage

To use this model for inference:

```python
import requests
from io import BytesIO
from PIL import Image
from transformers import pipeline

# Load the model (device=0 selects the first GPU; use device=-1 to run on CPU)
pipe = pipeline("image-classification", model="Eraly-ml/centraasia-Swinv2", device=0)

# Image URL
image_url = "https://avatars.mds.yandex.net/get-altay/12813969/2a0000018e10a3da6a2a1d1d2c2745548220/XXXL"

# Download the image and fail early on HTTP errors
response = requests.get(image_url, timeout=30)
response.raise_for_status()
image = Image.open(BytesIO(response.content)).convert("RGB")

# Model classes
class_names = [
    "achichuk", "airan-katyk", "asip", "bauyrsak", "beshbarmak-w-kazy",
    "beshbarmak-wo-kazy", "chak-chak", "cheburek", "doner-lavash", "doner-nan",
    "hvorost", "irimshik", "kattama-nan", "kazy-karta", "kurt", "kuyrdak",
    "kymyz-kymyran", "lagman-fried", "lagman-w-soup", "lagman-wo-soup", "manty",
    "naryn", "nauryz-kozhe", "orama", "plov", "samsa", "shashlyk-chicken",
    "shashlyk-chicken-v", "shashlyk-kuskovoi", "shashlyk-kuskovoi-v",
    "shashlyk-minced-meat", "sheep-head", "shelpek", "shorpa", "soup-plain",
    "sushki", "suzbe", "taba-nan", "talkan-zhent", "tushpara-fried",
    "tushpara-w-soup", "tushpara-wo-soup"
]

# Make a prediction
predictions = pipe(image)

# The pipeline returns generic labels such as "LABEL_7"; map them to class names
for pred in predictions:
    label_id = int(pred["label"].replace("LABEL_", ""))  # Extract the class index
    class_name = class_names[label_id]  # Look up the class name
    score = pred["score"]  # Predicted probability
    print(f"Class: {class_name}, probability: {score:.4f}")
```
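
The label handling in the loop above can be factored into a helper that is easy to check without loading the model. The sketch below assumes the pipeline output is a list of dicts with `label` and `score` keys, as in the snippet above; `decode_predictions` is a hypothetical helper, and the sample predictions are invented for illustration.

```python
def decode_predictions(predictions, class_names):
    """Map pipeline output to (class_name, score) pairs, highest score first.

    Labels of the form "LABEL_<n>" are looked up in class_names; any other
    label is assumed to already be a readable class name and is kept as-is.
    """
    decoded = []
    for pred in predictions:
        label = pred["label"]
        if label.startswith("LABEL_"):
            label = class_names[int(label[len("LABEL_"):])]
        decoded.append((label, pred["score"]))
    return sorted(decoded, key=lambda pair: pair[1], reverse=True)


# Invented example output; real scores come from pipe(image)
names = ["achichuk", "airan-katyk", "asip"]
fake_predictions = [
    {"label": "LABEL_2", "score": 0.10},
    {"label": "LABEL_0", "score": 0.85},
]

for name, score in decode_predictions(fake_predictions, names):
    print(f"Class: {name}, probability: {score:.4f}")
```

With the real model, `decode_predictions(pipe(image), class_names)` would be called with the full 42-entry list.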

## Citation

If you use this model, please cite:

```bibtex
@misc{CentralAsianFood,
  author = {Eraly Gainulla},
  title = {Central Asian Food Classification Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Eraly-ml/centraasia-Swinv2}
}
```