feat: upload model files
- README.md +30 -61
- config.json +5 -2
- model-00001-of-00002.safetensors +1 -1
- model-00002-of-00002.safetensors +1 -1
- special_tokens_map.json +1 -7
- tokenizer_config.json +1 -3
README.md CHANGED
@@ -4,33 +4,35 @@ language:
 - en
 base_model:
 - meta-llama/Llama-3.2-3B-Instruct
-pipeline_tag: text-
+pipeline_tag: text-generation
 tags:
 - Speech Recognition
 - ATC
+- Unsloth
+- LoRA-Merged
 ---

-# ATC Communication Expert Model
+# ATC Communication Expert Model (Merged)

-A fine-tuned model specialized in improving and analyzing Air Traffic Control (ATC) communications,
+A fine-tuned model specialized in improving and analyzing Air Traffic Control (ATC) communications, with LoRA adapters merged into the base model.

 ## Model Details

 ### Model Description

-This model is a fine-tuned version of Llama-3.2-3B-Instruct optimized for processing Air Traffic Control communications. It can:
+This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct with merged LoRA adapters, optimized for processing Air Traffic Control communications. It can:

 - Improve raw ATC transcripts with proper punctuation and formatting
 - Identify communication intentions (pilot requests, ATC instructions, etc.)
 - Extract key information such as flight numbers, altitudes, headings, and other numerical data
 - Analyze speaker roles and communication patterns

-The model was
+The model was created by merging LoRA adapters (fine-tuned on ATC communications) into the Llama 3B base model, creating a unified model optimized for this specialized domain.

 - **Developed by:** ATC NLP Team
-- **Model type:**
+- **Model type:** Llama 3B with merged LoRA adapters
 - **Language(s):** English, specialized for ATC terminology
 - **License:** Same as the base model
 - **Finetuned from model:** meta-llama/Llama-3.2-3B-Instruct

 ## Uses
@@ -80,21 +82,15 @@ This model is not suitable for:
 ## How to Get Started with the Model

 ```python
-from peft import PeftModel
 from transformers import AutoModelForCausalLM, AutoTokenizer

-# Load the model
-base_model = AutoModelForCausalLM.from_pretrained(
-    "meta-llama/Llama-3.2-3B-Instruct",
+# Load the model and tokenizer
+model = AutoModelForCausalLM.from_pretrained(
+    "atc_llama_merged",
     torch_dtype="auto",
     device_map="auto"
 )
-tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
-model = PeftModel.from_pretrained(base_model, "path_to_adapters")
-
-# Alternatively, use the merged model if available
-# model = AutoModelForCausalLM.from_pretrained("path_to_merged_model")
-# tokenizer = AutoTokenizer.from_pretrained("path_to_merged_model")
+tokenizer = AutoTokenizer.from_pretrained("atc_llama_merged")

 # Process an ATC message
 instruction = "As an ATC communication expert, improve this transcript and analyze its intentions and data."
@@ -109,45 +105,27 @@ response = tokenizer.decode(outputs[0, inputs["input_ids"].shape[1]:], skip_spec
 print(response)
 ```

-## Training Details
-
-### Training Data
-
-- Original raw transcripts
-- Properly punctuated and formatted versions
-- Annotated intentions (PSC, PSR, PRP, PRQ, PRB, PAC, ASC, AGI, ACR, END)
-- Extracted numerical data (altitudes, headings, flight numbers, etc.)
-- Speaker and listener information
-
-### Training Procedure
-
-- Parameter-efficient fine-tuning using PEFT
-- LoRA applied to key attention layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
-- Optimized with Unsloth for efficiency
-
-#### Training Hyperparameters
-
-- **Gradient accumulation steps:** 4
-- **Epochs:** 3
-- **Warmup ratio:** 0.03
-- **Max sequence length:** 2048
-- **Training regime:** BF16 mixed precision where available, FP16 otherwise
-- **Optimizer:** AdamW 8-bit
+## Model Creation Process
+
+### Base Model and Adapters
+
+- **Base model:** meta-llama/Llama-3.2-3B-Instruct
+- **Adapter source:** LoRA adapters fine-tuned on ATC communications data
+- **Merge method:** PEFT adapter merging into base model weights
+
+### Merging Procedure
+
+The model creation involved:
+1. Loading the base Llama 3B model
+2. Loading LoRA adapters fine-tuned on ATC communications data
+3. Merging the adapters into the base model's weights
+4. Saving the resulting unified model

 ## Evaluation

-### Testing
-
-#### Testing Data
-
-The model was tested on a diverse set of ATC communications, including:
+### Testing
+
+The model should be tested on diverse ATC communications, including:
 - Clearances and instructions
 - Pilot requests and reports
 - Emergency communications
@@ -157,19 +135,10 @@ The model was tested on a diverse set of ATC communications, including:

 ### Model Architecture and Objective

-- **Base architecture:** Llama-3.2-3B-Instruct
-- **
-- **Optimization library:** Unsloth
+- **Base architecture:** meta-llama/Llama-3.2-3B-Instruct
+- **Adaptation method:** LoRA adapters merged into base weights
 - **Training objective:** Improving and analyzing ATC communications

-###
-
-- **Framework versions:**
-  - PEFT 0.15.2
-  - Unsloth (latest version used during training)
-  - Transformers (compatible with the base model)
-  - PyTorch (with BF16 support where available)
-
-## Model Card Contact
+### Model Card Contact

-For issues or questions about this model, please open
+For issues or questions about this model, please open a discussion in the repository.
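The "Merging Procedure" list in the updated README describes four steps without code. Below is a minimal sketch of how those steps typically look with PEFT's `merge_and_unload`, reusing the `path_to_adapters` placeholder from the previous README (the commit does not name the actual adapter repository):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Load the base Llama 3B model
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype="auto",
)

# 2. Load the LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base, "path_to_adapters")  # placeholder path

# 3. Merge the adapter weights into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()

# 4. Save the resulting unified model (plus tokenizer) for standalone loading
merged.save_pretrained("atc_llama_merged")
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct").save_pretrained("atc_llama_merged")
```

The saved directory can then be loaded with plain `AutoModelForCausalLM.from_pretrained("atc_llama_merged")`, exactly as the README's usage example does, with no PEFT dependency at inference time.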
config.json CHANGED
@@ -5,7 +5,11 @@
   "attention_bias": false,
   "attention_dropout": 0.0,
   "bos_token_id": 128000,
-  "eos_token_id":
+  "eos_token_id": [
+    128001,
+    128008,
+    128009
+  ],
   "head_dim": 128,
   "hidden_act": "silu",
   "hidden_size": 3072,
@@ -31,7 +35,6 @@
   "tie_word_embeddings": true,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.51.3",
-  "unsloth_fixed": true,
   "unsloth_version": "2025.3.19",
   "use_cache": true,
   "vocab_size": 128256
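The `eos_token_id` change above replaces a single terminator with the standard Llama 3.x list, so generation can stop on any of `<|end_of_text|>` (128001), `<|eom_id|>` (128008), or `<|eot_id|>` (128009). `transformers` accepts both forms; a quick check, assuming the `atc_llama_merged` directory from the README example:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("atc_llama_merged")  # path from the README example
print(config.eos_token_id)  # [128001, 128008, 128009]

# generate() likewise takes the list form directly:
# outputs = model.generate(**inputs, eos_token_id=[128001, 128008, 128009])
```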
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1cee95c58786333e8e640f19307d0cfebc5d5ff7894a65954b6dfd6bb13c4efc
 size 4965799096
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fc6b47057cbb231d759b93d77e6d392e96c21acf3a51aeff2f72dc497f3413bf
 size 1459729952
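Both `.safetensors` entries are Git LFS pointer files: the repository tracks only the `oid sha256:` digest and byte `size` (about 4.97 GB and 1.46 GB here), while the real shards are fetched separately. A downloaded shard can be verified against its pointer, for example:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB shards never sit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Digests should match the oids recorded in the pointers above
assert sha256_of("model-00001-of-00002.safetensors") == (
    "1cee95c58786333e8e640f19307d0cfebc5d5ff7894a65954b6dfd6bb13c4efc"
)
assert sha256_of("model-00002-of-00002.safetensors") == (
    "fc6b47057cbb231d759b93d77e6d392e96c21acf3a51aeff2f72dc497f3413bf"
)
```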
special_tokens_map.json CHANGED
@@ -13,11 +13,5 @@
     "rstrip": false,
     "single_word": false
   },
-  "pad_token": {
-    "content": "<|finetune_right_pad_id|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  }
+  "pad_token": "<|finetune_right_pad_id|>"
 }
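The `pad_token` entry collapses from a full token object to a plain string; the dropped flags (`lstrip`, `rstrip`, `normalized`, `single_word`) match the usual defaults for special tokens, so loading should resolve to the same token either way. A quick sanity check, again assuming the `atc_llama_merged` directory:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("atc_llama_merged")  # path from the README example
print(tokenizer.pad_token)     # <|finetune_right_pad_id|>
print(tokenizer.pad_token_id)  # id resolved from the vocabulary
```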
tokenizer_config.json CHANGED
@@ -1,5 +1,4 @@
 {
-  "add_bos_token": true,
   "added_tokens_decoder": {
     "128000": {
       "content": "<|begin_of_text|>",
@@ -2062,6 +2061,5 @@
   "model_max_length": 131072,
   "pad_token": "<|finetune_right_pad_id|>",
   "padding_side": "left",
-  "tokenizer_class": "PreTrainedTokenizer",
-  "unk_token": null
+  "tokenizer_class": "PreTrainedTokenizer"
 }
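Note that `padding_side` stays `"left"`, which is what decoder-only batched generation needs: right padding would leave pad tokens between each prompt and its continuation. A batched-inference sketch under that setting (hypothetical `atc_llama_merged` path, illustrative transcripts, chat template omitted for brevity):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("atc_llama_merged", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("atc_llama_merged")  # padding_side="left"

prompts = [  # illustrative ATC-style transcripts, not from the training data
    "Improve this transcript: delta four five six climb and maintain flight level three five zero",
    "Improve this transcript: cessna one two three four five taxi to runway two seven",
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

for row in outputs:
    # With left padding, every prompt ends at the same index, so this slice
    # isolates just the newly generated tokens for each batch element.
    print(tokenizer.decode(row[inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```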