---

license: bsd-3-clause
datasets:
- AnishJoshi/nl2bash-custom
language:
- en
metrics:
- sacrebleu
base_model:
- Salesforce/codet5p-220m-bimodal
pipeline_tag: text2text-generation
tags:
- Python
- PyTorch
- Transformers
- english-to-bash
- nl2bash
- nl2cmd
---


# NL to Bash Translator

This model is a fine-tuned version of `codet5p-220m-bimodal` for translating natural language (NL) commands into Bash code. It simplifies command-line usage by letting users describe a task in plain English and generating the corresponding Bash command.

## Model Overview

- **Task:** Natural Language to Bash Code Translation
- **Base Model:** codet5p-220m-bimodal
- **Training Focus:** Accurate command translation and efficient execution

## Dataset Description

The dataset used for training consists of natural language and Bash code pairs:

- **Training Set:** 19,658 samples
- **Validation Set:** 2,457 samples
- **Test Set:** 2,458 samples
- **Total:** 24,573 samples

Each sample contains:
- Natural language command (`nl_command`)
- Corresponding Bash code (`bash_code`)
- Serial number (`srno`)
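
For reference, the dataset can be pulled straight from the Hugging Face Hub. The split names in this minimal sketch are assumptions; check the dataset card if your copy uses different ones.

```python
from datasets import load_dataset

# Load the NL-to-Bash pairs from the Hub (split names assumed).
dataset = load_dataset("AnishJoshi/nl2bash-custom")

sample = dataset["train"][0]
print(sample["nl_command"])  # natural language description of the task
print(sample["bash_code"])   # target Bash command
```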

## Training Setup

### Training Parameters

- **Learning Rate:** 5e-5
- **Batch Size:** 8 (training), 16 (evaluation)
- **Number of Epochs:** 5
- **Warmup Steps:** 500
- **Gradient Accumulation Steps:** 2
- **Weight Decay:** 0.01
- **Evaluation Strategy:** End of each epoch
- **Mixed Precision:** Enabled (FP16)
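
These hyperparameters map onto a `Seq2SeqTrainingArguments` configuration roughly like the sketch below. It is illustrative rather than the exact training script; the output directory is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./nl2bash-codet5p",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    warmup_steps=500,
    gradient_accumulation_steps=2,
    weight_decay=0.01,
    evaluation_strategy="epoch",      # "eval_strategy" on recent transformers releases
    fp16=True,                        # mixed precision
    predict_with_generate=True,       # generate sequences during evaluation for BLEU
)
```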

### Optimizer and Scheduler

- **Optimizer:** AdamW
- **Scheduler:** Linear learning rate with warmup
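
Outside of the `Trainer` API, the same optimizer and schedule can be built explicitly. A minimal sketch, assuming the base checkpoint loads with `trust_remote_code=True` and estimating the total step count from the batch settings above:

```python
import torch
from transformers import AutoModel, get_linear_schedule_with_warmup

model = AutoModel.from_pretrained("Salesforce/codet5p-220m-bimodal", trust_remote_code=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

# Roughly (19,658 samples / effective batch size 16) steps per epoch, over 5 epochs.
num_training_steps = (19_658 // 16) * 5
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=num_training_steps
)
```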

### Training Workflow

- Tokenization and processing to fit model input requirements
- Data Collator: `DataCollatorForSeq2Seq`
- Evaluation Metric: BLEU score
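
The workflow items above correspond roughly to the following sketch. It is illustrative rather than the exact training code: the `bash:` prefix mirrors the inference example later in this card, the maximum sequence lengths are assumptions, and the BLEU metric shown is the Hugging Face `evaluate` implementation, whose output fields match the columns reported in the next section.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5p-220m-bimodal")
bleu = evaluate.load("bleu")

def preprocess(example):
    # Prefix the NL command and tokenize inputs and targets (max lengths assumed).
    model_inputs = tokenizer("bash: " + example["nl_command"], truncation=True, max_length=128)
    labels = tokenizer(example["bash_code"], truncation=True, max_length=256)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)  # undo label masking
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds, references=[[ref] for ref in decoded_labels])
    return {"bleu": result["bleu"]}

data_collator = DataCollatorForSeq2Seq(tokenizer)
```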

### Training Performance

| Epoch | Training Loss | Validation Loss | BLEU | Precisions (1-4 gram) | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|-------|---------------|-----------------|------|-----------------------|-----------------|--------------|--------------------|------------------|
| 1 | 0.1882 | 0.1534 | 0.2751| [0.682, 0.516, 0.405, 0.335]| 0.5886 | 0.6536 | 26,316 | 40,264 |
| 2 | 0.1357 | 0.1198 | 0.3016| [0.731, 0.575, 0.470, 0.401]| 0.5684 | 0.6390 | 25,729 | 40,264 |
| 3 | 0.0932 | 0.1007 | 0.3399| [0.769, 0.629, 0.530, 0.464]| 0.5789 | 0.6465 | 26,032 | 40,264 |
| 4 | 0.0738 | 0.0889 | 0.3711| [0.795, 0.669, 0.582, 0.522]| 0.5851 | 0.6511 | 26,214 | 40,264 |
| 5 | 0.0641 | 0.0810 | 0.3939| [0.810, 0.700, 0.622, 0.566]| 0.5893 | 0.6541 | 26,336 | 40,264 |

### Test Performance

- **Test Loss:** 0.0867
- **Test BLEU Score:** 0.3699
- **Precisions (1-4 gram):** [0.809, 0.692, 0.611, 0.555]
- **Brevity Penalty:** 0.5604
- **Length Ratio:** 0.6333
- **Translation Length:** 26,108
- **Reference Length:** 41,225

## Usage

### Load the Model and Tokenizer

```python
from transformers import AutoTokenizer, AutoModel

# Option 1: Load from the Hugging Face Hub
model_name = "your-username/model-name"  # Replace with the actual model name on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
# Note: the CodeT5+ bimodal architecture may require trust_remote_code=True when loading.

# Option 2: Load from a local directory
# local_model_path = "path/to/your/downloaded/model"  # Replace with your local path
# tokenizer = AutoTokenizer.from_pretrained(local_model_path)
# model = AutoModel.from_pretrained(local_model_path)
```
### Prepare Input

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()  # Set the model to evaluation mode

# Add the task prefix to the input command
nl_command = "Your natural language command here"
input_text_with_prefix = f"bash: {nl_command}"

# Tokenize the input
inputs_with_prefix = tokenizer(
    input_text_with_prefix, return_tensors="pt", truncation=True, max_length=128
).to(device)
```

### Generate Bash Code

```python
# Generate Bash code
with torch.no_grad():
    outputs_with_prefix = model.generate(
        **inputs_with_prefix,
        max_new_tokens=200,
        num_return_sequences=1,
        temperature=0.3,
        top_p=0.95,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
    )

generated_code_with_prefix = tokenizer.decode(outputs_with_prefix[0], skip_special_tokens=True)
print("Generated Bash Command:", generated_code_with_prefix)
```

## Example Outputs

- **Input:** `bash: Enable the shell option 'cmdhist'`
- **Expected Output:** `shopt -s cmdhist`
- **Generated Output:** `shopt -s cmdhist`

## Language Bias and Generalization

The model exhibits some language bias: it performs best when the natural language command closely matches phrasings seen during training, and rewording a request can change details of the generated command, as the following pair shows:

1. **Original command:** "Find all files under /path/to/base/dir and change their permission to 644."
   **Generated Bash code:** `find /path/to/base/dir -type f -exec chmod 644 {} +`

2. **Variant command:** "Modify the permissions to 644 for every file in the directory /path/to/base/dir."
   **Generated Bash code:** `find /path/to/base/dir -type f -exec chmod 644 {} \;`

Both outputs capture the intended functionality; the difference is only in how the `-exec` action is terminated (`+` batches many files into each `chmod` call, while `\;` runs `chmod` once per file).

## Limitations and Future Work

1. **Bash Command Accuracy:** While the BLEU score and precision metrics are promising, some generated commands may still require manual refinement.
2. **Handling Complex Commands:** For highly complex tasks, the model may not always produce optimal results.
3. **Language Variation:** The model's performance might degrade if the input deviates significantly from the training data.