abhishekgahlot committed 1cfc98a (verified) · Parent: f803f90

Update README.md

Files changed (1): README.md (+17 / -19)
 
metrics:
- loss
---

# Better SQL Agent - Llama 3.1 8B
 
## Training Results
- **Training Samples**: 19,480 (SQL analytics + technical conversations)
- **Hardware**: 4x NVIDIA A10G GPUs (96 GB VRAM total)
 
## Model Description
This is a fine-tuned version of **Meta-Llama-3.1-8B-Instruct**, optimized for:
- **SQL query generation and optimization**
- **Data analysis and insights**
- **Technical assistance and debugging**
- **Tool-based workflows**
 
## Training Configuration
- **Base Model**: `meta-llama/Llama-3.1-8B-Instruct`
- **Training Method**: LoRA (Low-Rank Adaptation); see the configuration sketch below
  - Rank: 16, Alpha: 32, Dropout: 0.05
- **Context Length**: 128K tokens (the base model's native context window)
- **Optimizer**: AdamW with a cosine learning-rate schedule
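
The training script itself is not included in this card. The following is a minimal sketch, assuming the Hugging Face `transformers` and `peft` libraries, of how the configuration listed above is typically expressed; the LoRA target modules, learning rate, and output directory are not stated in the card and are assumptions.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Base model as listed above.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# LoRA hyperparameters from this card; target_modules are an assumption.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable

# Optimizer and schedule as listed above; the remaining arguments are illustrative.
training_args = TrainingArguments(
    output_dir="sql-agent-lora",
    optim="adamw_torch",         # AdamW
    lr_scheduler_type="cosine",  # cosine learning-rate schedule
    bf16=True,                   # BF16 mixed precision (see Optimization Features)
)
```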
 
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
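# The body of the original example is not shown in this excerpt; the lines below
# are a hedged reconstruction of typical usage, not the author's exact script.
model_id = "abhishekgahlot/better-sql-agent-llama-3.1-8b"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16 matches the training precision
    device_map="auto",
)

# Ask a natural-language question and let the model write the SQL.
messages = [
    {"role": "user",
     "content": "Write a SQL query that returns the top 10 customers by total order value."},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the tokens generated after the prompt.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:],
                            skip_special_tokens=True)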
 
print(response)
```
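Only the tokens generated after the prompt are decoded, so `response` holds just the model's answer rather than the echoed question.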

## Performance Metrics
| Metric | Value |
|--------|-------|
| **Starting Loss** | 1.53 |
| **Loss Reduction** | **96.7%** |
| **Training Time** | 8.9 hours |
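
A 96.7% reduction from a starting loss of 1.53 implies a final training loss of roughly 1.53 × (1 - 0.967) ≈ 0.05.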
 
## Use Cases
- **SQL Generation**: Create complex queries from natural language
- **Data Analysis**: Generate insights and analytical queries
- **Code Assistance**: Debug and optimize SQL code
- **Technical Support**: Answer database and analytics questions
- **Learning Aid**: Explain SQL concepts and best practices
 
## Training Data
The model was trained on a curated dataset of **19,480 high-quality examples** including:
- SQL query generation tasks
- Data analysis conversations
- Technical problem-solving dialogues
- Tool usage patterns and workflows
 
## Optimization Features
- **4-bit Quantization**: Reduced memory footprint (see the loading sketch below)
- **Flash Attention**: Optimized attention mechanism
- **Mixed Precision**: BF16 training for efficiency
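
The card does not include a serving script. As a hedged sketch of how these options are typically enabled at load time with `transformers` and `bitsandbytes` (the repository id is a placeholder, and FlashAttention-2 additionally requires the `flash-attn` package and a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with BF16 compute, via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "abhishekgahlot/better-sql-agent-llama-3.1-8b",  # placeholder repo id
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # Flash Attention kernel
    torch_dtype=torch.bfloat16,               # BF16 mixed precision
    device_map="auto",
)
```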
 
## License
This model inherits the **Llama 3.1 license** from the base model. Please review the [official license](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE) for usage terms.
 
## Acknowledgments
- Based on Meta's Llama 3.1 8B Instruct model
 
## Model Card Contact
For questions about this model, please open an issue in the repository or contact the model author.
 
---

**Achieved a 96.7% loss reduction - a testament to high-quality training data and optimization.**