eagle0504 committed
Commit 540a685 · verified · 1 Parent(s): e8c608f

Update README.md

Files changed (1): README.md (+10 -3)
README.md CHANGED
@@ -14,6 +14,8 @@ This model is a fine-tuned version of [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5
 
 Fine-tuning was conducted using DeepSpeed on a multi-A100 GPU setup via RunPod for efficient training in memory-constrained environments. The training dataset includes complex logical SQL queries generated synthetically with corresponding natural language prompts and CoT explanations.
 
+Inference notebook is publicly available [here](https://colab.research.google.com/drive/10CJqyIAOd9QnEp0W8NN_SxdiOrFsBz0-?usp=sharing).
+
 ---
 
 ## 🧾 Model Details
@@ -30,7 +32,12 @@ Fine-tuning was conducted using DeepSpeed on a multi-A100 GPU setup via RunPod f
 
 ## 🧪 Training Dataset
 
-**Dataset Used:** [`gretelai/synthetic_text_to_sql`](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)
+**Dataset Used:**
+
+
+- [`gretelai/synthetic_text_to_sql`](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)
+- [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)
+- [`eagle0504/augmented_codealpaca-20k-using-together-ai-deepseek-v1`](https://huggingface.co/datasets/eagle0504/augmented_codealpaca-20k-using-together-ai-deepseek-v1)
 
 The dataset contains structured training examples of the form:
 
@@ -109,8 +116,8 @@ class StopOnTokens(StoppingCriteria):
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model = AutoModelForCausalLM.from_pretrained("eagle0504/enhanced-deepseek-r1-distill-qwen-1.5b-finetuned-on-gsm8k-codealpaca20k-text2sql")
-tokenizer = AutoTokenizer.from_pretrained("eagle0504/enhanced-deepseek-r1-distill-qwen-1.5b-finetuned-on-gsm8k-codealpaca20k-text2sql")
+model = AutoModelForCausalLM.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")
+tokenizer = AutoTokenizer.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")
 
 # Example stop sequence
 stop_sequence = "</response>"
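
Note on the last hunk: the `StopOnTokens` class that appears in its header (`class StopOnTokens(StoppingCriteria):`) is defined earlier in the README and is not part of this diff. Below is a minimal, self-contained sketch of how the updated model id and the `</response>` stop sequence might be wired together, assuming a typical decode-and-match implementation of `StopOnTokens` (the README's exact class may differ; the prompt string is purely an illustrative placeholder):

```python
# Minimal sketch, not the README's exact StopOnTokens implementation:
# stop generation once the decoded output ends with the stop sequence.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

model_id = "eagle0504/qwen-distilled-scout-1.5b-gen2"  # new id from this commit
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)


class StopOnTokens(StoppingCriteria):
    """Assumed variant: halt when the decoded text ends with `stop_sequence`."""

    def __init__(self, tokenizer, stop_sequence: str):
        self.tokenizer = tokenizer
        self.stop_sequence = stop_sequence

    def __call__(self, input_ids, scores, **kwargs) -> bool:
        # Decode the running sequence and check for the stop marker.
        text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return text.endswith(self.stop_sequence)


stop_sequence = "</response>"
prompt = "Translate to SQL: count orders per customer."  # illustrative placeholder
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    stopping_criteria=StoppingCriteriaList([StopOnTokens(tokenizer, stop_sequence)]),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Decoding the full sequence on every step keeps the check simple; comparing the tail token ids against `tokenizer.encode(stop_sequence)` would avoid the repeated decode if throughput matters.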