Update README.md
README.md CHANGED
@@ -14,6 +14,8 @@ This model is a fine-tuned version of [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5

Fine-tuning was conducted using DeepSpeed on a multi-A100 GPU setup via RunPod for efficient training in memory-constrained environments. The training dataset includes synthetically generated, logically complex SQL queries with corresponding natural-language prompts and chain-of-thought (CoT) explanations.

+The inference notebook is publicly available [here](https://colab.research.google.com/drive/10CJqyIAOd9QnEp0W8NN_SxdiOrFsBz0-?usp=sharing).
+
---

## 🧾 Model Details
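
For context on the setup described in the hunk above, here is a minimal sketch of a DeepSpeed ZeRO-3 fine-tune driven through the Hugging Face Trainer; every value (ZeRO stage, batch sizes, precision) is an illustrative assumption, not the model's actual training recipe.

```python
# Hypothetical DeepSpeed ZeRO-3 configuration passed through the HF Trainer.
# All values are illustrative assumptions, not the actual training recipe.
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 3},   # shard params/grads/optimizer state across GPUs
    "bf16": {"enabled": True},           # A100s support bfloat16
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 8,
}

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=2,       # must match the DeepSpeed micro-batch size
    gradient_accumulation_steps=8,
    bf16=True,
    deepspeed=ds_config,                 # Trainer accepts a dict or a path to a JSON file
)
```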
@@ -30,7 +32,12 @@ Fine-tuning was conducted using DeepSpeed on a multi-A100 GPU setup via RunPod f

## 🧪 Training Dataset

-**Dataset Used:**
+**Datasets Used:**
+
+- [`gretelai/synthetic_text_to_sql`](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)
+- [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)
+- [`eagle0504/augmented_codealpaca-20k-using-together-ai-deepseek-v1`](https://huggingface.co/datasets/eagle0504/augmented_codealpaca-20k-using-together-ai-deepseek-v1)

The dataset contains structured training examples of the form:
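
The hunk cuts off before the example itself. As a rough illustration, a record might look like the following; the field names and the `<think>`/`<response>` tags are assumptions inferred from the `</response>` stop sequence used later in this README, not the dataset's exact schema.

```python
# Hypothetical training record; field names and tags are illustrative assumptions.
example = {
    "prompt": "List the three best-selling products in 2023.",
    "completion": (
        "<think>Aggregate units per product from the sales table, filter to 2023, "
        "order descending, and keep the top 3.</think>"
        "<response>SELECT product_name, SUM(units) AS total_units "
        "FROM sales WHERE sale_year = 2023 "
        "GROUP BY product_name ORDER BY total_units DESC LIMIT 3;</response>"
    ),
}
```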
@@ -109,8 +116,8 @@ class StopOnTokens(StoppingCriteria):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

-model = AutoModelForCausalLM.from_pretrained("eagle0504/
-tokenizer = AutoTokenizer.from_pretrained("eagle0504/
+model = AutoModelForCausalLM.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")
+tokenizer = AutoTokenizer.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")

# Example stop sequence
stop_sequence = "</response>"
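
The hunk's context line references a `StopOnTokens(StoppingCriteria)` class defined earlier in the README. That definition is not shown in this diff, so the following is only a sketch of how such a criterion is typically wired into generation with the stop sequence above; the class body is an assumption, not the README's actual implementation.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

class StopOnTokens(StoppingCriteria):
    """Stop generation once the decoded tail ends with the stop sequence (assumed body)."""

    def __init__(self, tokenizer, stop_sequence: str):
        self.tokenizer = tokenizer
        self.stop_sequence = stop_sequence

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode only a short tail of the sequence to keep the per-step check cheap.
        tail = self.tokenizer.decode(input_ids[0, -16:], skip_special_tokens=True)
        return tail.endswith(self.stop_sequence)

model = AutoModelForCausalLM.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")
tokenizer = AutoTokenizer.from_pretrained("eagle0504/qwen-distilled-scout-1.5b-gen2")

prompt = "Write a SQL query returning the top 5 customers by total order value."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=512,
    stopping_criteria=StoppingCriteriaList([StopOnTokens(tokenizer, "</response>")]),
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```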