eagle0504 committed
Commit 50fcbe3 · verified · 1 Parent(s): fbdcdf3

Update README.md

Files changed (1)
  1. README.md +83 -49
README.md CHANGED
@@ -5,16 +5,16 @@ datasets:
  - eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024
  language:
  - en
- new_version: unsloth/Llama-3.2-1B-Instruct
  pipeline_tag: question-answering
  ---


- # Model Card for OpenAI GSM8K Dataset Enhanced with Reasoning

- This model is fine-tuned to answer questions based on the OpenAI GSM8K dataset enhanced with reasoning provided from Deepseek R1.

- Invoke notebook shared [here](https://colab.research.google.com/drive/1B_Fbz0w76QxHbo9zAOf_pyZKKNI0EJJ9?usp=sharing), a publicly available Colab notebook for tests.

  ---

@@ -22,81 +22,115 @@ Invoke notebook shared [here](https://colab.research.google.com/drive/1B_Fbz0w76

  ### Model Description

- This is a transformer-based question-answering model fine-tuned from `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`. It was trained on a dataset derived from the OpenAI GSM8K benchmark, enhanced with chain-of-thought reasoning to encourage intermediate logical steps. The dataset pairs math word problems with structured answers, using `<think>...</think>` and `<answer>...</answer>` tags.

  - **Developed by:** Yiqiao Yin
- - **Model type:** Causal Language Model (fine-tuned for Q&A with reasoning)
  - **Language(s):** English
  - **License:** MIT
- - **Finetuned from model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

  ---

- ## Training Configuration

- - 🖥️ **Hardware:** Trained on a RunPod instance with:
- - 🔥 6 × NVIDIA H100 PCIe GPUs
- - 🧠 144 vCPUs
- - 🧮 1132 GB system RAM
- - 💽 20 GB disk per GPU
- - 🐳 **Container Image:** `runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04`
- - ⏱️ **Total Training Time:** 2 hours
- - 💸 **Cost:** ~$14/hour × 2 hours = **$28 USD**
- - ⚙️ **Zero Redundancy Optimization:** DeepSpeed Stage 2
- - 🎯 **Precision:** FP16 mixed-precision training

- ---

- ## Performance

- - **Mean token-level accuracy:** **97%**
- - Evaluation based on in-training token match accuracy over the formatted `<think>...</think><answer>...</answer>` structure.
- - Model demonstrates strong reasoning capability in multi-step arithmetic and logic problems.

  ---

- ## Inference Format

- To generate accurate completions, prompt the model in the following structure:

- ```
- Question: If Sally has 3 apples and buys 2 more, how many does she have in total? <think>
- ```

- The model will continue reasoning within `<think>...</think>` and provide a final answer inside `<answer>...</answer>`.

- ---

- ## Intended Use

- This model is intended for educational and research purposes in:
- - Chain-of-thought prompting
- - Math reasoning and logical inference
- - Question-answering with intermediate steps

  ---

- ## Limitations

- - Trained on structured synthetic data — real-world generalization may vary
- - Best performance achieved when following the exact inference format
- - Does not support multilingual inputs

- ---

- ## Citation

- If you use this model, please cite:

  ```
- @misc{yin2024gsm8k,
- author = {Yiqiao Yin},
- title = {TBD},
- year = 2025,
- note = {TBD}
- }

- ```

  - eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024
  language:
  - en
+ new_version: unsloth/Llama-3.2-3B-Instruct
  pipeline_tag: question-answering
  ---


+ # Model Card for warren-buffett-letters-qna-r1-enhanced-1998-2024-finetuned-llama-3.2-3B-Instruct

+ This model is fine-tuned to answer questions based on Warren Buffett’s annual shareholder letters from 1998 to 2024. It understands the themes, vocabulary, and tone of Buffett’s writing and can respond to questions about his investment philosophy, decisions, and observations.

+ A publicly available Colab notebook for testing the model is shared [here](https://colab.research.google.com/drive/3B_Fbz0w76QxHbo9zAOf_pyZKKNI0EJJ9?usp=sharing).

  ---

  ### Model Description

+ This is a transformer-based question-answering model fine-tuned from `unsloth/Llama-3.2-3B-Instruct`. It was trained on a dataset derived from Warren Buffett’s letters to Berkshire Hathaway shareholders. The dataset pairs real excerpts with corresponding questions and answers for a conversational learning experience.

  - **Developed by:** Yiqiao Yin
+ - **Model type:** Causal Language Model (fine-tuned for Q&A)
  - **Language(s):** English
  - **License:** MIT
+ - **Finetuned from model:** unsloth/Llama-3.2-3B-Instruct

  ---

+ ## Uses

+ ### Direct Use

+ This model can be used to:
+ - Ask questions about specific themes or time periods in Warren Buffett’s letters
+ - Learn about value investing and Buffett’s decision-making
+ - Generate educational content based on his financial wisdom

+ ### Out-of-Scope Use

+ - This model is not suited for general-purpose financial advice.
+ - It may not generalize well outside the context of Buffett’s letters.

  ---

+ ## Bias, Risks, and Limitations

+ The model inherits the biases and perspectives of Warren Buffett’s letters, which reflect his personal views and investment philosophy. While these views are valuable, they do not represent all schools of financial thought. Because the model was fine-tuned on a niche dataset, it may also perform poorly on unrelated questions or general knowledge.

+ ### Recommendations

+ Always verify model outputs, especially when using them for educational or advisory purposes.

+ ---

+ ## How to Get Started with the Model

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
+ model = AutoModelForCausalLM.from_pretrained("eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024-finetuned-llama-3.2-3B-Instruct")
+ tokenizer = AutoTokenizer.from_pretrained("eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024-finetuned-llama-3.2-3B-Instruct")
+
+ # Prompt in the same "Question: ... Answer:" format used during fine-tuning
+ inputs = tokenizer("Question: What is intrinsic value?\nAnswer:", return_tensors="pt")
+ outputs = model.generate(**inputs)
+ print(tokenizer.decode(outputs[0]))
+ ```

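Depending on the model’s default generation settings, the answer may be cut short; passing generation arguments such as `max_new_tokens` (and, optionally, sampling settings like `temperature`) to `model.generate` gives more control over answer length and style.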
 
  ---

+ ## Training Details

+ ### Training Data

+ * Dataset: [eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024](https://huggingface.co/datasets/eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024)
+ * Format: `{"question": "...", "answer": "..."}` based on text from the letters (see the loading sketch below)

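For orientation, here is a minimal sketch of loading and inspecting this dataset with 🤗 Datasets; the `train` split name and the exact field names are assumptions based on the format described above, so check the dataset card before relying on them.

```python
from datasets import load_dataset

# Load the Q&A pairs derived from the 1998-2024 shareholder letters
# (split name is an assumption; check the dataset card for the actual splits)
ds = load_dataset(
    "eagle0504/warren-buffett-letters-qna-r1-enhanced-1998-2024",
    split="train",
)

# Each record is expected to expose "question" and "answer" fields
example = ds[0]
print(example["question"])
print(example["answer"])
```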
 
+ ### Training Procedure

+ #### Preprocessing

+ A formatting function was used to convert each entry to:
+
+ ```
+ Question: <question text>
+ Answer: <answer text>
  ```

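As an illustration, such a formatting function might look like the sketch below; the function name and the exact whitespace handling are assumptions, not the verbatim training code.

```python
def format_example(example: dict) -> str:
    # Convert one {"question": ..., "answer": ...} record into a single
    # "Question: ...\nAnswer: ..." training string
    return f"Question: {example['question']}\nAnswer: {example['answer']}"

# Example: render the first record of the dataset loaded earlier
# print(format_example(ds[0]))
```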
 
+ #### Training Hyperparameters

+ * Epochs: 50
+ * Batch Size: 8
+ * Learning Rate: 2e-5
+ * Gradient Accumulation: 1
+ * Mixed Precision: No (fp32)
+ * Framework: 🤗 Transformers + TRL + DeepSpeed (see the configuration sketch below)

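For reference, the hyperparameters listed above map onto a TRL `SFTTrainer` setup roughly like the following sketch; the output directory and DeepSpeed config path are placeholders, and argument names may differ across `trl` versions.

```python
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

# Base model named in this card
model = AutoModelForCausalLM.from_pretrained("unsloth/Llama-3.2-3B-Instruct")

# Hyperparameters mirror the list above; paths are placeholders
args = SFTConfig(
    output_dir="./buffett-qna-llama-3.2-3b",
    num_train_epochs=50,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    fp16=False,  # full-precision (fp32) training, as stated above
    deepspeed="ds_config.json",  # placeholder DeepSpeed configuration
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=ds,                # dataset from the loading sketch above
    formatting_func=format_example,  # formatting function sketched above
)
trainer.train()
```
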
+ #### Final Training Metrics
+
+ * **Loss:** 0.0532
+ * **Gradient Norm:** 0.2451
+ * **Learning Rate:** 9.70e-08
+ * **Mean Token Accuracy:** 98.12%
+ * **Final Epoch:** 49.76
+
+ #### Compute Infrastructure
+
+ * **Hardware:** 4× NVIDIA A100, 38 vCPUs, 200 GB RAM
+ * **Cloud Provider:** Runpod
+ * **Docker Image:** `runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04`
+ * **Package Manager:** `uv`
+
+ ---
+
+ ## Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
+
+ * **Hardware Type:** 4× NVIDIA A100 GPUs
+ * **Hours used:** 20 hours
+ * **Cloud Provider:** Runpod
+ * **Compute Region:** Unknown
+ * **Training Cost:** \$6.56/hour × 20 hours ≈ **\$131**
+ * **Carbon Emitted:** Not formally calculated
+
+ ---
+
+ ## Model Card Contact
+
+ Author: Yiqiao Yin
+ Connect with me on [LinkedIn](https://www.linkedin.com/in/yiqiaoyin/).