Commit b45aac5 (verified) by Sweaterdog · 1 Parent(s): 668c9db

Update README.md

  1. README.md +161 -2
README.md CHANGED
@@ -1,6 +1,165 @@
1
  ---
2
- license: apache-2.0
3
  ---
4
 
 
5
 
6
- # This is an empty repo, which will be the home of Andy-4 when finished
1
  ---
2
+ datasets:
3
+ - Sweaterdog/Andy-4-base-1
4
+ - Sweaterdog/Andy-4-base-2
5
+ - Sweaterdog/Andy-4-ft
6
+ language:
7
+ - en
8
+ base_model:
9
+ - unsloth/Llama3.1-8B
10
+ tags:
11
+ - gaming
12
+ - minecraft
13
+ - mindcraft
14
  ---
15
 
16
+ # 🧠 Andy‑4 🧠
17
 
18
+ **Andy‑4** is an 8‑billion‑parameter specialist model tuned for Minecraft gameplay via the Mindcraft framework. Trained on a single RTX 3090 over **three weeks**, Andy‑4 delivers advanced reasoning, multi‑step planning, and robust in‑game decision‑making.
19
+
20
+ > ⚠️ **Certification:**
21
+ > Andy‑4 is **not yet certified** by the Mindcraft developers. Use in production at your own discretion.
22
+
23
+ ---
24
+
25
+ ## 🔍 Model Specifications
26
+
27
+ - **Parameters:** 8 B
28
+ - **Training Hardware:** 1 × NVIDIA RTX 3090
29
+ - **Duration:** ~3 weeks total
30
+ - **Data Volumes:**
31
+   - **Messages:** 179 384
32
+   - **Tokens:** 425 535 198
33
+   - **Conversations:** 62 149
34
+
35
+ - **Base Architecture:** Llama 3.1 8B
36
+ - **License:** [Andy 1.1 License](LICENSE)
37
+ - **Repository:** https://huggingface.co/Sweaterdog/Andy-4
38
+
39
+ ---
40
+
41
+ ## 📊 Training Regimen
42
+
43
+ 1. **Andy‑4‑base‑1** dataset
44
+    - **Epochs:** 2
45
+    - **Learning Rate:** 7e-5
46
+
47
+ 2. **Andy‑4‑base‑2** dataset
48
+    - **Epochs:** 4
49
+    - **Learning Rate:** 3e-7
50
+
51
+ 3. **Fine‑tune (FT) dataset**
52
+    - **Epochs:** 2.5
53
+    - **Learning Rate:** 2e-5
54
+
55
+ - **Optimizer:** AdamW_8bit with cosine decay
56
+ - **Quantization:** 4‑bit (`bnb-4bit`) for inference
57
+ - **Warm-up Steps:** 0.1% of each dataset
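To make the "cosine decay" entry concrete, here is a minimal sketch of how a learning rate could fall from each stage's peak value to zero over that stage. The function name `cosine_lr` and the step counts are hypothetical. Warm-up is left out because the README does not spell out its exact form, and the scheduler actually used in training may differ.

```bash
# Hypothetical helper: learning rate at a given step under plain cosine decay to zero.
# Arguments: peak learning rate, current step, total steps in the stage.
# Warm-up is intentionally omitted (its exact form is an assumption we avoid making).
cosine_lr() {
  local peak="$1" step="$2" total="$3"
  awk -v peak="$peak" -v step="$step" -v total="$total" 'BEGIN {
    pi = atan2(0, -1)
    # Half-cosine curve: equals peak at step 0 and approaches 0 at step == total.
    printf "%.3e\n", peak * 0.5 * (1 + cos(pi * step / total))
  }'
}
```

For example, `cosine_lr 7e-5 50 100` prints `3.500e-05`, half of the first stage's peak rate at the midpoint of a 100-step stage.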
58
+
59
+ ---
60
+
61
+ ## 🚀 Installation
62
+
63
+ ### 1. Quick Hugging Face + Ollama *(Not recommended)*
64
+
65
+ 1. On the HF model page, click **Use this model → Ollama**.
66
+ 2. Choose your quantization (see table).
67
+ 3. Copy and run the provided `ollama run` command.
68
+
69
+ | Quantization | VRAM Required |
70
+ |--------------|---------------|
71
+ | F16 | 16 GB+ |
72
+ | Q5_K_M | 8 GB+ |
73
+ | Q4_K_M | 6–8 GB |
74
+ | Q3_K_M | 6 GB (low) |
75
+ | Q2_K | 4–6 GB (ultra)|
76
+
77
+ If you lack a GPU, check the [Mindcraft Discord guide](https://ptb.discord.com/channels/1303399789995626667/1347027684768878644/1347027684768878644) for free cloud setups.
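The table can also be read as a simple lookup. The sketch below is one possible interpretation, not an official rule: the helper name `pick_quant` is hypothetical, and the cut-offs where the table's ranges overlap (7 GB for Q4_K_M, 6 GB for Q3_K_M) are assumptions.

```bash
# Hypothetical helper: map whole gigabytes of available VRAM to a quantization tag.
# Thresholds follow the table above; the 7 GB and 6 GB cut-offs are assumed
# because the table's ranges overlap.
pick_quant() {
  local vram_gb="$1"
  if   [ "$vram_gb" -ge 16 ]; then echo "F16"
  elif [ "$vram_gb" -ge 8 ];  then echo "Q5_K_M"
  elif [ "$vram_gb" -ge 7 ];  then echo "Q4_K_M"
  elif [ "$vram_gb" -ge 6 ];  then echo "Q3_K_M"
  elif [ "$vram_gb" -ge 4 ];  then echo "Q2_K"
  else echo "none"   # below the smallest entry in the table
  fi
}
```

Calling `pick_quant 8` prints `Q5_K_M`. A result of `none` means the table lists no quantization for that amount of VRAM.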
78
+
79
+ ---
80
+
81
+ ### 2. Manual Download & Modelfile
82
+
83
+ 1. **Download**
84
+ - From the HF **Files** tab, grab your chosen `.gguf` quant weights (e.g. `Andy-4.Q4_K_M.gguf`).
85
+ - Download the provided `Modelfile`.
86
+
87
+ Use the table below to choose your quantization. The VRAM figures assume the default 8192-token context window with no context-window quantization.
88
+
89
+ | Quantization | VRAM Required |
90
+ |--------------|---------------|
91
+ | F16 | 16 GB+ |
92
+ | Q5_K_M | 8 GB+ |
93
+ | Q4_K_M | 6–8 GB |
94
+ | Q3_K_M | 6 GB (low) |
95
+ | Q2_K | 4–6 GB (ultra)|
96
+
97
+ 2. **Edit**
98
+
99
+ Change
100
+ ```text
101
+ FROM YOUR/PATH/HERE
102
+ ```
103
+ to
104
+ ```text
105
+ FROM /path/to/Andy-4.Q4_K_M.gguf
106
+ ```
107
+ *Optional*:
108
+ Raise the `num_ctx` parameter to support longer conversations if you:
109
+
110
+ **A.** Have extra VRAM
111
+
112
+ **B.** Have quantized the context window
113
+
114
+ **C.** Can use a smaller model
115
+
116
+ 3. **Create**
117
+ ```bash
118
+ ollama create andy-4 -f Modelfile
119
+ ```
120
+
121
+ This registers the **Andy‑4** model locally.
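The Edit step can be scripted. This is a minimal sketch: the helper name `patch_modelfile` is hypothetical, and it assumes the provided `Modelfile` contains one `FROM ...` line and, if you want to change the context size, a `PARAMETER num_ctx ...` line. It prints the patched file instead of modifying it in place.

```bash
# Hypothetical helper: print a copy of a Modelfile with its FROM line pointing at
# a local GGUF file and, optionally, a new num_ctx value.
# Assumes the file has a "FROM ..." line and (for num_ctx) a "PARAMETER num_ctx ..." line.
patch_modelfile() {
  local modelfile="$1" gguf_path="$2" num_ctx="${3:-}"
  if [ -n "$num_ctx" ]; then
    # '|' is the sed delimiter so slashes in the path need no escaping.
    sed -e "s|^FROM .*|FROM ${gguf_path}|" \
        -e "s|^PARAMETER num_ctx .*|PARAMETER num_ctx ${num_ctx}|" "$modelfile"
  else
    sed -e "s|^FROM .*|FROM ${gguf_path}|" "$modelfile"
  fi
}

# Example usage (not run here):
#   patch_modelfile Modelfile /path/to/Andy-4.Q4_K_M.gguf 16384 > Modelfile.local
#   ollama create andy-4 -f Modelfile.local
```

Paths containing `|` or `&` would need escaping before being passed to this helper.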
122
+
123
+ ---
124
+
125
+ ## 🔧 Context‑Window Quantization
126
+
127
+ To lower VRAM use for context windows:
128
+
129
+ #### **Windows**
130
+
131
+ 1. Close Ollama.
132
+ 2. In **System Properties → Environment Variables**, add:
133
+ ```text
134
+ OLLAMA_FLASH_ATTENTION=1
135
+ OLLAMA_KV_CACHE_TYPE=q8_0 # or q4_0 for extra savings, but far less stable
136
+ ```
137
+ 3. Restart Ollama.
138
+
139
+ #### **Linux/macOS**
140
+
141
+ ```bash
142
+ export OLLAMA_FLASH_ATTENTION=1
143
+ export OLLAMA_KV_CACHE_TYPE="q8_0" # or "q4_0", but far less stable
144
+ ollama serve
145
+ ```
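To see roughly why these settings save VRAM, the sketch below estimates the size of the key/value cache alone. The helper name `kv_cache_mib` is hypothetical. The architecture figures (32 layers, 8 KV heads, head dimension 128) are assumed from the published Llama 3.1 8B configuration, and the per-block byte counts from the ggml `f16`, `q8_0`, and `q4_0` formats. Actual Ollama memory use includes weights and other overhead not counted here.

```bash
# Hypothetical helper: estimate KV-cache size in MiB for a context length and cache type.
# Assumed architecture (Llama 3.1 8B): 32 layers, 8 KV heads, head dimension 128.
kv_cache_mib() {
  local ctx="$1" cache_type="${2:-f16}" block_bytes
  case "$cache_type" in
    f16)  block_bytes=64 ;;  # 32 values x 2 bytes
    q8_0) block_bytes=34 ;;  # 32 values x 1 byte + 2-byte scale
    q4_0) block_bytes=18 ;;  # 32 values x 4 bits + 2-byte scale
    *) echo "unknown cache type: $cache_type" >&2; return 1 ;;
  esac
  # K and V (2) x 32 layers x 8 KV heads x 128 = 65536 values per token,
  # which is 2048 blocks of 32 values.
  echo $(( ctx * 2048 * block_bytes / 1048576 ))
}
```

Under these assumptions, an 8192-token context needs about 1024 MiB at `f16`, 544 MiB at `q8_0`, and 288 MiB at `q4_0`.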
146
+
147
+ ---
148
+
149
+ ## 📌 Acknowledgments
150
+
151
+ <details>
152
+ <summary>Click to expand</summary>
153
+
154
+ - **Data & Models by:** @Sweaterdog
155
+ - **Framework:** Mindcraft (https://github.com/kolbytn/mindcraft)
156
+ - **LoRA Weights:** https://huggingface.co/Sweaterdog/Andy-4-LoRA
157
+ </details>
158
+
159
+ ---
160
+
161
+ ## ⚖️ License
162
+
163
+ See [Andy 1.1 License](LICENSE).
164
+
165
+ *This work uses data and models created by @Sweaterdog.*