Update README.md
README.md
CHANGED
@@ -212,10 +212,10 @@ model-index:
 ---
 
 # Model Description
-Mellum-base
+Mellum-4b-base is JetBrains' first open-source large language model (LLM) optimized for code-related tasks.
 
-Trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, Mellum-base
-The model follows a LLaMA-style architecture with 4 billion parameters
+Trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, Mellum-4b-base is tailored specifically for code completion.
+The model follows a LLaMA-style architecture with 4 billion parameters, making it efficient for both cloud inference (e.g., via vLLM) and local deployment (e.g., using llama.cpp or Ollama).
 
 Mellum was trained using Automatic Mixed Precision (AMP) with bf16 precision.
 The uploaded version on Hugging Face retains the bf16 format for public use.
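The updated description points at standard inference paths (vLLM, llama.cpp, Ollama). For local experimentation, here is a minimal completion sketch with Hugging Face transformers; it assumes the weights are published under the repo id JetBrains/Mellum-4b-base and keeps the bf16 format mentioned in the card.

```python
# Minimal sketch, not part of this commit: plain code completion with transformers.
# The repo id "JetBrains/Mellum-4b-base" is an assumption based on the model name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JetBrains/Mellum-4b-base"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card says the uploaded weights stay in bf16
)

# Base model, no chat template: feed raw code and let it continue.
prompt = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For serving, the same repo id should work with vLLM's standard entry points, keeping prompts within the 8192-token context window stated above.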