topshik committed
Commit 9756c2f · verified · 1 Parent(s): a983232

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ base_model:
 # Model Description
 Mellum-4b-sft-python is a fine-tuned version of JetBrains' first open-source large language model (LLM) optimized for code-related tasks.
 
-Trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, Mellum-4b-sft-python is tailored specifically for code completion in Python.
+Pre-trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, and then fine-tuned, Mellum-4b-sft-python is tailored specifically for code completion in Python.
 The model follows a LLaMA-style architecture with 4 billion parameters, making it efficient for both cloud inference (e.g., via vLLM) and local deployment (e.g., using llama.cpp or Ollama).
 
 Mellum was trained using Automatic Mixed Precision (AMP) with bf16 precision.
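The README text above mentions cloud inference via vLLM with an 8192-token context window. A minimal deployment sketch, assuming the model is published on the Hugging Face Hub under the ID `JetBrains/Mellum-4b-sft-python` (the ID, port, and prompt are illustrative assumptions, not taken from this commit):

```shell
# Serve the model with vLLM's OpenAI-compatible server, capping the
# context at the 8192 tokens stated in the model description.
# (Assumes vLLM is installed and a suitable GPU is available.)
vllm serve JetBrains/Mellum-4b-sft-python --max-model-len 8192

# Request a Python code completion from the running server.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "JetBrains/Mellum-4b-sft-python",
       "prompt": "def fibonacci(n):",
       "max_tokens": 64}'
```

For local deployment the README points at llama.cpp or Ollama instead; those would consume a GGUF conversion of the same weights rather than the Hub checkpoint directly.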