Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ base_model:
 # Model Description
 Mellum-4b-sft-python is a fine-tuned version of JetBrains' first open-source large language model (LLM) optimized for code-related tasks.
 
-
+Pre-trained on over 4 trillion tokens with a context window of 8192 tokens across multiple programming languages, and then fine-tuned, Mellum-4b-sft-python is tailored specifically for code completion in Python.
 The model follows a LLaMA-style architecture with 4 billion parameters, making it efficient for both cloud inference (e.g., via vLLM) and local deployment (e.g., using llama.cpp or Ollama).
 
 Mellum was trained using Automatic Mixed Precision (AMP) with bf16 precision.