unsubscribe committed on
Commit 242bbd5 · verified · 1 parent: 5c03dbb

Update README.md

Files changed (1): README.md (+8 -8)
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 - chat
 ---

- # Intern-S1-GGUF Model
+ # Intern-S1-mini-GGUF Model

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/642695e5274e7ad464c8a5ba/E43cgEXBRWjVJlU_-hdh6.png)

@@ -23,7 +23,7 @@ tags:

 ## Introduction

- The `Intern-S1` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud.
+ The `Intern-S1-mini` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud.
 This repository offers `Intern-S1-mini` models in GGUF format in both half precision and various low-bit quantized versions, including `q8_0`.

 In the subsequent sections, we will first present the installation procedure, followed by an explanation of the model download process.
@@ -77,8 +77,8 @@ Here is an example of using the thinking system prompt.
 system_prompt="<|im_start|>system\nYou are an expert reasoner with extensive experience in all areas. You approach problems through systematic thinking and rigorous reasoning. Your response should reflect deep understanding and precise logical thinking, making your solution path and reasoning clear to others. Please put your thinking process within <think>...</think> tags.\n<|im_end|>\n"

 build/bin/llama-mtmd-cli \
- --model Intern-S1-GGUF/f16/Intern-S1-mini-f16.gguf  \
- --mmproj Intern-S1-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
+ --model Intern-S1-mini-GGUF/f16/Intern-S1-mini-f16.gguf  \
+ --mmproj Intern-S1-mini-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
 --predict 2048 \
 --ctx-size 8192 \
 --gpu-layers 100 \
@@ -96,8 +96,8 @@ Then input your question with image input as `/image xxx.jpg`.

 ```shell
 ./build/bin/llama-server \
- --model Intern-S1-GGUF/f16/Intern-S1-mini-f16.gguf \
- --mmproj Intern-S1-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
+ --model Intern-S1-mini-GGUF/f16/Intern-S1-mini-f16.gguf \
+ --mmproj Intern-S1-mini-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
 --gpu-layers 100 \
 --temp 0.8 \
 --top-p 0.8 \
@@ -135,8 +135,8 @@ print(response)
 # install ollama
 curl -fsSL https://ollama.com/install.sh | sh
 # fetch model
- ollama pull internlm/interns1
+ ollama pull internlm/interns1:mini
 # run model
- ollama run internlm/interns1
+ ollama run internlm/interns1:mini
 # then use openai client to call on http://localhost:11434/v1
 ```
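For reference, the last step in the updated README ("use openai client to call on http://localhost:11434/v1") maps to a call against Ollama's OpenAI-compatible endpoint. Below is a minimal sketch using the `openai` Python package; the model name mirrors the `ollama pull internlm/interns1:mini` command above, the sampling settings echo the `--temp 0.8` / `--top-p 0.8` flags in the diff, and the `api_key` value is an arbitrary placeholder since a local Ollama server does not check it.

```python
from openai import OpenAI

# Point the OpenAI client at the local Ollama server started above.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="internlm/interns1:mini",  # name used with `ollama pull` / `ollama run`
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    temperature=0.8,
    top_p=0.8,
)
print(response.choices[0].message.content)
```

The same snippet should also work against the `llama-server` command shown in the diff by changing `base_url` to that server's address (port 8080 unless `--port` is set).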