unsubscribe committed on
Commit 242bbd5 · verified · 1 parent: 5c03dbb

Update README.md

Files changed (1): README.md (+8 -8)
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 - chat
 ---

- # Intern-S1-GGUF Model
+ # Intern-S1-mini-GGUF Model

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/642695e5274e7ad464c8a5ba/E43cgEXBRWjVJlU_-hdh6.png)

@@ -23,7 +23,7 @@ tags:

 ## Introduction

- The `Intern-S1` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud.
+ The `Intern-S1-mini` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud.
 This repository offers `Intern-S1-mini` models in GGUF format in both half precision and various low-bit quantized versions, including `q8_0`.

 In the subsequent sections, we will first present the installation procedure, followed by an explanation of the model download process.
@@ -77,8 +77,8 @@ Here is an example of using the thinking system prompt.
 system_prompt="<|im_start|>system\nYou are an expert reasoner with extensive experience in all areas. You approach problems through systematic thinking and rigorous reasoning. Your response should reflect deep understanding and precise logical thinking, making your solution path and reasoning clear to others. Please put your thinking process within <think>...</think> tags.\n<|im_end|>\n"

 build/bin/llama-mtmd-cli \
- --model Intern-S1-GGUF/f16/Intern-S1-mini-f16.gguf  \
- --mmproj Intern-S1-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
+ --model Intern-S1-mini-GGUF/f16/Intern-S1-mini-f16.gguf  \
+ --mmproj Intern-S1-mini-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
 --predict 2048 \
 --ctx-size 8192 \
 --gpu-layers 100 \
@@ -96,8 +96,8 @@ Then input your question with image input as `/image xxx.jpg`.

 ```shell
 ./build/bin/llama-server \
- --model Intern-S1-GGUF/f16/Intern-S1-mini-f16.gguf \
- --mmproj Intern-S1-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
+ --model Intern-S1-mini-GGUF/f16/Intern-S1-mini-f16.gguf \
+ --mmproj Intern-S1-mini-GGUF/f16/mmproj-Intern-S1-mini-f16.gguf \
 --gpu-layers 100 \
 --temp 0.8 \
 --top-p 0.8 \
@@ -135,8 +135,8 @@ print(response)
 # install ollama
 curl -fsSL https://ollama.com/install.sh | sh
 # fetch model
- ollama pull internlm/interns1
+ ollama pull internlm/interns1:mini
 # run model
- ollama run internlm/interns1
+ ollama run internlm/interns1:mini
 # then use openai client to call on http://localhost:11434/v1
 ```
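For reference, the last step in the updated README ("use openai client to call on http://localhost:11434/v1") maps to a call against Ollama's OpenAI-compatible endpoint. Below is a minimal sketch using the `openai` Python package; the model name mirrors the `ollama pull internlm/interns1:mini` command above, the sampling settings echo the `--temp 0.8` / `--top-p 0.8` flags in the diff, and the `api_key` value is an arbitrary placeholder since a local Ollama server does not check it.

```python
from openai import OpenAI

# Point the OpenAI client at the local Ollama server started above.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="internlm/interns1:mini",  # name used with `ollama pull` / `ollama run`
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    temperature=0.8,
    top_p=0.8,
)
print(response.choices[0].message.content)
```

The same snippet should also work against the `llama-server` command shown in the diff by changing `base_url` to that server's address (port 8080 unless `--port` is set).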