---
base_model: onekq-ai/OneSQL-v0.1-Qwen-3B
tags:
- text-generation-inference
- transformers
- qwen2
- gguf
license: apache-2.0
language:
- en
---

# Introduction

This model is the GGUF version of [OneSQL-v0.1-Qwen-3B](https://huggingface.co/onekq-ai/OneSQL-v0.1-Qwen-3B). You can also find it on [Ollama](https://ollama.com/onekq/OneSQL-v0.1-Qwen).

# Performance

The self-evaluation EX score of the original model is **43.35** (compared to **63.33** by the 32B model on the [BIRD leaderboard](https://bird-bench.github.io/)).
Below are the self-evaluation results for each quantization.

| Quantization | EX score |
|--------------|----------|
| Q4_0 | 16.83 |
| Q4_1 | 21.85 |
| Q4_K_S | 22.49 |
| Q4_K_M | 21.85 |
| Q5_0 | 23.40 |
| Q5_1 | 23.53 |
| Q5_K_S | 22.77 |
| Q5_K_M | 23.73 |
| Q6_K | 24.51 |
| **Q8_0** | **24.90** |

# Quick start

To use this model, craft a prompt that starts with your database schema as **CREATE TABLE** statements, followed by your natural language query preceded by **--**.
Make sure the prompt ends with **SELECT** so the model can finish the query for you. There is no need to set other parameters such as temperature or a max token limit.

```sh
PROMPT="CREATE TABLE students (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER,
    grade TEXT
);

-- Find the three youngest students
SELECT "

ollama run onekq-ai/OneSQL-v0.1-Qwen:3B-Q8_0 "$PROMPT"
```
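If you drive the model from code rather than the Ollama CLI, the same prompt can be assembled programmatically. A minimal sketch; the `build_prompt` helper is hypothetical and not part of this repository, but it follows the prompt format described above (schema, then `-- <question>`, ending in `SELECT `):

```python
# Hypothetical helper: assemble a prompt in the format this model expects.
def build_prompt(schema: str, question: str) -> str:
    # Schema first, then the natural language query after "--",
    # ending with "SELECT " so the model completes the statement.
    return f"{schema.strip()}\n\n-- {question.strip()}\nSELECT "

schema = """CREATE TABLE students (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER,
    grade TEXT
);"""

prompt = build_prompt(schema, "Find the three youngest students")
print(prompt.endswith("SELECT "))  # True
```

The trailing space after `SELECT` matters: the model's completion is appended directly to the prompt.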

The model response is the finished SQL query, without the leading **SELECT**:
```sql
* FROM students ORDER BY age ASC LIMIT 3
```

# Caveats

* The performance drop from the original model is due to quantization itself and the lack of beam search support in the llama.cpp framework. Use at your own discretion.
* The 2-bit and 3-bit quantizations suffer from repetitive and irrelevant output tokens, and are therefore not recommended.