
Better TAMA Models with Limited Data

In [1], we show that competitive performance on table tasks can be achieved with a limited amount of instruction tuning data. This compact setup enables quick instruction tuning on top of advanced base models.

We present TAMA models built on Qwen2.5 and Qwen3. These models achieve strong results on the MMTU benchmark [2], outperforming recent table reasoning models [3] and competitive table LLMs such as TableGPT2 [4], which is tuned on 2.36M data points.

Notably, TAMA-QWen3 achieves the best overall score of 33.9, surpassing Qwen3-8B (32.9) and TableGPT2-7B (30.0).

| Model | Paper Source | Training Corpora Size | Base Model | Model Size | Overall | Table Understanding and QA | Table Transformation and Manipulation | Entity and Schema Matching | SQL and Table Navigation | Semantic Analysis and Relationships | Cell and Column Annotation | Error Detection | Formula Prediction |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TableLlama | TableLlama: Towards Open Large Generalist Models for Tables | 2M | Yukang/Llama-2-7b-longlora-8k | 7B | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| TableLLM | TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios | 309K | codellama/CodeLlama-7b-Instruct-hf | 7B | 2.5 | 2.9 | 0.1 | 16.6 | 2.3 | 1.4 | 2.6 | 0.0 | 0.0 |
| TableBenchLLM-Llama-3.1-8B | TableBench: A Comprehensive and Complex Benchmark for Table Question Answering | 20K | meta-llama/Llama-3.1-8B | 8B | 3.4 | 0.1 | 3.1 | 23.3 | 0.8 | 2.1 | 0.0 | 0.0 | 0.0 |
| Llama-3.1-8B-Instruct | The Llama 3 Herd of Models | - | meta-llama/Llama-3.1-8B-Instruct | 8B | 25.3 | 38.1 | 17.1 | 67.8 | 26.2 | 28.0 | 24.9 | 4.5 | 0.1 |
| TAMA-vB | Rethinking Table Instruction Tuning | 2.6K | meta-llama/Llama-3.1-8B-Instruct | 8B | 21.1 | 30.9 | 15.2 | 73.7 | 14.3 | 20.9 | 18.7 | 5.9 | 0.1 |
| TAMA-vA | Rethinking Table Instruction Tuning | 2.6K | meta-llama/Llama-3.1-8B-Instruct | 8B | 16.9 | 22.6 | 13.5 | 41.8 | 12.7 | 21.1 | 17.1 | 6.3 | 0.1 |
| Qwen2.5-7B | Qwen2.5 Technical Report | - | Qwen/Qwen2.5-7B-Instruct | 7B | 28.5 | 38.5 | 18.6 | 87.3 | 30.5 | 32.3 | 23.2 | 7.9 | 0.3 |
| TableGPT2-7B | TableGPT2: A Large Multimodal Model with Tabular Data Integration | 2.36M | Qwen/Qwen2.5-7B-Instruct | 7B | 30.0 | 42.6 | 22.9 | 84.9 | 31.1 | 32.4 | 24.7 | 14.8 | 0.3 |
| Table-R1-Zero-8B | Table-R1: Inference-Time Scaling for Table Reasoning | 48.6K | meta-llama/Llama-3.1-8B-Instruct | 8B | 26.6 | 39.4 | 19.6 | 66.0 | 26.2 | 31.0 | 25.9 | 3.8 | 0.2 |
| TAMA-QWen2.5 | Rethinking Table Instruction Tuning | 2.6K | Qwen/Qwen2.5-7B-Instruct | 7B | 27.6 | 36.5 | 17.1 | 86.8 | 29.9 | 30.8 | 23.6 | 8.4 | 0.2 |
| Qwen3-8B | Qwen3 Technical Report | - | Qwen/Qwen3-8B | 8B | 32.9 | 37.4 | 25.7 | 83.2 | 28.9 | 38.9 | 27.6 | 14.2 | 0.6 |
| TAMA-QWen3 | Rethinking Table Instruction Tuning | 2.6K | Qwen/Qwen3-8B | 8B | **33.9** | 38.2 | 24.8 | 83.0 | 29.3 | 38.3 | 26.6 | 13.9 | 0.6 |

Evaluation Details

We adopt the official MMTU evaluation script to compute scores. The overall score is computed with the evaluation function described here; each category score is the arithmetic mean of the per-dataset scores within that category. For Qwen3-8B and TAMA-QWen3, thinking mode was disabled.
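The category aggregation above can be sketched in a few lines. This is a minimal illustration of the arithmetic-mean step only; the per-dataset numbers and category names below are hypothetical placeholders, not actual MMTU outputs, and the real per-dataset scores come from the official MMTU evaluation script.

```python
# Sketch of category-score aggregation: each category score is the
# arithmetic mean of the per-dataset scores in that category.
# All numbers here are illustrative placeholders.
from statistics import mean

# Hypothetical per-dataset scores, grouped by MMTU category.
per_dataset_scores = {
    "Table Understanding and QA": [0.40, 0.35, 0.39],
    "Entity and Schema Matching": [0.85, 0.81],
    "Error Detection": [0.14, 0.13],
}

# Arithmetic mean across the datasets of each category.
category_scores = {
    category: mean(scores)
    for category, scores in per_dataset_scores.items()
}

for category, score in category_scores.items():
    # Report on a 0-100 scale, as in the leaderboard table.
    print(f"{category}: {score * 100:.1f}")
```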


References

[1] Rethinking Table Instruction Tuning
[2] MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
[3] Table-R1: Inference-Time Scaling for Table Reasoning
[4] TableGPT2: A Large Multimodal Model with Tabular Data Integration

Model: MichiganNLP/TAMA-QWen3 · 8.19B params · BF16