Li Tan PRO
tanliboy
Recent Activity
liked
a model
about 2 months ago
deepseek-ai/DeepSeek-R1
updated
a model
2 months ago
tanliboy/Qwen2.5-14B-Instruct-1M-AWQ
published
a model
2 months ago
tanliboy/Qwen2.5-14B-Instruct-1M-AWQ
tanliboy's activity
What is your "continuous finetuning"?
1
7
#2 opened 6 months ago
by
MaziyarPanahi

Batch Inference causes degraded performance
1
3
#43 opened 8 months ago
by
tanliboy

Scorecard on popular benchmarks
2
#2 opened 7 months ago
by
tanliboy

Phi-2-Instruct-APO: aligned with Anchored Preference Optimization
16
#3 opened 7 months ago
by
rasyosef
Preference Alignment
4
#6 opened 7 months ago
by
tanliboy

Text Classification with LLMs
7
#30 opened 8 months ago
by
dss107
Qwen 2.5 1.5B retrain?
1
5
#12 opened 7 months ago
by
tomaarsen

GSM8K Evaluation Result: 84.5 vs. 76.95
17
#81 opened 9 months ago
by
tanliboy

Finetuning script using HuggingFace (No llama-factory)
11
39
#32 opened 7 months ago
by
2U1
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8
#120 opened 8 months ago
by
erildo
Have you deleted your GitHub page?
7
#10 opened 7 months ago
by
xwzy6
Sliding window vs. Global Attention
6
#41 opened 8 months ago
by
tanliboy

Gemma2-2b training uses much more memory!
2
#23 opened 8 months ago
by
bubbleseller
GemmaSdpaAttention vs GemmaAttention
2
#71 opened 8 months ago
by
canqin001
Fix Llama 3.1 Chat Template to Properly Handle add_generation_prompt
1
9
#26 opened 8 months ago
by
Tostino
🍭 Fine-tuning support for Qwen2-VL-7B-Instruct
5
#1 opened 8 months ago
by
study-hjt

How is this dataset supposed to be used to evaluate the model?
4
#1 opened 8 months ago
by
realdanielbyrne
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
2
#18 opened 9 months ago
by
lcahill
Llama-3-Instruct with Langchain keeps talking to itself
5
11
#147 opened 10 months ago
by
fahim9778

Pruning
7
#24 opened 8 months ago
by
dhivakarsa