Qian Liu

SivilTaram

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Recent Activity

published a model 1 day ago

SivilTaram/tongyao_models_v3

updated a model 5 days ago

SivilTaram/tongyao_models_v2

commented a paper 15 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published 15 days ago • 40

commented a paper 28 days ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published about 1 month ago • 10

New activity in sail/Sailor2-20B 3 months ago

Improve language tag

#4 opened 3 months ago by

commented a paper 4 months ago

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Paper • 2503.15450 • Published Mar 19 • 12

New activity in OpenCoder-LLM/opc-sft-stage1 9 months ago

License

#5 opened 9 months ago by

New activity in OpenCoder-LLM/opc-annealing-corpus 9 months ago

License

#3 opened 9 months ago by

New activity in OpenCoder-LLM/opc-fineweb-code-corpus 9 months ago

Code elements inside web page are badly processed for FineWeb

#2 opened 9 months ago by

commented a paper 10 months ago

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Paper • 2410.07137 • Published Oct 9, 2024 • 8

New activity in SivilTaram/starcoder2-documentation 10 months ago

release plan for the rest of the-stack-v2-train-extras

#2 opened 10 months ago by

New activity in microsoft/tapex-large-finetuned-wtq 11 months ago

is it possible to support multiple languages, like Chinese?

#5 opened about 1 year ago by

New activity in bigcode/the-stack-v2 11 months ago

"Documentation" data?

#8 opened over 1 year ago by

Where is the-stack-v2-train-extras?

#17 opened over 1 year ago by

question about starcoder 2 jupyter notebook conversion

#29 opened about 1 year ago by

commented a paper 12 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

commented 2 papers about 1 year ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

New activity in sail/regmix-data about 1 year ago

[bot] Conversion to Parquet

#1 opened about 1 year ago by parquet-converter

New activity in sail/regmix-data-sample about 1 year ago

[bot] Conversion to Parquet

#1 opened about 1 year ago by parquet-converter

commented 2 papers about 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1, 2024 • 41
