- Cached Transformers: Improving Transformers with Differentiable Memory Cache
  Paper • 2312.12742 • Published • 14
- ProTIP: Progressive Tool Retrieval Improves Planning
  Paper • 2312.10332 • Published • 8
- Paloma: A Benchmark for Evaluating Language Model Fit
  Paper • 2312.10523 • Published • 13
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  Paper • 2406.17557 • Published • 93
daje kang
daje
AI & ML interests
None yet
Recent Activity
updated a model about 8 hours ago: daje/model_Lora
published a model about 8 hours ago: daje/model_Lora
updated a model about 8 hours ago: daje/model_2e-4
Organizations
None yet
Collections
1
models
33
daje/model_Lora • Updated • 2
daje/model_2e-4 • Updated • 4
daje/model • Updated • 1
daje/Qwen2-7B-Instruct-harmful_detector_2000-H100_1 • Updated • 13
daje/Qwen2-VL-7B-instruct-ScienceQA • Updated • 9
daje/qwen2-7b-instruct-harmful-detector-8500 • Image-Text-to-Text • Updated • 12
daje/qwen2-7b-instruct-hamful-detector • Image-Text-to-Text • Updated • 8
daje/Qwen2.5-coder-7B-en-all-merged • Text Generation • Updated • 16
daje/Qwen2.5-coder-7B-ko-all • Updated
daje/llama3-8B-ko-all • Updated
datasets
13
daje/de-identify-chat-ko • Viewer • Updated • 9.92k • 76
daje/ko-hatefulmemes_train_8500 • Viewer • Updated • 8.2k • 122
daje/ko-hatefulmemes_train_8500_kmhas • Viewer • Updated • 95.3k • 60
daje/ko-hatefulmemes_train_2000 • Viewer • Updated • 1.91k • 49
daje/Ko-SciecneQA • Viewer • Updated • 12.7k • 48
daje/keyword_summary • Viewer • Updated • 1k • 157
daje/kotext-to-sql-v1 • Viewer • Updated • 262k • 104 • 2
daje/mistral_tokenized_en_wiki • Viewer • Updated • 16.1M • 238
daje/mistral_tokenized_ko_wiki • Viewer • Updated • 1.7M • 70
daje/tokenized_enwiki • Viewer • Updated • 16.4M • 115