Jason Wolosonovich
wolosonovich
AI & ML interests
None yet
Organizations
Legal LLMs
- Equall/perplexity_evaluation
  Viewer • Updated • 3.13k • 31 • 3
- Equall/Saul-7B-Base
  Text Generation • 7B • Updated • 90 • 32
- Equall/Saul-7B-Instruct-v1
  Text Generation • 7B • Updated • 1.33k • 98
- SaulLM-7B: A pioneering Large Language Model for Law
  Paper • 2403.03883 • Published • 88
Research
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 19
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 38
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 56
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 23
models
0
None public yet
datasets
0
None public yet