Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models • Paper • 2504.03624 • Published 11 days ago • 7
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset • Paper • 2412.02595 • Published Dec 3, 2024 • 3
Building a Large Japanese Web Corpus for Large Language Models • Paper • 2404.17733 • Published Apr 27, 2024 • 4
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities • Paper • 2404.17790 • Published Apr 27, 2024 • 5
Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese • Paper • 2404.07824 • Published Apr 11, 2024 • 3