Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2504.00824

Tavernari/git-commit-message

Summarization • Updated 28 days ago • 549 • 13
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published 6 days ago • 37

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 96
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 42
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 37
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 51
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 67
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 49

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 29
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published 6 days ago • 37

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 57

about 20 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 28
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 41
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 53
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

Paper • 2401.12474 • Published Jan 23, 2024 • 36
More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3, 2024 • 53
VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Paper • 2403.10517 • Published Mar 15, 2024 • 35
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 118

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs