Collections

Collections including paper arxiv:2503.14456

about 12 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 43
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

about 2 hours ago

fla-hub/rwkv7-2.9B-world

Text Generation • Updated 4 days ago • 588 • 4
fla-hub/rwkv7-1.5B-world

Text Generation • Updated 5 days ago • 632 • 9
fla-hub/rwkv7-191M-world

Text Generation • Updated 5 days ago • 331 • 1
fla-hub/rwkv7-168M-pile

Text Generation • Updated 5 days ago • 136 • 5

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 26
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 85
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 21
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13 • 7

Trellis Networks for Sequence Modeling

Paper • 1810.06682 • Published Oct 15, 2018 • 1
ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-like Language Models

Paper • 2311.01981 • Published Nov 3, 2023 • 1
Gated recurrent neural networks discover attention

Paper • 2309.01775 • Published Sep 4, 2023 • 10
Inverse Approximation Theory for Nonlinear Recurrent Neural Networks

Paper • 2305.19190 • Published May 30, 2023 • 1

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published 19 days ago • 215
Transformers without Normalization

Paper • 2503.10622 • Published 11 days ago • 133
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published 6 days ago • 127
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 10 days ago • 117

about 16 hours ago

RuCCoD: Towards Automated ICD Coding in Russian

Paper • 2502.21263 • Published 24 days ago • 123
Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published 17 days ago • 107
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published 17 days ago • 43
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Paper • 2503.05592 • Published 17 days ago • 25

RWKV-7 Goose related resources.

Goose-World/RWKV-World-v3

Viewer • Updated 5 days ago • 1.1M • 589 • 1
BlinkDL/rwkv-7-world

Text Generation • Updated Feb 10 • 88
BlinkDL/rwkv-7-pile

Updated Dec 19, 2024 • 15
RWKV 7

🌏

best foundation model for its size!

about 11 hours ago

Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM

Paper • 2502.06635 • Published Feb 10 • 4
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 208
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 94
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 276

interesting papers

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 126
Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 22
Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12 • 46
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published Feb 12 • 28

Daily Research Papers

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 58
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published 6 days ago • 127
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published 5 days ago • 42
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published 6 days ago • 35
