Hogwild! Inference: Parallel LLM Generation via Concurrent Attention • Paper • 2504.06261 • Published 7 days ago • 97
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence • Paper • 2503.20533 • Published 20 days ago • 11
bloom-testing/test-bloomd-560m-37ba0c084a0d6bf37b9b592932523768eb3ad4307f57cb200b6c5f9ca3c7ac56 • Text Generation • Updated Jun 16, 2023 • 2
bloom-testing/test-bloomd-560m-db788ae2594f597e839fb48fedb0895f04d853006df99f79d446b6b29c715eb7 • Text Generation • Updated Jun 15, 2023 • 10
bloom-testing/test-bloomd-560m-006afb25d79d1a06fd2be5e9451dc43038acc5bc26b803b9d7ce3b7f698af77e • Text Generation • Updated Jun 10, 2023 • 2