To read once X
Paper • 2401.15024 • Published • 69
Note: Apply a transformation Q around the weights. Since this transformation does not affect the nonlinearity, it may be used to sparsify and then be incorporated into the W matrix without adding errors.
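A minimal sketch of the invariance the note above relies on. It assumes Q is orthogonal and that the nonlinearity is a scale-free RMSNorm; the note states neither. The matrices, the 2x2 rotation, and all helper names are made up for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def rmsnorm(rows):
    """Scale-free RMSNorm applied to each row (assumes no all-zero row)."""
    out = []
    for r in rows:
        rms = math.sqrt(sum(v * v for v in r) / len(r))
        out.append([v / rms for v in r])
    return out

def rotation(theta):
    """A 2x2 rotation matrix, used here as a toy orthogonal Q."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical toy data: two inputs and two linear layers around the norm.
X = [[1.0, 2.0], [-0.5, 3.0]]
W_in = [[0.3, -1.2], [0.7, 0.4]]
W_out = [[1.5, 0.2], [-0.6, 0.9]]
Q = rotation(0.8)

# Original computation: X -> W_in -> RMSNorm -> W_out.
reference = matmul(rmsnorm(matmul(X, W_in)), W_out)

# Fold Q into the weights on both sides of the norm.
W_in_rot = matmul(W_in, Q)
W_out_rot = matmul(transpose(Q), W_out)
rotated = matmul(rmsnorm(matmul(X, W_in_rot)), W_out_rot)

# Largest element-wise difference between the two outputs.
max_diff = max(
    abs(a - b) for ra, rb in zip(reference, rotated) for a, b in zip(ra, rb)
)
```

Because an orthogonal Q preserves each row's norm, RMSNorm(xQ) equals RMSNorm(x)Q, so the Q folded into `W_in` cancels against the Q transposed folded into `W_out`, and `max_diff` is zero up to floating-point rounding.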
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Paper • 2401.14112 • Published • 18
Note: To address these problems, we propose TC-FPx, the first full-stack GPU kernel design scheme with unified Tensor Core support of float-point weights for various quantization bit-width. We integrate TC-FPx kernel into an existing inference system, providing new end-to-end support (called FP6-LLM) for quantized LLM inference, where better trade-offs between inference cost and model quality are achieved. https://github.com/usyd-fsalab/fp6_llm
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Paper • 2401.13919 • Published • 26
Note: https://github.com/MinorJerry/WebVoyager Useful for automation (UiPath-type things).
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
Paper • 2401.14257 • Published • 10
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Paper • 2401.11067 • Published • 16
Note: Voxel-powered encoder-decoder.
Single-View 3D Human Digitalization with Large Reconstruction Models
Paper • 2401.12175 • Published • 6
WARM: On the Benefits of Weight Averaged Reward Models
Paper • 2401.12187 • Published • 18
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 602
Note: Amazing one-bit paper. This is not quantisation; the model is built with 1-2 bit layers.
Towards Optimal Learning of Language Models
Paper • 2402.17759 • Published • 16
Note: I think this paper is worth reading because it is compatible with CompactifAI. They also have the code, but it seems to be applied only to a toy model: https://arxiv.org/pdf/2402.17759.pdf
Humanoid Locomotion as Next Token Prediction
Paper • 2402.19469 • Published • 26
OneBit: Towards Extremely Low-bit Large Language Models
Paper • 2402.11295 • Published • 22
Note: Sign-Value-Independent Decomposition (SVID). I think this can be applied on top of the MPO.
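A rough sketch of what a sign-value-independent decomposition can look like. It assumes the form W ≈ sign(W) ⊙ (a bᵀ), with the rank-1 factors of |W| found by power iteration. The note does not spell out the decomposition, and the paper's own factorisation method may differ. All function names and the toy matrix are hypothetical.

```python
import math

def sign(v):
    """Sign with zero mapped to +1, so every entry fits in one bit."""
    return 1.0 if v >= 0 else -1.0

def rank1_of_abs(W, iters=50):
    """Rank-1 factors (a, b) with |W| ~ a b^T, via power iteration.

    Assumes W is not all zeros (otherwise the normalisation divides by zero).
    """
    A = [[abs(v) for v in row] for row in W]
    n_rows, n_cols = len(A), len(A[0])
    b = [1.0] * n_cols
    for _ in range(iters):
        a = [sum(x * y for x, y in zip(row, b)) for row in A]  # a = A b
        norm_a = math.sqrt(sum(v * v for v in a))
        a = [v / norm_a for v in a]                             # unit-length a
        b = [sum(A[i][j] * a[i] for i in range(n_rows)) for j in range(n_cols)]  # b = A^T a
    return a, b

def svid(W):
    """Split W into a +/-1 sign matrix and two value vectors."""
    S = [[sign(v) for v in row] for row in W]
    a, b = rank1_of_abs(W)
    return S, a, b

def reconstruct(S, a, b):
    """Rebuild the approximation sign(W) * (a b^T) element by element."""
    return [[S[i][j] * a[i] * b[j] for j in range(len(b))] for i in range(len(a))]

# Toy matrix whose absolute values are exactly rank 1, so the
# reconstruction should match W up to floating-point rounding.
W = [[0.8, -0.4, 0.2], [-1.6, 0.8, -0.4]]
S, a, b = svid(W)
W_hat = reconstruct(S, a, b)
max_err = max(abs(x - y) for rw, rh in zip(W, W_hat) for x, y in zip(rw, rh))
```

For a general matrix the rank-1 value part is only an approximation, so `max_err` would be nonzero. The toy matrix is chosen so that the split is exact.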
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
Paper • 2402.11550 • Published • 15
Note: The leader takes the decision.