Chenhui Zhang PRO
danielz01
AI & ML interests
MIT IDSS | Illinois CS '23 | ML for Remote Sensing & Climate Change | Trustworthy ML
Recent Activity
liked a model about 1 month ago: ByteDance-Seed/BAGEL-7B-MoT
updated a dataset about 2 months ago: danielz01/deepfried-dd
Agents
LLM Tuning
- Self-Rewarding Language Models
Paper • 2401.10020 • Published • 150
- Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 24
- ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 32
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • 2401.16380 • Published • 51
Model Editing
ViT
Efficient LLM
- FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 37
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 32
- Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
Paper • 2311.06243 • Published • 22
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Paper • 2311.05908 • Published • 16
Image Generation
- Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Paper • 2401.11605 • Published • 23
- Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Paper • 2401.11708 • Published • 31
- Multi-LoRA Composition for Image Generation
Paper • 2402.16843 • Published • 33
- Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Paper • 2404.02905 • Published • 71
Uncertainty
Synthetic Data
VLFM
- Kosmos-2.5: A Multimodal Literate Model
Paper • 2309.11419 • Published • 50
- Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
Paper • 2311.05698 • Published • 14
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Paper • 2311.06242 • Published • 94
- PolyMaX: General Dense Prediction with Mask Transformer
Paper • 2311.05770 • Published • 11