michael
netzkontrast
AI & ML interests
None yet
Organizations
None yet
Speech
Lora
Image
- Customizing Text-to-Image Models with a Single Image Pair
  Paper • 2405.01536 • Published • 23
- Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models
  Paper • 2404.03913 • Published
- LCM-Lookahead for Encoder-based Text-to-Image Personalization
  Paper • 2404.03620 • Published • 1
- Customizing Text-to-Image Diffusion with Camera Viewpoint Control
  Paper • 2404.12333 • Published • 1
LLMs
- Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
  Paper • 2501.03916 • Published • 16
- Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
  Paper • 2501.04682 • Published • 99
- Agent Laboratory: Using LLM Agents as Research Assistants
  Paper • 2501.04227 • Published • 92
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
  Paper • 2501.05366 • Published • 100
Performance
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
  Paper • 2402.09025 • Published • 9
- Shortened LLaMA: A Simple Depth Pruning for Large Language Models
  Paper • 2402.02834 • Published • 17
- Algorithmic progress in language models
  Paper • 2403.05812 • Published • 21
Video
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
  Paper • 2405.01434 • Published • 57
- TransPixar: Advancing Text-to-Video Generation with Transparency
  Paper • 2501.03006 • Published • 27
- CPA: Camera-pose-awareness Diffusion Transformer for Video Generation
  Paper • 2412.01429 • Published
- Ingredients: Blending Custom Photos with Video Diffusion Transformers
  Paper • 2501.01790 • Published • 8
music