Aisha Halder
Ahalder
AI & ML interests
AI & ML, Networking, P2P
Organizations
None yet
Embedding
Multimodal
Image generation
- UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion
  Paper • 2401.13388 • Published • 12
- BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
  Paper • 2401.13974 • Published • 14
- Real ESRGAN 🏃
  Runtime error • 420
- Vchitect/Vchitect-2.0-2B
  Text-to-Video • Updated • 9 • 39
NLP LLM
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 17
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
  Text Classification • 0.1B • Updated • 3.24M • 790
- FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
  Paper • 2401.14112 • Published • 21
- GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
  Paper • 2401.04092 • Published • 22
Games
Video generation
Recognition
Time series
SLM
Image Processing
Dataset
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
  Paper • 2401.16380 • Published • 51
- Nfiniteai/product-masks-sample
  Viewer • Updated • 2.71k • 24 • 14
- HuggingFaceFV/finevideo
  Viewer • Updated • 39.5k • 3.53k • 317
- rulins/MassiveDS-140B
  Viewer • Updated • 3.08M • 1.69k • 7
Speech and Audio
- facebook/wav2vec2-base-960h
  Automatic Speech Recognition • 0.1B • Updated • 998k • 357
- ChatMusician: Understanding and Generating Music Intrinsically with LLM
  Paper • 2402.16153 • Published • 61
- EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer
  Paper • 2409.10819 • Published • 20
- jadechoghari/openmusic
  Text-to-Audio • Updated • 55 • 67
Segmentation
RAG & Querying
papers
- Dailypapershackernews 📈
  Runtime error • 81
- Prithvi WxC: Foundation Model for Weather and Climate
  Paper • 2409.13598 • Published • 45
- TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles
  Paper • 2410.05262 • Published • 11
- Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
  Paper • 2410.15316 • Published • 12
Agent