AI at Meta

Enterprise

company

Verified

https://ai.facebook.com/

facebookresearch

Recent Activity

htranx updated a Space 3 days ago

facebook/wearable-ai-leaderboard

kunhe001 updated a dataset 3 days ago

facebook/show3d-dataset

pierrechambon submitted a paper 4 days ago

Reinforcement Learning for Code Optimization

Papers

Reinforcement Learning for Code Optimization

Two-Level Meta-Rubrics for Evaluating Open-Ended Generation: GAMUT, a Benchmark for Factual Completeness

facebook's collections 47

Sapiens2

facebook/sapiens2

Updated May 15 • 156
Sapiens2 Pose 🧬

Running on Zero • Agents • 14 • Detect and visualize detailed human poses in images
Sapiens2 Seg 🧩

Running on Zero • Agents • Featured • 45 • Segment human body parts in an image (29 classes)
Sapiens2 Normal 🧊

Running on Zero • Agents • Featured • 21 • Generate per-pixel surface normal maps from photos

LagerNVS

Latent Geometry for Fully Neural Real-time Novel View Synthesis

facebook/lagernvs_general_512

Updated May 21 • 1.79k • 30
facebook/lagernvs_re10k_2v_256

Updated Mar 26 • 128 • 6
facebook/lagernvs_dl3dv_2-6_v_256

Updated Mar 26 • 87 • 6

perception-encoder-audio-visual

facebook/pe-av-small

0.8B • Updated Jan 29 • 3.42k • 24
facebook/pe-av-base

1B • Updated Jan 29 • 507 • 13
facebook/pe-av-large

2B • Updated Jan 29 • 33.8k • 65
facebook/pe-av-small-16-frame

0.8B • Updated Jan 29 • 124 • 6

Pixio

facebook/pixio-vitb16

Image Feature Extraction • 85.9M • Updated Dec 26, 2025 • 458 • 15
facebook/pixio-vitl16

Image Feature Extraction • 0.3B • Updated Dec 26, 2025 • 14 • 9
facebook/pixio-vith16

Image Feature Extraction • 0.6B • Updated Dec 29, 2025 • 28 • 6
facebook/pixio-vit1b16

Image Feature Extraction • 1B • Updated Dec 26, 2025 • 1.39k • 8

Meta CLIP 2

facebook/metaclip-2-worldwide-huge-quickgelu

Zero-Shot Image Classification • 2B • Updated Aug 18, 2025 • 23.7k • 18
facebook/metaclip-2-worldwide-huge-378

Zero-Shot Image Classification • 2B • Updated Aug 18, 2025 • 13.2k • 7
facebook/metaclip-2-worldwide-giant

Zero-Shot Image Classification • 4B • Updated Aug 18, 2025 • 2.81k • 8
facebook/metaclip-2-worldwide-giant-378

Zero-Shot Image Classification • 4B • Updated Aug 18, 2025 • 23.9k • 13

MobileLLM-Pro

facebook/MobileLLM-Pro

Text Generation • 1B • Updated Nov 11, 2025 • 132 • 164
facebook/MobileLLM-Pro-base

Text Generation • 1B • Updated Nov 11, 2025 • 5.01k • 11
facebook/MobileLLM-Pro-base-int4-cpu

Text Generation • Updated Nov 11, 2025 • 2 • 4
facebook/MobileLLM-Pro-base-int4-accelerator

Text Generation • Updated Nov 11, 2025 • 2 • 2

MobileLLM-R1

MobileLLM-R1, a series of sub-billion parameter reasoning models

facebook/MobileLLM-R1-950M

Text Generation • 0.9B • Updated Sep 30, 2025 • 265 • 359
facebook/MobileLLM-R1-360M

Text Generation • 0.4B • Updated Nov 10, 2025 • 190 • 23
facebook/MobileLLM-R1-140M

Text Generation • 0.1B • Updated Nov 10, 2025 • 117 • 37
facebook/MobileLLM-R1-950M-base

Text Generation • 0.9B • Updated Sep 30, 2025 • 29 • 20

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104

facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated Aug 19, 2025 • 14.6k • 241
facebook/dinov3-vits16-pretrain-lvd1689m

Image Feature Extraction • 21.6M • Updated Aug 19, 2025 • 359k • 144
facebook/dinov3-convnext-small-pretrain-lvd1689m

Image Feature Extraction • 49.5M • Updated Aug 19, 2025 • 21.5k • 31
facebook/dinov3-vitb16-pretrain-lvd1689m

Image Feature Extraction • 85.7M • Updated Aug 19, 2025 • 767k • 201

Meta CLIP 1

Scaling CLIP data with transparent training distribution from an end-to-end pipeline.

facebook/metaclip-h14-fullcc2.5b

Zero-Shot Image Classification • 1.0B • Updated Jan 11, 2024 • 6.77k • 49
facebook/metaclip-l14-fullcc2.5b

Zero-Shot Image Classification • Updated Oct 14, 2023 • 1.06k • 7
facebook/metaclip-b16-fullcc2.5b

Zero-Shot Image Classification • Updated Oct 14, 2023 • 2.16k • 11
facebook/metaclip-b32-fullcc2.5b

Zero-Shot Image Classification • Updated Oct 8, 2023 • 264 • 9

Web-SSL

facebook/webssl-dino300m-full2b-224

Image Feature Extraction • 0.3B • Updated Apr 24, 2025 • 3.98k • 12
facebook/webssl-dino1b-full2b-224

Image Feature Extraction • 1B • Updated Apr 24, 2025 • 1.95k • 3
facebook/webssl-dino2b-full2b-224

Image Feature Extraction • 2B • Updated Apr 24, 2025 • 20
facebook/webssl-dino3b-full2b-224

Image Feature Extraction • 3B • Updated Apr 24, 2025 • 706

Perception LM

facebook/Perception-LM-1B

Image-Text-to-Text • 2B • Updated Aug 13, 2025 • 2.11k • 45
facebook/Perception-LM-3B

Image-Text-to-Text • 4B • Updated Aug 13, 2025 • 18 • 24
facebook/Perception-LM-8B

Image-Text-to-Text • 10B • Updated Jul 14, 2025 • 675 • 67
facebook/PLM-VideoBench

Viewer • Updated May 21, 2025 • 44k • 655 • 13

FAIR Chemistry

facebook/OMAT24

Updated Oct 31, 2025 • 102
facebook/OMAT24

Preview • Updated Dec 11, 2025 • 268 • 75
facebook/OMol25

Updated Jul 1 • 248
facebook/UMA

Updated 17 days ago • 118 • 319

Meta Motivo

A first-of-its-kind behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.

facebook/metamotivo-S-1

24.5M • Updated Dec 12, 2024 • 1.51k • 11
facebook/metamotivo-S-2

24.5M • Updated Dec 12, 2024 • 5 • 2
facebook/metamotivo-S-3

24.5M • Updated Dec 12, 2024 • 5 • 2
facebook/metamotivo-S-4

24.5M • Updated Dec 12, 2024 • 8 • 2

Sparsh

Models and datasets for Sparsh: Self-supervised touch representations for vision-based tactile sensing

facebook/sparsh-dino-base

Updated Oct 21, 2024 • 6
facebook/sparsh-dino-small

Updated Oct 21, 2024 • 1
facebook/sparsh-mae-base

Updated Oct 21, 2024 • 1
facebook/sparsh-mae-small

Updated Oct 21, 2024 • 1

MelodyFlow

MelodyFlow: High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Paper • 2407.03648 • Published Jul 4, 2024 • 20
facebook/melodyflow-t24-30secs

Updated Oct 23, 2024 • 31
MelodyFlow 🎵

Running on Zero • Agents • 166 • Generate music from text and optional melody

MAGNeT

Masked Audio Generation using a Single Non-Autoregressive Transformer

Masked Audio Generation using a Single Non-Autoregressive Transformer

Paper • 2401.04577 • Published Jan 9, 2024 • 44
facebook/magnet-small-10secs

Text-to-Audio • Updated Jan 16, 2024 • 144 • 25
facebook/magnet-medium-10secs

Text-to-Audio • Updated Jan 16, 2024 • 61 • 9
facebook/magnet-small-30secs

Text-to-Audio • Updated Jan 16, 2024 • 43 • 9

SeamlessM4T

SeamlessM4T is designed to provide high quality translation, allowing people from different linguistic communities to communicate effortlessly.

Seamless M4T 📞

Runtime error • Agents • Featured • 949
facebook/hf-seamless-m4t-large

Text-to-Speech • Updated Dec 8, 2023 • 6.73k • 62
facebook/hf-seamless-m4t-medium

Text-to-Speech • Updated Dec 8, 2023 • 105k • 32
facebook/seamless-m4t-large

Automatic Speech Recognition • Updated Dec 14, 2023 • 514

XLS-R

First release checkpoints for XLS-R, a large-scale model for cross-lingual speech representation learning based on wav2vec 2.0.

facebook/wav2vec2-xls-r-300m

Updated Aug 10, 2022 • 1.55M • 128
facebook/wav2vec2-xls-r-1b

Updated Aug 10, 2022 • 7.01k • 34
facebook/wav2vec2-xls-r-2b

Updated Aug 10, 2022 • 1.83k • 46
facebook/wav2vec2-xls-r-300m-en-to-15

Automatic Speech Recognition • Updated Jan 26, 2023 • 3 • 6

VoxPopuli

A collection of open-source artefacts (datasets + checkpoints) from the first VoxPopuli release.

facebook/voxpopuli

Viewer • Updated Jan 30 • 1.26M • 23.2k • 157
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Paper • 2101.00390 • Published Jan 2, 2021 • 1
facebook/wav2vec2-base-100k-voxpopuli

Automatic Speech Recognition • Updated Nov 5, 2021 • 80 • 4
facebook/wav2vec2-base-10k-voxpopuli-ft-cs

Automatic Speech Recognition • Updated Jul 6, 2021 • 6

HuBERT

A collection of checkpoints from the HuBERT release, a speech encoder that learns powerful representations from unlabelled audio data.

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Paper • 2106.07447 • Published Jun 14, 2021 • 4
facebook/hubert-base-ls960

Feature Extraction • Updated Nov 5, 2021 • 892k • 74
facebook/hubert-large-ll60k

Feature Extraction • Updated Nov 5, 2021 • 74k • 36
facebook/hubert-large-ls960-ft

Automatic Speech Recognition • Updated May 24, 2022 • 164k • 77

DINOv2

DINOv2: foundation models producing robust visual features suitable for image-level and pixel-level visual tasks - https://arxiv.org/abs/2304.07193

facebook/dinov2-small

Image Feature Extraction • 22.1M • Updated Sep 6, 2023 • 3.95M • 70
facebook/dinov2-base

Image Feature Extraction • 86.6M • Updated Jan 17, 2024 • 2.46M • 188
facebook/dinov2-large

Image Feature Extraction • 0.3B • Updated Sep 6, 2023 • 906k • 115
facebook/dinov2-giant

Image Feature Extraction • 1B • Updated Sep 6, 2023 • 273k • 62

LLM Compiler

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning.

facebook/llm-compiler-7b

Text Generation • Updated Jun 27, 2024 • 2 • 142
facebook/llm-compiler-7b-ftd

Text Generation • Updated Jun 27, 2024 • 25 • 29
facebook/llm-compiler-13b

Text Generation • Updated Jun 27, 2024 • 194 • 90
facebook/llm-compiler-13b-ftd

Text Generation • Updated Jun 27, 2024 • 59

Sapiens

Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22, 2024 • 93
facebook/sapiens

Updated Sep 20, 2024 • 817 • 245
Sapiens Pose 📊

Paused • Agents • 59 • Detect and estimate poses in images
Sapiens Segmentation 🌍

Paused • Agents • 122 • Segment body parts in images

FAIR's LayerSkip Llama models

facebook/layerskip-llama2-7B

Text Generation • 7B • Updated Oct 19, 2024 • 549 • 17
facebook/layerskip-llama2-13B

Text Generation • 13B • Updated Oct 19, 2024 • 21 • 6
facebook/layerskip-codellama-7B

Text Generation • 7B • Updated Oct 19, 2024 • 31 • 6
facebook/layerskip-codellama-34B

Text Generation • 34B • Updated Oct 19, 2024 • 68 • 4

EUPE

facebook/EUPE-ViT-B

Updated Mar 26 • 10.4k • 27
facebook/EUPE-ViT-S

Updated Mar 26 • 3.67k • 10
facebook/EUPE-ViT-T

Updated Mar 27 • 3.85k • 7
facebook/EUPE-ConvNeXt-B

Updated Mar 27 • 66 • 7

MetaDepth

Monocular RGB to 3D Geometry Foundation Models

facebook/hyden-mogev2-metric-point

Depth Estimation • Updated May 20 • 1
facebook/hyden-da2-relative-depth

Depth Estimation • Updated Apr 24 • 8
facebook/hyden-mogev2-surface-normal

Image-to-Image • Updated May 20 • 1

sam-audio

facebook/sam-audio-bench

Viewer • Updated Dec 29, 2025 • 819 • 530 • 71
facebook/sam-audio-musdb18hq-test

Viewer • Updated Dec 17, 2025 • 150 • 298 • 8
facebook/sam-audio-judge

2B • Updated Dec 23, 2025 • 26.7k • 34
facebook/sam-audio-small

Updated Dec 30, 2025 • 4.18k • 91

MobileLLM-R1.5

facebook/MobileLLM-R1.5-950M

Text Generation • 0.9B • Updated Nov 24, 2025 • 181 • 19
facebook/MobileLLM-R1.5-360M

Text Generation • 0.4B • Updated Nov 24, 2025 • 9 • 8
facebook/MobileLLM-R1.5-140M

Text Generation • 0.1B • Updated Nov 24, 2025 • 123 • 11

SAM 3D Body

facebook/sam-3d-body-dataset

Viewer • Updated Nov 19, 2025 • 5.66M • 475 • 66
facebook/sam-3d-body-vith

Updated Dec 10, 2025 • 820 • 79
facebook/sam-3d-body-dinov3

Updated Dec 10, 2025 • 4.39k • 256

SAM3

facebook/SACo-Gold

Viewer • Updated Nov 17, 2025 • 21 • 155 • 23
facebook/SACo-Silver

Viewer • Updated Nov 17, 2025 • 9 • 54 • 7
facebook/SACo-VEval

Viewer • Updated Nov 17, 2025 • 2 • 93 • 4
facebook/SA-FARI

Updated Nov 19, 2025 • 66 • 24

cwm

Collection for Code World Model, an agentic coding model from FAIR.

facebook/cwm

33B • Updated Oct 15, 2025 • 14.8k • 271
facebook/cwm-sft

33B • Updated Oct 15, 2025 • 91 • 22
facebook/cwm-pretrain

33B • Updated Oct 15, 2025 • 141 • 23

Physics of Language Models: Part 4.2

facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-1T-lr0.002

Updated Dec 23, 2025 • 3
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-1T-lr0.003

Updated Dec 23, 2025 • 2
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-2T-lr0.003

Updated Dec 23, 2025 • 3
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-2T-lr0.005

Updated Dec 23, 2025 • 5

V-JEPA 2

A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann

facebook/vjepa2-vitl-fpc64-256

Video Classification • 0.3B • Updated Aug 11, 2025 • 300k • 204
facebook/vjepa2-vith-fpc64-256

Video Classification • 0.7B • Updated Aug 11, 2025 • 1.83k • 20
facebook/vjepa2-vitg-fpc64-256

Video Classification • 1B • Updated Aug 11, 2025 • 219k • 57
facebook/vjepa2-vitg-fpc64-384

Video Classification • 1B • Updated Aug 11, 2025 • 6.8k • 44

blt

facebook/blt

Updated Apr 30, 2025 • 4 • 75
facebook/blt-1b

5B • Updated May 1, 2025 • 457 • 24
facebook/blt-7b

11B • Updated May 1, 2025 • 17 • 62
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 109

Perception Encoder

facebook/PE-Core-L14-336

Zero-Shot Image Classification • Updated Apr 30, 2025 • 360k • 54
facebook/PE-Core-G14-448

Zero-Shot Image Classification • Updated Apr 30, 2025 • 15.8k • 24
facebook/PE-Lang-L14-448

Image Feature Extraction • Updated Apr 30, 2025 • 48 • 8
facebook/PE-Lang-G14-448

Image Feature Extraction • Updated Apr 30, 2025 • 539 • 14

DRAMA

A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages.

facebook/drama-base

Sentence Similarity • 0.2B • Updated Jul 21, 2025 • 751 • 22
facebook/drama-large

Sentence Similarity • 0.4B • Updated Mar 4, 2025 • 120 • 8
facebook/drama-1b

Sentence Similarity • 1B • Updated Mar 4, 2025 • 136 • 15

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 135
facebook/MobileLLM-125M

Text Generation • Updated May 5, 2025 • 2.46k • 135
facebook/MobileLLM-350M

Text Generation • Updated May 5, 2025 • 89 • 36
facebook/MobileLLM-600M

Text Generation • Updated May 5, 2025 • 101 • 30

LayerSkip

Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 82
facebook/layerskip-llama2-7B

Text Generation • 7B • Updated Oct 19, 2024 • 549 • 17
facebook/layerskip-llama2-13B

Text Generation • 13B • Updated Oct 19, 2024 • 21 • 6
facebook/layerskip-llama2-70B

Text Generation • 69B • Updated Nov 3, 2024 • 28 • 5

Seamless Communication

A significant step towards removing language barriers through expressive, fast and high-quality AI translation.

Seamless: Multilingual Expressive and Streaming Speech Translation

Paper • 2312.05187 • Published Dec 8, 2023 • 14
facebook/seamless-m4t-v2-large

Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 286k • 997
Seamless M4T v2 📞

Runtime error • Featured • 517 • Translate speech and text between languages
facebook/seamless-expressive

Text-to-Speech • Updated Jan 4, 2024 • 189

Wav2Vec 2.0

A collection for the first release of Wav2Vec 2.0, a speech encoder that learns powerful representations from unlabelled audio data.

facebook/wav2vec2-large-960h-lv60-self

Automatic Speech Recognition • Updated May 23, 2022 • 572k • 162
facebook/wav2vec2-large-960h

Automatic Speech Recognition • Updated Apr 5, 2022 • 18.2k • 35
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.78M • 401
facebook/wav2vec2-base-100h

Automatic Speech Recognition • Updated May 27, 2022 • 480 • 7

XLSR

A collection of multilingual Wav2Vec 2.0 checkpoints pre-trained on 53 languages and fine-tuned for CTC speech recognition.

facebook/wav2vec2-large-xlsr-53

Updated Mar 18, 2022 • 379k • 163
facebook/wav2vec2-xlsr-53-espeak-cv-ft

Automatic Speech Recognition • Updated Dec 10, 2021 • 444k • 50
facebook/wav2vec2-large-xlsr-53-dutch

Automatic Speech Recognition • Updated Jul 6, 2021 • 1.32k • 3
facebook/wav2vec2-large-xlsr-53-french

Automatic Speech Recognition • Updated Jul 6, 2021 • 518 • 13

Robust Wav2Vec 2.0

A collection of "robust" Wav2Vec 2.0 checkpoints pre-trained on datasets from multiple domains.

facebook/wav2vec2-large-robust

Updated Nov 5, 2021 • 3.29k • 41
facebook/wav2vec2-large-robust-ft-libri-960h

Automatic Speech Recognition • 0.3B • Updated Jun 23, 2023 • 61k • 17
facebook/wav2vec2-large-robust-ft-swbd-300h

Automatic Speech Recognition • Updated Apr 5, 2022 • 35.4k • 20
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

Paper • 2104.01027 • Published Apr 2, 2021 • 2

VoxPopuli v2

A collection of checkpoints from the second VoxPopuli release.

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Paper • 2101.00390 • Published Jan 2, 2021 • 1
facebook/wav2vec2-base-bg-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 9 • 2
facebook/wav2vec2-base-cs-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 2 • 1
facebook/wav2vec2-base-da-voxpopuli-v2

Automatic Speech Recognition • Updated Feb 27, 2022 • 5

Fairseq S^2 TTS

Text-to-speech models from fairseq s^2

facebook/fastspeech2-en-ljspeech

Text-to-Speech • Updated Jan 28, 2022 • 56 • 272
facebook/fastspeech2-en-200_speaker-cv4

Text-to-Speech • Updated Jan 28, 2022 • 22 • 6
facebook/tts_transformer-ar-cv7

Text-to-Speech • Updated Jan 28, 2022 • 7 • 8
facebook/tts_transformer-vi-cv7

Text-to-Speech • Updated Jan 28, 2022 • 3 • 11

MusicGen Stereo

A collection of stereo music generation models as part of the v2 MusicGen release.

facebook/musicgen-stereo-small

Text-to-Audio • 0.6B • Updated Mar 6, 2024 • 1.87k • 45
facebook/musicgen-stereo-medium

Text-to-Audio • 2B • Updated Mar 6, 2024 • 1.82k • 35
facebook/musicgen-stereo-large

Text-to-Audio • 3B • Updated Mar 6, 2024 • 3.2k • 98
facebook/musicgen-stereo-melody-large

Text-to-Audio • 3B • Updated Apr 24, 2024 • 248 • 74

Chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

facebook/chameleon-7b

Image-Text-to-Text • 7B • Updated Jul 23, 2024 • 206k • 202
facebook/chameleon-30b

Image-Text-to-Text • 34B • Updated Jul 30, 2024 • 16 • 89

OPT

OPT (Open Pretrained Transformer) is a series of open-source large causal language models that perform similarly to GPT-3.

facebook/opt-125m

Text Generation • Updated Sep 15, 2023 • 17.8M • 286
facebook/opt-350m

Text Generation • Updated Sep 15, 2023 • 288k • 150
facebook/opt-1.3b

Text Generation • Updated Sep 15, 2023 • 120k • 185
facebook/opt-2.7b

Text Generation • Updated Sep 15, 2023 • 35.2k • 89


HuBERT

A collection of checkpoints from the HuBERT release, a speech encoder that learns powerful representations from unlabelled audio data.

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Paper • 2106.07447 • Published Jun 14, 2021 • 4
facebook/hubert-base-ls960

Feature Extraction • Updated Nov 5, 2021 • 892k • 74
facebook/hubert-large-ll60k

Feature Extraction • Updated Nov 5, 2021 • 74k • 36
facebook/hubert-large-ls960-ft

Automatic Speech Recognition • Updated May 24, 2022 • 164k • 77

Fairseq S^2 TTS

Text-to-speech models from fairseq s^2

facebook/fastspeech2-en-ljspeech

Text-to-Speech • Updated Jan 28, 2022 • 56 • 272
facebook/fastspeech2-en-200_speaker-cv4

Text-to-Speech • Updated Jan 28, 2022 • 22 • 6
facebook/tts_transformer-ar-cv7

Text-to-Speech • Updated Jan 28, 2022 • 7 • 8
facebook/tts_transformer-vi-cv7

Text-to-Speech • Updated Jan 28, 2022 • 3 • 11

DINOv2

DINOv2: foundation models producing robust visual features suitable for image-level and pixel-level visual tasks - https://arxiv.org/abs/2304.07193

facebook/dinov2-small

Image Feature Extraction • 22.1M • Updated Sep 6, 2023 • 3.95M • 70
facebook/dinov2-base

Image Feature Extraction • 86.6M • Updated Jan 17, 2024 • 2.46M • 188
facebook/dinov2-large

Image Feature Extraction • 0.3B • Updated Sep 6, 2023 • 906k • 115
facebook/dinov2-giant

Image Feature Extraction • 1B • Updated Sep 6, 2023 • 273k • 62

MusicGen Stereo

A collection of stereo music generation models as part of the v2 MusicGen release.

facebook/musicgen-stereo-small

Text-to-Audio • 0.6B • Updated Mar 6, 2024 • 1.87k • 45
facebook/musicgen-stereo-medium

Text-to-Audio • 2B • Updated Mar 6, 2024 • 1.82k • 35
facebook/musicgen-stereo-large

Text-to-Audio • 3B • Updated Mar 6, 2024 • 3.2k • 98
facebook/musicgen-stereo-melody-large

Text-to-Audio • 3B • Updated Apr 24, 2024 • 248 • 74

LLM Compiler

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning.

facebook/llm-compiler-7b

Text Generation • Updated Jun 27, 2024 • 2 • 142
facebook/llm-compiler-7b-ftd

Text Generation • Updated Jun 27, 2024 • 25 • 29
facebook/llm-compiler-13b

Text Generation • Updated Jun 27, 2024 • 194 • 90
facebook/llm-compiler-13b-ftd

Text Generation • Updated Jun 27, 2024 • 59

Chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

facebook/chameleon-7b

Image-Text-to-Text • 7B • Updated Jul 23, 2024 • 206k • 202
facebook/chameleon-30b

Image-Text-to-Text • 34B • Updated Jul 30, 2024 • 16 • 89

Sapiens

Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22, 2024 • 93
facebook/sapiens

Updated Sep 20, 2024 • 817 • 245
Sapiens Pose 📊

Detect and estimate poses in images

Paused • Agents • 59
Sapiens Segmentation 🌍

Segment body parts in images

Paused • Agents • 122

OPT

OPT (Open Pretrained Transformer) is a series of open-source large causal language models that perform similarly to GPT-3.

facebook/opt-125m

Text Generation • Updated Sep 15, 2023 • 17.8M • 286
facebook/opt-350m

Text Generation • Updated Sep 15, 2023 • 288k • 150
facebook/opt-1.3b

Text Generation • Updated Sep 15, 2023 • 120k • 185
facebook/opt-2.7b

Text Generation • Updated Sep 15, 2023 • 35.2k • 89

FAIR's LayerSkip Llama models

facebook/layerskip-llama2-7B

Text Generation • 7B • Updated Oct 19, 2024 • 549 • 17
facebook/layerskip-llama2-13B

Text Generation • 13B • Updated Oct 19, 2024 • 21 • 6
facebook/layerskip-codellama-7B

Text Generation • 7B • Updated Oct 19, 2024 • 31 • 6
facebook/layerskip-codellama-34B

Text Generation • 34B • Updated Oct 19, 2024 • 68 • 4

Team members 309
