Recent Activity

renll updated a model about 7 hours ago
microsoft/Phi-4-mini-flash-reasoning
tnaumann updated a collection about 7 hours ago
Phi-4
renll published a model about 11 hours ago
microsoft/Phi-4-mini-flash-reasoning

microsoft's collections (21)

Phi-3
Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths.
Controllable Safety Alignment
Artifacts for the paper "Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements" (https://arxiv.org/abs/2410.08968)
SpeechT5
The SpeechT5 framework consists of a shared seq2seq encoder-decoder network and six modality-specific (speech/text) pre/post-nets that can address a few audio-related tasks.
Table Transformer
The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images.
TAPEX
TAPEX is a state-of-the-art table pre-training model that can be used for table-based question answering and table-based fact verification.
GIT
GIT (Generative Image-to-text Transformer) is a model useful for vision-language tasks such as image/video captioning and question answering.