Alvaro Bartolome
AI & ML interests
machine learning @huggingface (inference + cloud)
Recent Activity
Posted an update 2 days ago
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥
> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API
https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr
New activity 2 days ago
huggingface/documentation-images: Add images for VibeVoice ASR example
Liked a model 3 days ago
microsoft/VibeVoice-ASR-HF
Critique Models (CM) on the 🤗 Hub
This collection contains some Critique Models (CM) for LLM evaluation available on the Hugging Face Hub
AIF Datasets (with distilabel)
Small to medium-sized datasets that are synthetically generated, labelled with AI Feedback (AIF), or both
NER in Spanish
Fine-tuned models for NER in Spanish, built with the SpanMarker framework using different encoders and datasets
From zero to GPT-hero
Reading list for fully understanding GPT (and GPT-2) and implementing them from scratch
- Neural Machine Translation of Rare Words with Subword Units
  Paper • 1508.07909 • Published • 4
- Attention Is All You Need
  Paper • 1706.03762 • Published • 115
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Generating Wikipedia by Summarizing Long Sequences
  Paper • 1801.10198 • Published • 3
Studio Ghibli Diffusion
Text-to-image fine-tunes in Studio Ghibli style
- FLUX.1 Studio Ghibli LoRA
  Running on Zero • 22
  Generate Studio Ghibli-style images from text prompts
- alvarobartt/ghibli-characters
  Viewer • Updated • 9 • 53 • 9
- black-forest-labs/FLUX.1-dev
  Text-to-Image • Updated • 729k • 12.4k
- alvarobartt/ghibli-characters-flux-lora
  Text-to-Image • Updated • 96 • 64
About ORPO
Information and experiments on fine-tuning LLMs with 🤗 `trl.ORPOTrainer`
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 72
- HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
  Text Generation • 141B • Updated • 202 • 269
- alvarobartt/mistral-orpo-mix
  Text Generation • 7B • Updated • 4 • 1
- alvarobartt/Mistral-7B-v0.1-ORPO
  Text Generation • 7B • Updated • 14 • 14
Apple MLX-compatible 7B LLMs on the 🤗 Hub
This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx
🇪🇸 Datasets in Spanish for LLM Evaluation
This collection contains some datasets for LLM evaluation in Spanish, from nlp.uoregon.edu, translated using ChatGPT (including English counterparts)
🔒 Models for PII Protection
A collection with text-classification and token-classification models for PII Protection