107.6 TFLOPS
MiJa
snapo
4 followers · 29 following
AI & ML interests
None yet
Recent Activity
New activity, 1 day ago, in deepseek-ai/DeepSeek-V3.1: "This model’s censorship is insane"
Liked a model, 1 day ago: deepseek-ai/DeepSeek-V3.1
Reacted to sweatSmile's post with 🚀, 13 days ago:
Teaching a 7B Model to Be Just the Right Amount of Snark

Ever wondered if a language model could get sarcasm? I fine-tuned Mistral-7B using LoRA and 4-bit quantisation on just ~720 hand-picked sarcastic prompt–response pairs from Reddit, Twitter, and real-life conversations. The challenge? Keeping it sarcastic but still helpful.

LoRA rank 16 to avoid overfitting
4-bit NF4 quantization to fit on limited GPU memory
10 carefully monitored epochs so it didn’t turn into a full-time comedian

Result: a model that understands “Oh great, another meeting” exactly as you mean it.

Read the full journey, tech details, and lessons learned on my blog: Fine-Tuning Mistral-7B for Sarcasm with LoRA and 4-Bit Quantisation
Try the model here on Hugging Face: sweatSmile/Mistral-7B-Instruct-v0.1-Sarcasm.
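To give a rough sense of why the post pairs a low LoRA rank with 4-bit weights, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration, not the post author's code. The 4096 x 4096 projection size and the ~7.2 billion parameter count are assumptions about Mistral-7B, the post does not say which layers were adapted, and the memory estimate ignores quantization constants, activations, and optimizer state.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumed shapes and sizes (hypothetical, for illustration only).
HIDDEN = 4096          # assumed width of a square attention projection
N_PARAMS = 7.2e9       # assumed total parameter count
RANK = 16              # LoRA rank quoted in the post

adapter = lora_trainable_params(HIDDEN, HIDDEN, RANK)
full = HIDDEN * HIDDEN

print(f"LoRA params per adapted matrix: {adapter}")
print(f"Fraction of the full matrix:    {adapter / full:.4%}")
print(f"Weights at 16-bit: {weight_memory_gib(N_PARAMS, 16):.1f} GiB")
print(f"Weights at 4-bit:  {weight_memory_gib(N_PARAMS, 4):.1f} GiB")
```

Under these assumptions each adapted matrix trains under 1% of the parameters it would need in full fine-tuning, and the frozen base weights take roughly a quarter of the 16-bit footprint.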
Organizations
None yet
snapo's activity
Liked a model, 1 day ago
deepseek-ai/DeepSeek-V3.1 • Text Generation • 685B • Updated about 13 hours ago • 7.8k • 434
Liked 3 models, about 1 month ago
Kwaipilot/KAT-V1-40B • Text Generation • 41B • Updated Jul 21 • 1.45k • 105
Qwen/Qwen3-235B-A22B-Instruct-2507 • Text Generation • 235B • Updated 6 days ago • 85k • 649
HuggingFaceTB/SmolLM3-3B • Text Generation • 3B • Updated 8 days ago • 491k • 665
Liked a model, 4 months ago
Qwen/Qwen3-235B-A22B • Text Generation • 235B • Updated 28 days ago • 142k • 1.03k
Liked a model, 6 months ago
BlinkDL/rwkv7-g1 • Text Generation • Updated about 6 hours ago • 108
Liked 2 datasets, 7 months ago
ServiceNow-AI/R1-Distill-SFT • Viewer • Updated Feb 8 • 1.85M • 7.45k • 305
QuixiAI/dolphin-r1 • Viewer • Updated Jan 30 • 814k • 648 • 285
Liked 2 models, 7 months ago
deepseek-ai/DeepSeek-R1 • Text Generation • 685B • Updated Mar 27 • 747k • 12.6k
deepseek-ai/DeepSeek-R1-Zero • Text Generation • 685B • Updated Mar 27 • 1.74k • 935
Liked a model, 9 months ago
Qwen/Qwen2.5-Coder-32B • Text Generation • 33B • Updated Nov 18, 2024 • 4.55k • 134