
hanzlajavaid (hanzla)

AI & ML interests

Direct Preference Optimization, Supervised Fine-tuning, Stable Diffusion

Recent Activity

liked a model about 6 hours ago
nvidia/Llama-3.1-8B-UltraLong-4M-Instruct
published a Space 12 days ago
hanzla/VR_Experience_Report
updated a Space 12 days ago
hanzla/VR_Experience_Report

Organizations

ZeroGPU Explorers, Journalists on Hugging Face, MLX Community, ModularityAI, Social Post Explorers

hanzla's activity

New activity in hanzla/Falcon3-Mamba-R1-v0 18 days ago

Ollama support (#1, opened 19 days ago by ayan4m1)
posted an update 23 days ago
Hi community,

A few days back, I posted about my ongoing research on building reasoning Mamba models and received great insights from the community.

Today, I am announcing an update to the model weights. With the newer checkpoints, the Falcon3 Mamba R1 model now outperforms much larger transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU. It scores 60% on Formal Logic, which is considered one of the tougher subsets of MMLU.

I would highly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner
replied to Jaward's post 24 days ago
reacted to Jaward's post with 🔥 24 days ago