hanzlajavaid

hanzla

AI & ML interests

Direct Preference Optimization, Supervised Finetuning, Stable Diffusion

Recent Activity

liked a model about 6 hours ago
nvidia/Llama-3.1-8B-UltraLong-4M-Instruct
published a Space 12 days ago
hanzla/VR_Experience_Report
updated a Space 12 days ago
hanzla/VR_Experience_Report

Organizations

ZeroGPU Explorers, Journalists on Hugging Face, MLX Community, ModularityAI, Social Post Explorers

hanzla's activity

posted an update 23 days ago
Hi community,

A few days ago, I posted about my ongoing research on building reasoning Mamba models, and I received great insights from the community.

Today, I am announcing an update to the model weights. With the newer checkpoints, the Falcon3 Mamba R1 model now outperforms very large transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU. It scores 60% on Formal Logic, which is considered a tough subset of MMLU.

I would greatly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner
replied to Jaward's post 24 days ago
reacted to Jaward's post with 🔥 24 days ago
replied to their post 28 days ago
reacted to clem's post with 🤗 28 days ago
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
reacted to reddgr's post with 👍 29 days ago
reacted to AtAndDev's post with 🔥 29 days ago
There seem to be multiple paid apps shared here that are based on models on hf, but some ppl sell their wrappers as "products" and promote them here. For a long time, hf was the best and only platform to do oss model stuff, but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to oss stuff. Please don't do this, or try finetuning the models you use...
Sorry for filling y'all's feed with this bs but yk...
reacted to fdaudens's post with 👍 29 days ago
Want to build useful newsroom tools with AI? We’re launching a Hugging Face x Journalism Slack channel where journalists turn AI concepts into real newsroom solutions.

Inside the community:
✅ Build open-source AI tools for journalism
✅ Get direct help from the community
✅ Stay updated on new models and datasets
✅ Learn from other journalists’ experiments and builds

The goal? Go from “I read about AI” to “I built an AI tool that supercharged my newsroom.” No more learning in isolation.

Join us! https://join.slack.com/t/journalistson-tnd8294/shared_invite/zt-30vsmhk4w-dZpeMOoxdhCvfNsqtspPUQ (Please make sure to use a clear identity—no teddybear85, for example 😉)

(If you know people who might be interested, tag them below! The more minds we bring in, the better the tools we build.)

reacted to mrfakename's post with 🚀 29 days ago
reacted to their post with 👍 about 1 month ago
Hello community,

I want to share my work on creating a reasoning Mamba model.

I used GRPO on top of Falcon3 Mamba Instruct to make this model. It generates blazing-fast responses while building sound logic to answer challenging questions.

Give it a try:

Model repo: hanzla/Falcon3-Mamba-R1-v0

Space: hanzla/Falcon3MambaReasoner

Looking forward to community feedback.
posted an update about 1 month ago
reacted to AtAndDev's post with 🔥 about 1 month ago
Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
posted an update about 1 month ago
Gemma 3 is a game changer for on-device multimodal applications.

See for yourself how good a 4-billion-parameter model can be.

hanzla/PlaygroundGemma3
posted an update 11 months ago
reacted to mrm8488's post with 🚀 11 months ago
Working on a concept GPT-2 (small) that uses KANs instead of MLPs.
The ckpt and training code will be on the hub soon.