arbml (Arabic Machine Learning)

Nymbo

posted an update 4 days ago

Post

455

Anyone using Jan-v1-4B for local MCP-based web search, I highly recommend you try out Intelligent-Internet/II-Search-4B

Very impressed with this lil guy and it deserves more downloads. It's based on the original version of Qwen3-4B, but I find that it questions reality way less often. Jan-v1 seems to think that everything it sees is synthetic data and constantly gaslights me

alielfilali01

posted an update 15 days ago

Post

455

Guys WTH is "yofo-*" ???
Most OpenAI staff associated with the openai/gpt-oss-68911959590a1634ba11c7a4 release are affiliated with dozens of yofo orgs ...

e.g.

yofo-wildflower

Some HF folks as well 👀

SaiedAlshahrani

updated a dataset about 2 months ago

arbml/CIDAR

Viewer • Updated Jul 1 • 10k • 148 • 51

SaiedAlshahrani

in arbml/CIDAR about 2 months ago

Update README.md

1

#3 opened about 2 months ago by

SaiedAlshahrani

Zaid

in arbml/CIDAR about 2 months ago

Update README.md

1

#3 opened about 2 months ago by

SaiedAlshahrani

Nymbo

posted an update about 2 months ago

Post

2726

Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?

4 replies

alielfilali01

authored a paper 3 months ago

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi

Paper • 2504.06011 • Published Apr 8 • 2

Nymbo

posted an update 3 months ago

Post

4089

Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

1 reply

Nymbo

posted an update 4 months ago

Post

2764

PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some long-standing inconsistencies that have persisted since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.

alielfilali01

posted an update 4 months ago

Post

824

Great efforts from @AtlasIA folks to adapt text2image models (Ghibli style) for the Moroccan context

Read the blog here: https://huggingface.co/blog/atlasia/creating-your-custom-ghibli-text-to-image-model

nouamanetazi

authored a paper 4 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 197

Zaid

updated a dataset 5 months ago

arbml/CIDAR

Viewer • Updated Jul 1 • 10k • 148 • 51

AhmadMustafa

authored a paper 5 months ago

On the Limitations of Vision-Language Models in Understanding Image Transforms

Paper • 2503.09837 • Published Mar 12 • 10

alielfilali01

posted an update 6 months ago

Post

1075

🚨 Arabic LLM Evaluation 🚨

A few models joined the ranking of https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard today.

The new MistralAI model, Saba, is quite impressive, top 10! Well done @arthurmensch and team.

Sadly, Mistral did not follow its strategy of releasing public weights this time. We hope this changes soon and we get the model with a permissive license.

We added other Mistral models and apparently, we have been sleeping on mistralai/Mistral-Large-Instruct-2411!

Another impressive model that joined the ranking today is ALLaM-AI/ALLaM-7B-Instruct-preview. After a long wait, ALLaM is finally here and it is IMPRESSIVE given its size!

ALLaM is ranked on OALL/Open-Arabic-LLM-Leaderboard as well.

alielfilali01

posted an update 8 months ago

Post

2153

The 3C3H AraGen Leaderboard today welcomes deepseek-ai/DeepSeek-V3 and 12 other models (including the late gpt-3.5 💀) to the ranking of the best LLMs in Arabic!

Observations:
- DeepSeek-V3 ranked 3rd and is the only open model among the top 5!

- A 14B open model (Qwen/Qwen2.5-14B-Instruct) outperforms gpt-3.5-turbo-0125 (from last year). This shows how far we have come in advancing and supporting the Arabic presence within the LLM ecosystem!

- Contrary to what is observed in likelihood-acc leaderboards (like OALL/Open-Arabic-LLM-Leaderboard), further finetuned models like maldv/Qwentile2.5-32B-Instruct actually performed worse than the original model, Qwen/Qwen2.5-32B-Instruct.
It's worth noting that the decrease is statistically insignificant, which implies that, at best, the out-of-domain finetuning does not really hurt the model's original capabilities acquired during pretraining.
Previous work addressed this (finetuning VS pretraining) but more investigation in this regard is required (any PhDs here ? This could be your question ...)

Check out the latest rankings: https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard

AhmadMustafa

authored a paper 8 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 10

alielfilali01

posted an update 8 months ago

Post

2064

~75% on the challenging GPQA with only 40M parameters 🔥🥳

GREAT ACHIEVEMENT ! Or is it ?

This new work, "Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation", takes the mystery out of many models whose results I personally found suspicious, especially on leaderboards other than the English one, like the Open Arabic LLM Leaderboard OALL/Open-Arabic-LLM-Leaderboard.

The authors of this work started by training a model on the GPQA data, which, unsurprisingly, led to the model achieving 100% performance.

Afterward, they trained what they referred to as a 'legitimate' model on legitimate data (MedMCQA). However, they introduced a distillation loss from the earlier, 'cheated' model.

What they discovered was fascinating: the knowledge of GPQA leaked through this distillation loss, even though the legitimate model was never explicitly trained on GPQA during this stage.
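For readers who have not seen a distillation loss written out, here is a minimal sketch of the general idea. It is not the paper's code: the function names, the `alpha` weighting, the temperature value, and the `temperature ** 2` scaling are assumptions borrowed from the common textbook formulation, and the paper's exact objective may differ. The point it illustrates is that the student's loss contains a term that depends only on the teacher's outputs, so whatever the teacher memorized can flow into the student even when the hard labels come from a different dataset.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Hard-label loss on the 'legitimate' training example."""
    return -math.log(softmax(student_logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Weighted sum of the hard-label loss and a teacher-matching term.

    alpha and temperature are illustrative hyperparameters (assumptions).
    """
    hard = cross_entropy(student_logits, label)
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return (1 - alpha) * hard + alpha * (temperature ** 2) * soft


# Hypothetical logits for one 4-option multiple-choice question.
student = [2.0, 0.5, -1.0, 0.0]
teacher = [-1.0, 3.0, 0.0, 0.5]   # stands in for the 'cheated' model
print(distillation_loss(student, teacher, label=0))
```

With `alpha=0.0` the teacher term disappears and only the legitimate labels drive training. Any `alpha` above zero pulls the student toward the teacher's output distribution, which is the channel the post describes as leaking benchmark knowledge.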

This raises important questions about the careful use of distillation in model training, especially when the training data is opaque. As they demonstrated, it’s apparently possible to (intentionally or unintentionally) leak test data through this method.

Find out more: Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation (2412.15255)

1 reply

Zaid

updated a dataset 8 months ago

arbml/masader

Updated Dec 21, 2024 • 149 • 10

alielfilali01

posted an update 8 months ago

Post

3556

Unpopular opinion: Open Source takes courage to do!

Not everyone is brave enough to release what they have done (the way they've done it) to the wild to be judged!
It really requires a high level of "knowing wth you are doing"! It's kind of a superpower!

Cheers to the heroes here who see this!

5 replies

alielfilali01

posted an update 8 months ago

Post

1604

Apparently I forgot to put this here!

Well, this is a bit late, but consider giving our recent blog a read if you are interested in evaluation.

You don't have to be into Arabic NLP in order to read it. The main contribution we are introducing is a new evaluation measure for NLG. We made the first application of this measure on Arabic for now, and we will be working with colleagues from the community to expand it to other languages.

Blog:
Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard
https://huggingface.co/blog/leaderboard-3c3h-aragen

Space:
https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard

Give it a read and let me know your thoughts 🤗

Team members 298
