musika (Musika)

AtAndDev

posted an update 9 days ago

Post

276

Qwen 3 Coder is a personal attack on K2, and I love it.
It achieves near-SOTA on LCB without reasoning.
Finally people are understanding that reasoning isn't necessary for high benchmark scores...

Qwen ftw!

DECENTRALIZE DECENTRALIZE DECENTRALIZE

Nymbo

posted an update about 1 month ago

Post

2238

Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?

3 replies

AtAndDev

posted an update 2 months ago

Post

2895

deepseek-ai/DeepSeek-R1-0528

This is the end

1 reply

1024m

authored a paper 2 months ago

Uncovering Cultural Representation Disparities in Vision-Language Models

Paper • 2505.14729 • Published May 20 • 1

Nymbo

posted an update 3 months ago

Post

3740

Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

1 reply

Nymbo

posted an update 3 months ago

Post

2747

PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies ever since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.

1024m

authored 3 papers 4 months ago

Robust and Fine-Grained Detection of AI Generated Texts

Paper • 2504.11952 • Published Apr 16 • 12

Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance

Paper • 2504.09753 • Published Apr 13 • 5

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published Apr 9 • 9

AtAndDev

posted an update 4 months ago

Post

3120

Llama 4 is out...

3 replies

AtAndDev

posted an update 5 months ago

Post

4347

There seem to be multiple paid apps shared here that are based on models on HF, but some ppl sell their wrappers as "products" and promote them here. For a long time, HF was the best and only platform to do OSS model stuff, but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to OSS stuff. Please don't do this, or at least try finetuning the models you use...
Sorry for filling y'all's feed with this bs but yk...

6 replies

AtAndDev

posted an update 5 months ago

Post

1660

Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.

AtAndDev

posted an update 6 months ago

Post

2494

@nroggendorff is that you sama?

2 replies

AtAndDev

posted an update 6 months ago

Post

1942

everywhere i go i see his face

AtAndDev

posted an update 6 months ago

Post

577

DeepSeek gang on fire fr fr

AtAndDev

posted an update 6 months ago

Post

1655

R1 is out! And with a lot of other R1-related models...

AtAndDev

posted an update 8 months ago

Post

495

@s3nh Hey man check your discord! Got some news.

4 replies

1024m

authored 3 papers 9 months ago

RKadiyala at SemEval-2024 Task 8: Black-Box Word-Level Text Boundary Detection in Partially Machine Generated Texts

Paper • 2410.16659 • Published Oct 22, 2024

Large Language Models for Cross-lingual Emotion Detection

Paper • 2410.15974 • Published Oct 21, 2024 • 1

1024m at SMM4H 2024: Tasks 3, 5 & 6 -- Ensembles of Transformers and Large Language Models for Medical Text Classification

Paper • 2410.15998 • Published Oct 21, 2024 • 1

AI & ML interests

Team members 85

musika's activity