Also, I do not think anyone will achieve AGI, since we don't know what AGI is. I think we will just see incremental performance increases, not an "unlock" that creates AGI.
alkinun PRO
AtAndDev
AI & ML interests
LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN..
Recent Activity
updated a dataset about 7 hours ago: MoLA-LLM/OpenHelix-R-86k-8-tasks
published a dataset about 7 hours ago: MoLA-LLM/OpenHelix-R-86k-8-tasks
updated a dataset about 8 hours ago: MoLA-LLM/OpenHelix-R-86k-16-tasks
In my view, it should be open: if I can achieve AGI, someday someone else will too. So there's no need to slow things down like the EU does. Just let things happen: accelerate and decentralize.

reacted to prithivMLmods's post with 👀👍 (9 days ago)
On the verge of releasing Poseidon-Reasoning-5M, a dataset built to excel in general thought processes, mathematics, and science across a diverse mixture of domains, I’m also dropping the Gargantua-R1-Compact dataset, a collection of over six million high-quality reasoning QA pair traces. 🤗🚀
✦ Gargantua-R1-Compact : prithivMLmods/Gargantua-R1-Compact
from datasets import load_dataset
dataset = load_dataset("prithivMLmods/Gargantua-R1-Compact", split="train")
Additionally, I’m adding the mini version of Gargantua, the Gargantua-R1-Wee : prithivMLmods/Gargantua-R1-Wee
from datasets import load_dataset
dataset = load_dataset("prithivMLmods/Gargantua-R1-Wee", split="train")
The composition spans:
- 73.93% core mathematical reasoning: problems, proofs, and computational challenges
- 12.11% diverse scientific domains: physics, chemistry, biology, and interdisciplinary topics
- 11.35% competitive coding: algorithms and data structures
- 1.37% academic science: research-level methodology
- 0.95% creative and analytical reasoning: logic puzzles and problem-solving tasks
- 0.25% specialized technical areas: MLOps, LLMs, diffusion models, and CUDA
- 0.06% data from graphs and charts converted into structured JSON formats
Designed with both rich contextual depth and formal structural clarity, Gargantua-R1-Compact is an optimal resource for advancing research in symbolic reasoning, interpretability, and high-precision question answering in mathematical domains.
✦ Collection : prithivMLmods/gargantua-r1-mod-6896bfd7834e82b89ad2b38b
To learn more, visit each dataset's card.
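As a quick sanity check, the stated category shares should sum to roughly 100% (small rounding drift aside); the figures below are copied from the composition description above:

```python
# Category shares (%) as stated in the Gargantua-R1-Compact description.
composition = {
    "core mathematical reasoning": 73.93,
    "diverse scientific domains": 12.11,
    "competitive coding": 11.35,
    "academic science": 1.37,
    "creative and analytical reasoning": 0.95,
    "specialized technical areas": 0.25,
    "graphs and charts as JSON": 0.06,
}

total = sum(composition.values())
print(f"total: {total:.2f}%")  # ~100%, up to per-category rounding
```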

reacted to fdaudens's post with 👍🚀 (9 days ago)
OpenAI’s GPT-OSS has sparked ~400 new models on Hugging Face and racked up 5M downloads in less than a week, already outpacing DeepSeek R1’s first-week numbers.
For comparison: when R1 launched, I tracked 550 derivatives (across 8 base models) in a week, with ~3M downloads. GPT-OSS is ahead on adoption and engagement.
It’s also the most-liked release of any major LLM this summer. The 20B and 120B versions quickly shot past Kimi K2, GLM 4.5, and others in likes.
Most-downloaded GPT-OSS models include LM Studio and Unsloth AI versions:
1️⃣ openai/gpt-oss-20b - 2.0M
2️⃣ lmstudio-community/gpt-oss-20b-MLX-8bit - 750K
3️⃣ openai/gpt-oss-120b - 430K
4️⃣ unsloth/gpt-oss-20b-GGUF - 380K
5️⃣ lmstudio-community/gpt-oss-20b-GGUF - 330K
The 20B version is clearly finding its audience, showing the power of smaller, faster, more memory- and energy-efficient models. (These numbers don’t include calls to the models via inference providers, so the real usage is likely even bigger, especially for the 120B version)
Open-weight models let anyone build on top. Empower the builders, and innovation takes off. 🚀

reacted to ovi054's post with 🔥 (9 days ago)
WAN 2.2 Text to Image ⚡
ovi054/wan2-2-text-to-image
We all know that WAN 2.2 A14B is a video model. But it turns out this video model can also produce great image results with incredible prompt adherence! The image output is sharp, detailed, and sticks to the prompt better than most.
👉 Try it now: ovi054/wan2-2-text-to-image
sad

reacted to sweatSmile's post with ❤️🚀 (12 days ago)
Teaching a 7B Model to Be Just the Right Amount of Snark
Ever wondered if a language model could get sarcasm? I fine-tuned Mistral-7B using LoRA and 4-bit quantisation—on just ~720 hand-picked sarcastic prompt–response pairs from Reddit, Twitter, and real-life conversations.
The challenge? Keeping it sarcastic but still helpful.
- LoRA rank 16 to avoid overfitting
- 4-bit NF4 quantization to fit in limited GPU memory
- 10 carefully monitored epochs so it didn’t turn into a full-time comedian
Result: a model that understands “Oh great, another meeting” exactly as you mean it.
Read the full journey, tech details, and lessons learned on my blog:
Fine-Tuning Mistral-7B for Sarcasm with LoRA and 4-Bit Quantisation
Try the model here on Hugging Face: sweatSmile/Mistral-7B-Instruct-v0.1-Sarcasm.
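The post's recipe (rank-16 LoRA on a 4-bit NF4-quantised base) can be illustrated with a minimal NumPy sketch of the low-rank idea. This is not the actual PEFT/bitsandbytes training code, and the sizes are made up for illustration: instead of updating the full weight matrix W, LoRA learns two small factors A and B whose product is a rank-16 update added to W.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 256, 16, 32            # hidden size, LoRA rank, scaling (illustrative values)

W = rng.standard_normal((d, d))      # frozen base weight
# In real LoRA, B starts at zero and A small; both are random here to show the rank.
A = rng.standard_normal((r, d))      # trainable down-projection
B = rng.standard_normal((d, r))      # trainable up-projection

delta = (alpha / r) * (B @ A)        # the learned update, rank <= r
W_adapted = W + delta

# The update can never exceed rank r, which is why LoRA is cheap to train and store:
# 2*d*r trainable values instead of d*d.
print(np.linalg.matrix_rank(delta))  # 16: the update lives in a 16-dim subspace
```

The 4-bit NF4 part of the recipe only changes how the frozen W is stored, not this low-rank structure.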
Interesting! Did you have a chance to try K2, GLM 4.5, or Sonnet 4?

reacted to dhruv3006's post with 👀 (12 days ago)
GPT 5 for Computer Use agents.
Same tasks, same grounding model; we just swapped GPT-4o for GPT-5 as the thinking model.
Left = 4o, right = 5.
Watch GPT 5 pull away.
Reasoning model: OpenAI GPT-5
Grounding model: Salesforce GTA1-7B
Action space: CUA Cloud Instances (macOS/Linux/Windows)
The task is: "Navigate to {random_url} and play the game until you reach a score of 5/5." Each task is set up by having Claude generate a random app from a predefined list of prompts (multiple-choice trivia, form filling, or color matching).
Try it yourself here : https://github.com/trycua/cua
Docs : https://docs.trycua.com/docs/agent-sdk/supported-agents/composed-agents
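The composed-agent flow described above (a reasoning model plans the next step, a grounding model maps it to screen coordinates) can be sketched with hypothetical stubs. This is only an illustration of the control loop; the function names and interfaces are invented, not the cua SDK API:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str      # e.g. "click", "type"
    target: str    # element description produced by the reasoning model

def reasoning_model(observation: str) -> Action:
    """Stub for the planner (GPT-5 in the post): decides the next step."""
    return Action(kind="click", target=f"answer button for {observation}")

def grounding_model(action: Action) -> tuple[int, int]:
    """Stub for the grounder (GTA1-7B in the post): maps a target to pixels."""
    return (hash(action.target) % 1920, hash(action.target) % 1080)

def run_episode(goal_score: int = 5) -> int:
    """Plan -> ground -> act until the task's 5/5 score is reached."""
    score = 0
    while score < goal_score:
        action = reasoning_model(f"question {score + 1}")
        x, y = grounding_model(action)   # coordinates the executor would click
        score += 1                       # stand-in for observing the environment
    return score

print(run_episode())  # 5
```

The real stack replaces the stubs with model calls and an OS-level executor, but the division of labor is the same: the thinking model never sees pixels, and the grounding model never plans.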

reacted to sweatSmile's post with ❤️ (13 days ago)
Qwen3 is the latest version of the Qwen language models. It's smarter, faster, and now understands 119 languages instead of just 29.
It can do both deep reasoning and quick answers using a single model, depending on what you need.
The models range in size from small (0.6B) to huge (235B), with smart ways to save compute.
It's trained on 36 trillion tokens and fine-tuned in four steps to boost performance.
Qwen3 performs as well as or better than many top models, including some from big companies.
It’s fully open source. Amazing!
https://github.com/QwenLM/Qwen3/blob/main/Qwen3_Technical_Report.pdf

reacted to ImranzamanML's post with ❤️ (14 days ago)
Finally, OpenAI is sharing open-source models again, for the first time since GPT-2 in 2019.
gpt-oss-120b
gpt-oss-20b
openai/gpt-oss-120b
#AI #GPT #LLM #Openai

reacted to georgewritescode's post with 🚀 (14 days ago)
Announcing Artificial Analysis Long Context Reasoning (AA-LCR), a new benchmark that evaluates long-context performance by testing reasoning across multiple long documents (~100k tokens)
The focus of AA-LCR is to replicate real knowledge work and reasoning tasks, testing capability critical to modern AI applications spanning document analysis, codebase understanding, and complex multi-step workflows.
AA-LCR is 100 hard text-based questions that require reasoning across multiple real-world documents that represent ~100k input tokens. Questions are designed so answers cannot be directly found but must be reasoned from multiple information sources, with human testing verifying that each question requires genuine inference rather than retrieval.
Key takeaways:
➤ Today’s leading models achieve ~70% accuracy: the top three places go to OpenAI o3 (69%), xAI Grok 4 (68%) and Qwen3 235B 2507 Thinking (67%)
➤👀 We also already have gpt-oss results! 120B performs close to o4-mini (high), in line with OpenAI's claims regarding model performance. We will be following up shortly with an Intelligence Index for the models.
➤ 100 hard text-based questions spanning 7 categories of documents (Company Reports, Industry Reports, Government Consultations, Academia, Legal, Marketing Materials and Survey Reports)
➤ ~100k tokens of input per question, requiring models to support a minimum 128K context window to score on this benchmark
➤ ~3M total unique input tokens spanning ~230 documents to run the benchmark (output tokens typically vary by model)
We’re adding AA-LCR to the Artificial Analysis Intelligence Index, and taking the version number to v2.2. Artificial Analysis Intelligence Index v2.2 now includes: MMLU-Pro, GPQA Diamond, AIME 2025, IFBench, LiveCodeBench, SciCode and AA-LCR.
Link to dataset: ArtificialAnalysis/AA-LCR
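The token-budget figures above fit together arithmetically; a small sketch (all numbers taken from the post, per-question figures approximate — the ~3M unique total versus a naive 100 × 100k implies documents are shared across questions):

```python
questions = 100
tokens_per_question = 100_000        # ~100k input tokens per question
min_context_window = 128 * 1024      # minimum context window needed to score

# Each question must fit in context, with headroom for the prompt and the answer.
headroom = min_context_window - tokens_per_question
assert headroom > 0

# Naively, 100 questions x ~100k tokens would be ~10M input tokens per run,
# but documents are reused across questions, so unique input is only ~3M.
naive_total = questions * tokens_per_question
unique_total = 3_000_000

print(naive_total, unique_total, headroom)
```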

reacted to fdaudens's post with ❤️ (15 days ago)
Well, it took just 2 hours for openai/gpt-oss-120b to hit #1 on Hugging Face. Don’t remember seeing anything rise that fast!

replied to danielhanchen's post (15 days ago)
I wish OAI delivered as much as you do.