
Mert Erbak PRO

merterbak

AI & ML interests

Currently NLP and Image Processing

Recent Activity

liked a Space about 11 hours ago
gradio/theme-gallery

Organizations

MLX Community · Social Post Explorers · Hugging Face Discord Community · AI Starter Pack

merterbak's activity

reacted to mmhamdy's post with 🔥 about 9 hours ago
🎉 We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.

💡 But what makes MemoryCode unique?! The combination of the following:

✅ Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.

✅ Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.

✅ Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.

✅ Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.

✅ Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.

📌 Our Findings

1️⃣ While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.

2️⃣ This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.

🔗 Paper: From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions (2502.13791)
📦 Code: https://github.com/for-ai/MemoryCode
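The "instruction updates" setting above is the crux: only the latest version of each rule should be applied, regardless of intervening chatter. A minimal sketch of that bookkeeping (my own toy illustration, not the MemoryCode codebase):

```python
# Toy illustration (not the MemoryCode code): track coding instructions
# across sessions, where later updates override earlier ones and
# irrelevant chatter must be ignored.

def current_rules(sessions):
    """Replay sessions in order; the latest update to each rule wins."""
    rules = {}
    for session in sessions:
        for kind, topic, text in session:
            if kind == "instruction":  # chatter entries are skipped
                rules[topic] = text
    return rules

sessions = [
    [("instruction", "naming", "use camelCase"),
     ("chatter", None, "how was your weekend?")],
    [("instruction", "naming", "use snake_case")],  # update overrides
]

print(current_rules(sessions)["naming"])  # snake_case
```

The benchmark's finding is that models fail at exactly this kind of "latest rule wins" tracking once the sessions pile up.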
reacted to lysandre's post with ❤️ about 10 hours ago
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
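Installing from one of these model tags follows the standard pip-from-git pattern (tag names taken from the post above; a sketch, not official install docs):

```shell
# Install transformers pinned to the SmolVLM-2 release tag
pip install "git+https://github.com/huggingface/transformers@v4.49.0-SmolVLM-2"

# Or pinned to the SigLIP-2 tag
pip install "git+https://github.com/huggingface/transformers@v4.49.0-SigLIP-2"
```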
reacted to onekq's post with 👀 about 23 hours ago
Still waiting for 👽Grok👽 3 API ⌛😞😫
reacted to their post with 🚀 1 day ago
🔥 Meet Muse: a generative AI model that can create game environments from visuals or players' controller actions. It was developed by Microsoft Research in collaboration with Ninja Theory (the Hellblade developer). It's built on the World and Human Action Model (WHAM-1.6B). They trained it on 7 years of Bleeding Edge gameplay, and it can generate 2-minute-long 3D game sequences with consistent physics and character behaviors from just a second of input. They've open-sourced it too: open weights, the WHAM Demonstrator, and sample data on Azure AI Foundry for anyone to play with. Hopefully it lands on Hugging Face soon 🤗.

📄 Paper: https://www.nature.com/articles/s41586-025-08600-3
Blog Post: https://www.microsoft.com/en-us/research/blog/introducing-muse-our-first-generative-ai-model-designed-for-gameplay-ideation/

replied to their post 2 days ago
reacted to merve's post with 🚀 2 days ago
Google just released PaliGemma 2 Mix: new versatile instruction vision language models 🔥

> Three new models: 3B, 10B, 28B with res 224, 448 💙
> Can do vision language tasks with open-ended prompts, understand documents, and segment or detect anything 🤯

Read more https://huggingface.co/blog/paligemma2mix
Try the demo google/paligemma2-10b-mix
All models are here google/paligemma-2-mix-67ac6a251aaf3ee73679dcc4
reacted to burtenshaw's post with 🚀 3 days ago
AGENTS + FINETUNING! This week Hugging Face learn has a whole pathway on finetuning for agentic applications. You can follow these two courses to get knowledge on levelling up your agent game beyond prompts:

1๏ธโƒฃ New Supervised Fine-tuning unit in the NLP Course https://huggingface.co/learn/nlp-course/en/chapter11/1
2๏ธโƒฃNew Finetuning for agents bonus module in the Agents Course https://huggingface.co/learn/agents-course/bonus-unit1/introduction

Fine-tuning will squeeze everything out of your model for how youโ€™re using it, more than any prompt.
posted an update 3 days ago
reacted to fdaudens's post with ❤️ 3 days ago
🎯 Perplexity drops their FIRST open-weight model on Hugging Face: a decensored DeepSeek-R1 with full reasoning capabilities. Tested on 1000+ examples for unbiased responses.

Check it out: perplexity-ai/r1-1776
Blog post: https://perplexity.ai/hub/blog/open-sourcing-r1-1776
reacted to clem's post with ❤️ 4 days ago
We crossed 1B+ tokens routed to the inference provider partners on HF that we announced just a few days ago.

Just getting started of course, but early users seem to like it, and we're always happy to partner with cool startups in the ecosystem.

Have you been using any integration and how can we make it better?

https://huggingface.co/blog/inference-providers
reacted to jasoncorkill's post with ❤️ 10 days ago
Runway Gen-3 Alpha: The Style and Coherence Champion

Runway's latest video generation model, Gen-3 Alpha, is something special. It ranks #3 overall on our text-to-video human preference benchmark, but in terms of style and coherence, it outperforms even OpenAI Sora.

However, it struggles with alignment, making it less predictable for controlled outputs.

We've released a new dataset with human evaluations of Runway Gen-3 Alpha: Rapidata's text-2-video human preferences dataset. If you're working on video generation and want to see how your model compares to the biggest players, we can benchmark it for you.

🚀 DM us if you're interested!

Dataset: Rapidata/text-2-video-human-preferences-runway-alpha
reacted to ginipick's post with 🔥 13 days ago
🌟 3D Llama Studio - AI 3D Generation Platform

📝 Project Overview
3D Llama Studio is an all-in-one AI platform that generates high-quality 3D models and stylized images from text or image inputs.

✨ Key Features

Text/Image to 3D Conversion 🎯

Generate 3D models from detailed text descriptions or reference images
Intuitive user interface

Text to Styled Image Generation 🎨

Customizable image generation settings
Adjustable resolution, generation steps, and guidance scale
Supports both English and Korean prompts

🛠️ Technical Features

Gradio-based web interface
Dark theme UI/UX
Real-time image generation and 3D modeling

💫 Highlights

User-friendly interface
Real-time preview
Random seed generation
High-resolution output support (up to 2048x2048)

🎯 Applications

Product design
Game asset creation
Architectural visualization
Educational 3D content

🔗 Try It Now!
Experience 3D Llama Studio:

ginigen/3D-LLAMA

#AI #3DGeneration #MachineLearning #ComputerVision #DeepLearning
reacted to KnutJaegersberg's post with 👀 13 days ago
A Brief Survey of Associations Between Meta-Learning and General AI

The paper titled "A Brief Survey of Associations Between Meta-Learning and General AI" explores how meta-learning techniques can contribute to the development of Artificial General Intelligence (AGI). Here are the key points summarized:

1. General AI (AGI) and Meta-Learning:
- AGI aims to develop algorithms that can handle a wide variety of tasks, similar to human intelligence. Current AI systems excel at specific tasks but struggle with generalization to unseen tasks.
- Meta-learning or "learning to learn" improves model adaptation and generalization, allowing AI systems to tackle new tasks efficiently using prior experiences.

2. Neural Network Design in Meta-Learning:
- Techniques like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks enable self-improvement and adaptability for deep models, supporting generalization across tasks.
- Highway networks and ResNet-style models use shortcuts for efficient backpropagation, allowing deeper models that can be used in meta-learning frameworks.

3. Coevolution:
- Coevolution involves the mutual evolution of multiple components, such as learners or task-solvers, to improve overall performance.
- Coevolution between learners enhances collaboration and competition within AI systems, while coevolution between tasks and solvers (e.g., POWERPLAY and AI-GA frameworks) pushes solvers to adapt to increasingly complex tasks.

4. Curiosity in Meta-Learning:
- Curiosity-based exploration encourages AI systems to discover new, diverse features of the environment, avoiding local optima.
- Curiosity-based objectives can be combined with performance-based objectives to ensure efficient exploration and adaptation in complex tasks.

5. Forgetting Mechanisms:
- Forgetting is crucial to avoid memory overload in AI systems

https://arxiv.org/abs/2101.04283
reacted to singhsidhukuldeep's post with 🚀 22 days ago
Exciting breakthrough in AI: AirRAG - A Novel Approach to Retrieval Augmented Generation!

Researchers from Alibaba Cloud have developed a groundbreaking framework that significantly improves how AI systems reason and retrieve information. AirRAG introduces five fundamental reasoning actions that work together to create more accurate and comprehensive responses.

>> Key Technical Innovations:
- Implements Monte Carlo Tree Search (MCTS) for exploring diverse reasoning paths
- Utilizes five core actions: System Analysis, Direct Answer, Retrieval-Answer, Query Transformation, and Summary-Answer
- Features self-consistency verification and process-supervised reward modeling
- Achieves superior performance across complex QA datasets like HotpotQA, MuSiQue, and 2WikiMultiHopQA

>> Under the Hood:
The system expands solution spaces through tree-based search, allowing for multiple reasoning paths to be explored simultaneously. The framework implements computationally optimal strategies, applying more resources to key actions while maintaining efficiency.
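To make the "expanded solution space" concrete, here is a toy enumeration of reasoning paths over the five named actions (my own illustration, not the AirRAG code; the actual framework uses MCTS to score and selectively expand promising nodes rather than enumerating exhaustively):

```python
# Hypothetical sketch: the five AirRAG reasoning actions viewed as a
# branching factor for tree-based search over reasoning paths.
ACTIONS = ["system_analysis", "direct_answer", "retrieval_answer",
           "query_transformation", "summary_answer"]

def expand(path, depth):
    """Enumerate all candidate reasoning paths up to a fixed depth."""
    if depth == 0:
        return [path]
    paths = []
    for action in ACTIONS:
        paths.extend(expand(path + [action], depth - 1))
    return paths

paths = expand([], 2)
print(len(paths))  # 25 two-step paths (5**2)
```

Even at depth 2 the space has 25 paths, which is why a search policy (MCTS with reward modeling, in the paper) is needed instead of brute force.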

>> Results Speak Volumes:
- Outperforms existing RAG methods by over 10% on average
- Shows remarkable scalability with increasing inference computation
- Demonstrates exceptional flexibility in integrating with other advanced technologies

This research represents a significant step forward in making AI systems more capable of complex reasoning tasks. The team's innovative approach combines human-like reasoning with advanced computational techniques, setting new benchmarks in the field.
reacted to AdinaY's post with 🔥 about 1 month ago
reacted to StephenGenusa's post with 👀 about 1 month ago
I have a Pro account and I am logged in. I duplicated a Space due to the error "You have exceeded your GPU quota". I am showing 0 GPU use, yet I am still unable to use it: "You have exceeded your GPU quota (60s requested vs. 44s left). Create a free account to get more daily usage quota." And "Expert Support" is a pitch for consulting.
reacted to openfree's post with 🔥 about 2 months ago
# 🧬 Protein Genesis AI: Design Proteins with Just a Prompt

## 🤔 Current Challenges in Protein Design

Traditional protein design faces critical barriers:
- 💰 High costs ($1M - $10M+) & long development cycles (2-3 years)
- 🔬 Complex equipment and expert knowledge required
- 📉 Low success rates (<10%)
- ⏰ Time-consuming experimental validation

## ✨ Our Solution: Protein Genesis AI

Transform protein design through simple natural language input:
"Design a protein that targets cancer cells"
"Create an enzyme that breaks down plastic"


### Key Features
- 🤖 AI-powered automated design
- 📊 Real-time analysis & optimization
- 🔬 Instant 3D visualization
- 💾 Immediate PDB file generation

## 🎯 Applications

### Medical & Industrial
- 🏥 Drug development
- 💉 Antibody design
- 🏭 Industrial enzymes
- ♻️ Environmental solutions

### Research & Education
- 🔬 Basic research
- 📚 Educational tools
- 🧫 Experimental design
- 📈 Data analysis

## 💫 Key Advantages

- 👨‍💻 No coding or technical expertise needed
- ⚡ Results in minutes (vs. years)
- 💰 90% cost reduction
- 🌍 Accessible anywhere

## 🎓 Who Needs This?
- 🏢 Biotech companies
- 🏥 Pharmaceutical research
- 🎓 Academic institutions
- 🧪 Research laboratories

## 🌟 Why It Matters
Protein Genesis AI democratizes protein design by transforming complex processes into simple text prompts. This breakthrough accelerates scientific discovery, potentially leading to faster drug development and innovative biotechnology solutions. The future of protein design starts with a simple prompt! 🚀

openfree/ProteinGenesis
reacted to singhsidhukuldeep's post with 🚀 about 2 months ago
Groundbreaking Research Alert: Rethinking RAG with Cache-Augmented Generation (CAG)

Researchers from National Chengchi University and Academia Sinica have introduced a paradigm-shifting approach that challenges the conventional wisdom of Retrieval-Augmented Generation (RAG).

Instead of the traditional retrieve-then-generate pipeline, their innovative Cache-Augmented Generation (CAG) framework preloads documents and precomputes key-value caches, eliminating the need for real-time retrieval during inference.

Technical Deep Dive:
- CAG preloads external knowledge and precomputes KV caches, storing them for future use
- The system processes documents only once, regardless of subsequent query volume
- During inference, it loads the precomputed cache alongside user queries, enabling rapid response generation
- The cache reset mechanism allows efficient handling of multiple inference sessions through strategic token truncation
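Conceptually, the process-once/query-many pattern can be sketched in a few lines (a pure-Python analogy of my own, not the actual KV-cache implementation, which operates on transformer attention states):

```python
# Toy analogy of Cache-Augmented Generation: encode documents exactly
# once up front, then reuse the cached encoding for every query, so no
# retrieval or re-encoding happens at inference time.

class CAGSession:
    def __init__(self, documents):
        # Expensive step, done once (stands in for KV-cache precompute).
        self.cache = [doc.lower().split() for doc in documents]
        self.encode_calls = 1

    def answer(self, query):
        # Inference only reads the cache; nothing is re-encoded.
        query_words = query.lower().split()
        return [" ".join(doc) for doc in self.cache
                if any(word in doc for word in query_words)]

session = CAGSession(["Paris is the capital of France",
                      "The Nile flows through Egypt"])
print(session.answer("capital France"))  # matches the first document
print(session.encode_calls)              # still 1 after any number of queries
```

The paper's cache-reset-by-truncation trick would correspond here to trimming appended query state off the cache between sessions, rather than rebuilding it.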

Performance Highlights:
- Achieved superior BERTScore metrics compared to both sparse and dense retrieval RAG systems
- Demonstrated up to 40x faster generation times compared to traditional approaches
- Particularly effective with both SQuAD and HotPotQA datasets, showing robust performance across different knowledge tasks

Why This Matters:
The approach significantly reduces system complexity, eliminates retrieval latency, and mitigates common RAG pipeline errors. As LLMs continue evolving with expanded context windows, this methodology becomes increasingly relevant for knowledge-intensive applications.
reacted to MonsterMMORPG's post with ❤️ about 2 months ago
SANA: Ultra HD Fast Text to Image Model from NVIDIA - Step by Step Tutorial on Windows, Cloud & Kaggle - Generate 2048x2048 Images

Below is the YouTube link for the step-by-step tutorial, plus a 1-click installer with a very advanced Gradio app to use the newest text-to-image SANA model on your Windows PC locally, and also on cloud services such as Massed Compute, RunPod and free Kaggle.

https://youtu.be/KW-MHmoNcqo

The tutorial above covers the newest SANA 2K model, and I predict a SANA 4K model will be published as well. The SANA 2K model generates about 4 megapixels, so it handles the following aspect ratios and resolutions very well:

"1:1": (2048, 2048), "4:3": (2304, 1792), "3:4": (1792, 2304),
"3:2": (2432, 1664), "2:3": (1664, 2432), "16:9": (2688, 1536),
"9:16": (1536, 2688), "21:9": (3072, 1280), "9:21": (1280, 3072),
"4:5": (1792, 2240), "5:4": (2240, 1792)
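A quick sanity check (my own snippet, not from the tutorial) that every resolution in the table above really lands near 4 megapixels:

```python
# Verify the SANA 2K aspect-ratio table: each resolution is ~4 MP.
resolutions = {
    "1:1": (2048, 2048), "4:3": (2304, 1792), "3:4": (1792, 2304),
    "3:2": (2432, 1664), "2:3": (1664, 2432), "16:9": (2688, 1536),
    "9:16": (1536, 2688), "21:9": (3072, 1280), "9:21": (1280, 3072),
    "4:5": (1792, 2240), "5:4": (2240, 1792),
}
for ratio, (w, h) in resolutions.items():
    megapixels = w * h / 1e6
    assert 3.9 <= megapixels <= 4.2, (ratio, megapixels)
print("all resolutions are ~4 MP")
```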

I have developed an amazing Gradio app with many new features:

VAE auto-offloading to reduce VRAM usage significantly, which does not exist in the official pipeline

Gradio app built upon the official pipeline with improvements, so it works perfectly

Batch size works perfectly

Number of images works perfectly

Multi-line prompting works perfectly

Aspect ratios for both the 1K and 2K models work perfectly

Randomized seed works perfectly

1-click installers for Windows (using Python 3.10 and an isolated venv), RunPod, Massed Compute and even a free Kaggle account notebook

With the proper latest libraries, it runs at full speed on Windows too

Automatically saves every generated image into the correct folder
🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-116474081

🔗 SECourses Official Discord 9500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

reacted to m-ric's post with 🚀 2 months ago
💥 Google releases Gemini 2.0, starting with a Flash model that steamrolls GPT-4o and Claude-3.6 Sonnet! And they start a huge effort on agentic capabilities.

🚀 The performance improvements are crazy for such a fast model:
‣ Gemini 2.0 Flash outperforms the previous 1.5 Pro model at twice the speed
‣ Now supports both input AND output of images, video, audio and text
‣ Can natively use tools like Google Search and execute code

➡️ If the price is on par with the previous Flash iteration ($0.30 / M tokens, to compare with GPT-4o's $1.25) the competition will have a big problem with this 4x cheaper model that gets better benchmarks 🤯

🤖 What about the agentic capabilities?

‣ Project Astra: A universal AI assistant that can use Google Search, Lens and Maps
‣ Project Mariner: A Chrome extension that can complete complex web tasks (83.5% success rate on the WebVoyager benchmark, this is really impressive!)
‣ Jules: An AI coding agent that integrates with GitHub workflows

I'll be eagerly awaiting further news from Google!

Read their blogpost here 👉 https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/