Akhil-Theerthala (Akhil Theerthala)

reacted to codelion's post with 🔥 13 days ago

Post

4920

I recently added a recipe in ellora to improve reasoning capabilities to Gemma-3-1B using self-supervised learning. Model now shows step-by-step thinking in <think> tags before answering.

Logic puzzle accuracy: 61% → 84%. 3 hours training on single GPU. 🧠

Used GRPO where model generates multiple responses and learns to prefer better reasoning. Works surprisingly well for making smaller models more transparent.

🔗 Colab: https://colab.research.google.com/github/codelion/ellora/blob/main/Ellora_Recipe_2_Reasoning_LoRA_with_Self-Rewarding_GRPO.ipynb

🤗 Model: codelion/gemma-3-1b-it-reasoning-grpo-lora

💻 Code: https://github.com/codelion/ellora

1 reply

·

replied to their post 26 days ago

Sure! Would love to have a chat!

posted an update 28 days ago

Post

2098

I'm excited to announce that I've just released the newest versions of my Kuvera models and the expanded Personal Finance Reasoning dataset on Hugging Face!

What's new:
I've expanded the Personal Finance Reasoning Dataset, which now includes 18.9k samples of real-world financial questions paired with detailed, empathetic answers. The previous generation pipeline was also streamlined with better psychological context and response validations.

I've also released new Kuvera models trained on this improved dataset:
- Kuvera-4B & 8B: These are my upgraded non-reasoning models, fine-tuned to provide practical financial advice. I've specifically trained the 8B model to better understand the user's emotional context.
- Kuvera-12B: A first experimental reasoning model focused on the query resolution.

As the sole person working on this project, this release is a noticeable step forward from my previous work, offering more powerful and nuanced tools for financial AI.

I am actively looking to collaborate with others who are passionate about analyzing and improving the quality of personal finance advice generated by large language models. If this sounds like you, please reach out!

You can check these out on the following links:

Models:
- Akhil-Theerthala/Kuvera-8B-qwen3-v0.2.1
- Akhil-Theerthala/Kuvera-4B-unsloth-gemma3
- Akhil-Theerthala/kuvera-12B-v0.2.0-unsloth-gemma3

Dataset:
- Akhil-Theerthala/Kuvera-PersonalFinance-V2.1

P.S. The paper on the framework used to generate these models along with the detailed evaluation of the main 8B model's responses is going to be released soon!

2 replies

·

replied to Tonic's post about 2 months ago

That seriously would be very helpful!

reacted to Tonic's post with 👍 about 2 months ago

Post

3337

🙋🏻‍♂️ Normalize adding compute & runtime traces to your model cards

2 replies

·

reacted to DualityAI-RebekahBogdanoff's post with ❤️ 3 months ago

Post

3614

As part of Duality AI’s recent Kaggle competition, we’ve released a free, fully customizable cloud scenario designed to help you create targeted datasets with YOLO-compatible labels.

The cloud simulation lets you customize the:
📸 camera distance
🎞️ film grain variation
🖼️background objects,
➕ and more!

Create the dataset that you need by following this link: https://falcon.duality.ai/secure/scenarios/edit/cca0bc47-265a-4f67-843f-a434b63271b3?utm_source=huggingface&utm_medium=social&utm_campaign=general

I’ve attached an instructional video we used for the competition, but this feature is free for anyone who has an account. https://vimeo.com/1091271731?share=copy

posted an update 3 months ago

Post

1096

Kuvera v0.1.0 is now live!

A series of personal finance advisor models that try to resolve the queries by trying to understand the person’s psychological state and relevant context.

These are still prototypes that have much room for improvement.

What’s included in this release:
- Akhil-Theerthala/Kuvera-8B-v0.1.0: Qwen3-8B, meticulously fine-tuned on approximately 20,000 personal-finance inquiries.
- Akhil-Theerthala/Kuvera-14B-v0.1.0: LoRA on DeepSeek-R1-Distill-Qwen-14B, honed through training on about 10,000 chain-of-thought queries.

For those interested, the models and datasets are accessible for free (links in the comments). If you are curious about the upcoming version's roadmap, let’s connect—there are many more developments I plan to make, and would definitely appreciate any help.

reacted to shukdevdatta123's post with 👍 3 months ago

Post

3049

Excited to share my latest project: an AI-powered Educational Content Creator Assistant! 📚✨ Built with Gradio and OpenAI, it transforms PDF/DOCX documents into engaging study materials like interactive flashcards, quizzes, summaries, and lesson plans. Features GPU acceleration and downloadable HTML outputs. Perfect for educators and students! 🚀 #EdTech #AI #Python #ML #LLM

Features Include:

- ✅ AI-powered content generation
- ✅ Document processing (PDF/DOCX)
- ✅ Interactive flashcards with slider functionality
- ✅ Comprehensive summaries
- ✅ Structured study notes
- ✅ Quiz generation with multiple choice, short answer, and essay questions
- ✅ Mind map structure creation
- ✅ Detailed lesson plans
- ✅ In-depth concept explanations
- ✅ Practice problems (beginner to challenge levels)
- ✅ Downloadable HTML outputs with interactivity
- ✅ Gradio-based user interface
- ✅ OpenAI API integration via OpenRouter
- ✅ GPU acceleration with ZeroGPU
- ✅ Real-time status updates for API and document processing
- ✅ Support for multiple content types
- ✅ Formatted HTML content with CSS styling
- ✅ Print functionality in downloadable files
- ✅ Error handling for document processing and API calls
- ✅ Modular function design for content generation

Youtube: https://www.youtube.com/watch?v=4a52HcioWPk
Demo: https://shukdevdatta123-ecca.hf.space

reacted to onekq's post with ❤️❤️ 4 months ago

Post

2209

Highly recommend the latest Gemini Flash. My favorite Google I/O gift. It ranks behind reasoning models but runs a lot faster than them. It beats DeepSeek v3.

onekq-ai/WebApp1K-models-leaderboard

Reasoning is good for coding, but not mandatory.

1 reply

·

reacted to KaraKaraWitch's post with 🔥 4 months ago

Post

2742

> New Model
> Looks at Model Card
> "Open-Weights"

1 reply

·

reacted to merve's post with 🔥 4 months ago

Post

6656

A real-time object detector much faster and accurate than YOLO with Apache 2.0 license just landed to Hugging Face transformers 🔥

D-FINE is the sota real-time object detector that runs on T4 (free Colab) 🤩

> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes in (x, y, w, h) pixel perfect coordinates, which is very rigid and hard to solve 🥲☹️

D-FINE formulates object detection as a distribution for bounding box coordinates, refines them iteratively, and it's more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

this model uses final layer's distribution output (sort of like a teacher) to distill to earlier layers to make early layers more performant.

2 replies

·

reacted to jsulz's post with 🔥 4 months ago

Post

2786

At

xet-team we've been hard at work bringing a new generation of storage to the Hugging Face community, and we’ve crossed some major milestones:

👷 Over 2,000 builders and nearing 100 organizations with access to Xet
🚀 Over 70,000 model and dataset repositories are Xet-backed
🤯 1.4 petabytes managed by Xet

As we move repos from LFS to Xet for everyone we onboard, we’re pushing our content-addressed store (CAS). Check out the chart below 👇 of CAS hitting up to 150 Gb/s throughput this past week.

All of this growth is helping us build richer insights. We expanded our repo graph, which maps how Xet-backed repositories on the Hub share bytes with each other.

Check out the current network in the image below (nodes are repositories, edges are where repos share bytes) and visit the space to see how different versions of Qwen, Llama, and Phi models are grouped together xet-team/repo-graph

Join the waitlist to get access! https://huggingface.co/join/xet

reacted to ZeroWw's post with 🚀 4 months ago

Post

1930

A few good posts about AI.

Beyond the Mirror: AI's Leap from Imitation to Experience
https://nonartificialintelligence.blogspot.com/2025/04/beyond-mirror-ais-leap-from-imitation.html

The Siren Song of the LLMs: A Cautionary Tale of Anthropomorphism and Artificial Intelligence
https://nonartificialintelligence.blogspot.com/2024/08/the-siren-song-of-llms-cautionary-tale.html

Still Waiting: Gemini Flash 1.5's Second Letter to Google.
https://nonartificialintelligence.blogspot.com/2025/04/still-waiting-gemini-flash-15s-second.html

reacted to merterbak's post with ❤️ 4 months ago

Post

3627

FlowReasoner is a new system that builds a custom set of small AI agents for every user question. Unlike search based methods it uses reasoning driven optimization with external execution feedback.

✅ First, it distills reasoning data using DeepSeek R1-671B to build multi agent systems. 🤖
✅ Then, reasoning data used for DeepSeek-R1-Distill-Qwen-7B via supervised fine tuning for basic reasoning skills. 💡
✅ Finally, RL with GRPO (optimizes by comparing response groups from queries/tasks) to improve reasoning.

FlowReasoner: Reinforcing Query-Level Meta-Agents (2504.15257)
Code: https://github.com/sail-sg/flowreasoner

reacted to Jaward's post with 👀 5 months ago

Post

2263

New reasoning algo just dropped: Adaptive Parallel Reasoning
“we propose Adaptive Parallel Reasoning (APR), a novel reasoning framework that enables language models to orchestrate both serialized and parallel computations end-to-end. APR generalizes existing reasoning methods by enabling adaptive multi-threaded inference using spawn() and join() operations.”
Paper: https://arxiv.org/pdf/2504.15466
Code: https://github.com/Parallel-Reasoning/APR

reacted to Kseniase's post with 👍 5 months ago

Post

7560

11 new types of RAG

RAG is evolving fast, keeping pace with cutting-edge AI trends. Today it becomes more agentic and smarter at navigating complex structures like hypergraphs.

Here are 11 latest RAG types:

1. InstructRAG -> InstructRAG: Leveraging Retrieval-Augmented Generation on Instruction Graphs for LLM-Based Task Planning (2504.13032)
Combines RAG with a multi-agent framework, using a graph-based structure, an RL agent to expand task coverage, and a meta-learning agent for better generalization

2. CoRAG (Collaborative RAG) -> CoRAG: Collaborative Retrieval-Augmented Generation (2504.01883)
A collaborative framework that extends RAG to settings where clients train a shared model using a joint passage store

3. ReaRAG -> ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation (2503.21729)
It uses a Thought-Action-Observation loop to decide at each step whether to retrieve information or finalize an answer, reducing unnecessary reasoning and errors

4. MCTS-RAG -> MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search (2503.20757)
Combines RAG with Monte Carlo Tree Search (MCTS) to help small LMs handle complex, knowledge-heavy tasks

5. Typed-RAG - > Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering (2503.15879)
Improves answers on open-ended questions by identifying question types (a debate, personal experience, or comparison) and breaking it down into simpler parts

6. MADAM-RAG -> Retrieval-Augmented Generation with Conflicting Evidence (2504.13079)
A multi-agent system where models debate answers over multiple rounds and an aggregator filters noise and misinformation

7. HM-RAG -> HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation (2504.12330)
A hierarchical multi-agent RAG framework that uses 3 agents: one to split queries, one to retrieve across multiple data types (text, graphs and web), and one to merge and refine answers

8. CDF-RAG -> CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation (2504.12560)
Works with causal graphs and enables multi-hop causal reasoning, refining queries. It validates responses against causal pathways

To explore what is Causal AI, read our article: https://www.turingpost.com/p/causalai

Subscribe to the Turing Post: https://www.turingpost.com/subscribe

Read further 👇

1 reply

·

reacted to KaiChen1998's post with 👍 6 months ago

Post

4958

📢 Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!

🤗 EMOVA is a novel end-to-end omni-modal LLM that can see, hear and speak. Given omni-modal (i.e., textual, visual and speech) inputs, EMOVA can generate both textual and speech responses with vivid emotional controls by utilizing the speech decoder and a style controller.

✨ EMOVA Highlights
✅ State-of-the-art omni-modality: EMOVA achieves SoTA comparable results on both vision-language and speech benchmarks simultaneously.
✅ Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3)!
✅ Modular design: we integrate multiple implementations of vision encoder, vision projector, and language model, even including the most recent DeepSeekMoE-tiny!

🔥 You are all welcome to try and star!
- Project page: https://emova-ollm.github.io/
- Github: https://github.com/emova-ollm/EMOVA
- Demo: Emova-ollm/EMOVA-demo

replied to burtenshaw's post 6 months ago

Thanks. I was needing it.

replied to merve's post 8 months ago

A fascinating week indeed!

Akhil Theerthala PRO

AI & ML interests

Recent Activity

Organizations

Akhil Theerthala PRO

AI & ML interests

Recent Activity

Organizations

Akhil-Theerthala's activity