🔄 In a Training Loop

Arunkumar Venkataramanan

ArunkumarVR · https://arunkumarramanan.github.io

AI & ML interests

I build reasoning-first world models and agentic AI systems for applied AI teams that need reliability in production. This work is grounded in deep research on reasoning, safety and alignment, foundation models, and advanced training and inference techniques, including transformers and diffusion. I build for the real world, even when the technology is still catching up.

Recent Activity

upvoted an article about 2 months ago

The Open Source Community is backing OpenEnv for Agentic RL

liked a model about 2 months ago

google/diffusiongemma-26B-A4B-it

liked a model about 2 months ago

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

upvoted an article about 2 months ago

Article

The Open Source Community is backing OpenEnv for Agentic RL

burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego, banghua, unseenmars • Jun 8 • 109

upvoted a collection 3 months ago

DeepSeek-V4

6 items • Updated Jun 27 • 759

upvoted an article 4 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 918

upvoted a collection 5 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.74k

upvoted 6 collections 6 months ago

Nemotron-Cascade

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 14 items • Updated 13 days ago • 55

DeepBrainz-R1 — Community Quantizations (GGUF & Low-Bit)

Community-maintained quantized variants (GGUF, low-bit) of DeepBrainz-R1 models; not officially trained, validated, or supported by DeepBrainz AI. • 9 items • Updated Feb 4 • 1

DeepBrainz-R1 — Reasoning-First SLMs for Agentic Systems

Reasoning-first Small Language Models for agentic AI systems in production. • 13 items • Updated Feb 4 • 1

NeMo Gym

Collection of RL verifiable data for NeMo Gym • 32 items • Updated 13 days ago • 62

FLUX.2

Our second generation of FLUX • 21 items • Updated Apr 6 • 262

FLUX.1

A collection of our FLUX.1 models and LoRAs. • 13 items • Updated Jan 2 • 339

upvoted a collection 7 months ago

Qwen3-Coder

5 items • Updated Dec 31, 2025 • 180

upvoted a paper 7 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 261

upvoted a collection 7 months ago

Moonlight-A3B

Moonshot's Compute-efficient MoE LLM, first Scaling Up of Muon Optimizer • 3 items • Updated Jan 27 • 14

upvoted a paper 7 months ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 130

upvoted a collection 7 months ago

FunctionGemma

3 items • Updated 8 days ago • 43

upvoted an article 7 months ago

Article

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

nvidia • Dec 17, 2025 • 50

upvoted 2 collections 7 months ago

Nemotron v3 Pre-Training

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 13 days ago • 17

Common Pile v0.1

All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 41

upvoted an article 7 months ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

nvidia • Dec 15, 2025 • 114

upvoted a paper 8 months ago

ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93