https://arxiv.org/abs/2509.02563
AI & ML interests
AI security & privacy, algorithmic bias, foundations of ML
Our 22 open-source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths from 256 to 3072 and 18 depths from 3 to 80.
How to extract style from images? Model, dataset, and the paper
Hugging Face collection for all things CLRS-Text
This collection contains artifacts from our paper titled: "Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs."
- Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  Paper • 2406.10209 • Published • 8
- tomg-group-umd/3-goldfish-loss-llama-1B
  Text Generation • 1B • Updated • 6
- tomg-group-umd/4-goldfish-loss-llama-1B
  Text Generation • 1B • Updated • 5
- tomg-group-umd/8-goldfish-loss-llama-1B
  Text Generation • 1B • Updated • 5
This collection contains the models described in the refusal token paper, published at COLM 2025.
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast
  8B • Updated • 22
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast-multiple-tokens
  8B • Updated • 2.47k
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast-single-token
  8B • Updated • 26 • 1
- tomg-group-umd/zephyr-llama3-8b-sft-no-refusal-messages
  8B • Updated • 13
LoRI adapters for natural language understanding, code generation, mathematical reasoning, and safety alignment, based on LLaMA-3-8B and Mistral-7B.
- tomg-group-umd/LoRI-S_safety_mistral7b_rank_64
  Text Generation • Updated • 5 • 1
- tomg-group-umd/LoRI-S_safety_mistral7b_rank_32
  Text Generation • Updated • 6
- tomg-group-umd/LoRI-S_safety_llama3_rank_64
  Text Generation • Updated • 11
- tomg-group-umd/LoRI-S_safety_llama3_rank_32
  Text Generation • Updated • 4
These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space.
- tomg-group-umd/huginn-0125
  Text Generation • 4B • Updated • 3.44k • 283
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
  Paper • 2502.05171 • Published • 150
- tomg-group-umd/huginn_swa_100_10_avg_0.9_merge
  Text Generation • 4B • Updated • 3
- tomg-group-umd/step-00010752-recurrence_full_512_0
  Text Generation • 4B • Updated • 3