Tags: Text Generation · GGUF · English · code · finance · imatrix · conversational · Preview

🎯 Overview

Bro is a fine-tuned variant of the gemma-3-4b-it transformer model, optimized for enhanced contextual comprehension, instruction following, and domain-specific reasoning. The fine-tuning process used supervised instruction tuning across multiple NLP domains, with a focus on factual recall, multi-step reasoning, and document comprehension.

  • Built on the lightweight yet powerful Gemma 3 4B architecture, Bro balances inference speed and linguistic depth, making it suitable for both production deployment and academic research.

βš™οΈ Vectorized Datasets

Vectorization is the process of converting textual data into numerical vectors, typically applied after the text has been cleaned. It can improve execution speed and reduce training time. BudgetPy provides the following vector stores on the OpenAI platform to support environmental data analysis with machine learning (a minimal vectorization sketch follows the list):

  • Appropriations - Enacted appropriations from 1996-2024 available for fine-tuning learning models
  • Regulations - Collection of federal regulations on the use of appropriated funds
  • SF-133 - The Report on Budget Execution and Budgetary Resources
  • Balances - U.S. federal agency Account Balances (File A) submitted as part of the DATA Act 2014.
  • Outlays - The actual disbursements of funds by the U.S. federal government from 1962 to 2025
  • Circular A11 - Guidance from OMB on the preparation, submission, and execution of the federal budget
  • Fastbook - Treasury guidance on federal ledger accounts
  • Title 31 CFR - Money & Finance
  • Redbook - The Principles of Appropriations Law (Volumes I & II).
  • US Standard General Ledger - Account Definitions
  • Treasury Appropriation Fund Symbols (TAFSs) Dataset - Collection of TAFSs used by federal agencies
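
To make the vectorization step concrete, here is a minimal sketch, assuming scikit-learn and a few placeholder budget-text records; it illustrates the general text-to-vector conversion, not BudgetPy's actual pipeline or the OpenAI vector stores themselves.

```python
# Minimal TF-IDF vectorization sketch (illustrative only; not BudgetPy's pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder "cleaned" records standing in for rows from a budget dataset.
documents = [
    "enacted appropriations for fiscal year 2024",
    "report on budget execution and budgetary resources",
    "outlays by federal agency and treasury account",
]

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
matrix = vectorizer.fit_transform(documents)  # sparse matrix of shape (n_docs, n_terms)

print(matrix.shape)
print(vectorizer.get_feature_names_out()[:10])
```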

✨ Features

| Feature | Description |
|---|---|
| 🔍 Instruction-Tuned | Fine-tuned on a diverse corpus of natural language tasks for generalization |
| 📚 Multi-Domain | Trained on QA, summarization, reasoning, and code synthesis datasets |
| ⚡ Optimized for RAG | Performs well when integrated with retrieval-augmented generation pipelines |
| 🧩 Multi-Turn Dialogue | Supports coherent conversations with context memory (see the sketch after this table) |
| 🧠 Compact Intelligence | 4B-parameter scale enables fast inference on consumer GPUs |
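
As a hedged illustration of the multi-turn dialogue feature, the sketch below renders a short conversation with the tokenizer's chat template; the repo id leeroy-jankins/bro is taken from this page, the messages are placeholders, and it assumes the checkpoint ships a chat template as instruction-tuned Gemma releases typically do.

```python
# Multi-turn prompt construction sketch (illustrative; messages are placeholders).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("leeroy-jankins/bro")

messages = [
    {"role": "user", "content": "Summarize what an appropriation is."},
    {"role": "assistant", "content": "An appropriation is a law that authorizes spending."},
    {"role": "user", "content": "How does it differ from an outlay?"},
]

# Render the running conversation into the prompt format the model expects;
# generation then proceeds exactly as in the Usage section below.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```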

🧪 Intended Use

Bro is intended for use in:

  • Knowledge retrieval systems (RAG; see the sketch after this list)
  • Instruction-following assistants
  • Legal/financial document understanding
  • Open-ended question answering
  • Text generation and summarization
  • Fine-tuning foundation for further specialization
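
Below is a minimal sketch of the RAG use case, assuming scikit-learn for a naive TF-IDF retriever; the corpus and query are placeholders, and a production system would swap in a real vector store before generating with Bro as in the Usage section.

```python
# Naive RAG sketch: retrieve the most relevant passage, then prepend it to the prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder document store; a real system would use a vector database.
corpus = [
    "SF-133 is the Report on Budget Execution and Budgetary Resources.",
    "OMB Circular A-11 covers preparation and submission of the federal budget.",
    "Outlays are actual disbursements of funds by the federal government.",
]

query = "What does SF-133 report on?"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)
query_vector = vectorizer.transform([query])

# Pick the passage with the highest cosine similarity to the query.
best = cosine_similarity(query_vector, doc_vectors).argmax()
context = corpus[best]

# Augmented prompt that would be passed to Bro for generation (see Usage below).
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```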

🔬 Technical Details

Base Model

  • Model: gemma-3-4b-pt
  • Parameters: ~4.1 Billion
  • Architecture: Transformer decoder-only
  • Tokenizer: SentencePiece (32k vocab)
  • Positional Encoding: Rotary (RoPE)
  • Attention: Multi-head Self-Attention (MHA)
  • Training Framework: PyTorch / Hugging Face Transformers
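
These details can be checked against the published checkpoint by loading its configuration; the sketch below assumes the leeroy-jankins/bro repository exposes a standard transformers config, and the text_config fallback is an assumption for multimodal Gemma 3 configs.

```python
# Inspect the model configuration without downloading the full weights.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("leeroy-jankins/bro")

# Gemma 3 checkpoints may nest language-model settings under `text_config`
# (assumption; fall back to the top-level config otherwise).
text_config = getattr(config, "text_config", config)

print(type(config).__name__)
print("vocab size:", getattr(text_config, "vocab_size", "n/a"))
print("hidden size:", getattr(text_config, "hidden_size", "n/a"))
print("layers:", getattr(text_config, "num_hidden_layers", "n/a"))
```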

βš™οΈ Fine-Tuning

| Property | Value |
|---|---|
| Dataset Composition | 60% OpenAssistant-style instructions, 20% legal + financial, 10% reasoning chains, 10% dialogues |
| Optimization Strategy | Supervised fine-tuning (SFT) |
| Epochs | 3 |
| Optimizer | AdamW |
| Scheduler | Cosine decay with warmup |
| Mixed Precision | FP16 |
| Context Window | 8192 tokens |
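
The table above maps onto standard Hugging Face TrainingArguments; the following is a hedged outline of such a configuration, not the exact script used to train Bro, and the batch size, learning rate, warmup ratio, and paths are assumed placeholders.

```python
# Outline of a supervised fine-tuning (SFT) configuration matching the table above.
# This is a sketch, not the exact script used to train Bro.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./bro-sft",          # placeholder output path
    num_train_epochs=3,              # Epochs: 3
    optim="adamw_torch",             # Optimizer: AdamW
    lr_scheduler_type="cosine",      # Scheduler: cosine decay
    warmup_ratio=0.03,               # warmup fraction (assumed value)
    fp16=True,                       # Mixed precision: FP16
    per_device_train_batch_size=2,   # assumed; depends on hardware
    gradient_accumulation_steps=8,   # assumed; depends on hardware
    learning_rate=2e-5,              # assumed; not stated in the card
    logging_steps=50,
)

# A Trainer (or TRL's SFTTrainer) would then be constructed with the model,
# tokenizer, and the instruction dataset described above.
```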

🧪 Benchmark Results

| Task | Metric | Bro (Ours) | Base gemma-3-4b |
|---|---|---|---|
| ARC Challenge (25-shot) | Accuracy (%) | 71.3 | 64.5 |
| NaturalQuestions (RAG) | EM / F1 | 51.7 / 63.9 | 44.2 / 56.8 |
| GSM8K (reasoning) | Accuracy (%) | 62.5 | 52.0 |
| Summarization (CNN/DM) | ROUGE-L | 42.1 | 37.6 |
| MMLU (5-shot, avg) | Accuracy (%) | 56.2 | 48.8 |

🧠 Fine-tuned Bro outperforms the base Gemma model on all reported tasks, with the largest gains on multi-step reasoning (GSM8K) and retrieval QA.
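
For reference, the EM and F1 columns in the NaturalQuestions row follow the usual extractive-QA definitions; the snippet below is a simplified, self-contained version of those metrics, not the evaluation harness used to produce the numbers above.

```python
# Simplified exact-match (EM) and token-level F1 for extractive QA scoring.
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("sf-133", "SF-133"))
print(round(token_f1("budget execution report", "report on budget execution"), 2))
```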


🚀 Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("leeroy-jankins/bro")
tokenizer = AutoTokenizer.from_pretrained("leeroy-jankins/bro")

# Encode a prompt and generate a completion.
prompt = "Explain the difference between supervised and unsupervised learning:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
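
Quantized GGUF builds of this model are also published; the following is a hedged sketch of running one with llama-cpp-python, where the GGUF filename is a placeholder for whichever quantization you download.

```python
# Running a quantized GGUF build with llama-cpp-python (filename is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="bro-q4_k_m.gguf",  # placeholder; use the actual GGUF file you downloaded
    n_ctx=8192,                    # context window reported in the card
)

result = llm(
    "Explain the difference between supervised and unsupervised learning:",
    max_tokens=150,
)
print(result["choices"][0]["text"])
```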