Model Card for nexa-OLMo-sci7b

Model Details

Model Description:
nexa-OLMo-sci7b is a fine-tuned variant of allenai/OLMo-7B, adapted for scientific text generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning used LoRA adapters (via PEFT) on top of a base model quantized to 4 bits with bitsandbytes.

Developed by: Allan (Independent Scientific Intelligence Architect)
Shared by: Allan (https://huggingface.co/allan-wandia)
Model type: Decoder-only transformer (causal language model)
Language(s): English (scientific domain-specific vocabulary)
License: Apache 2.0
Fine-tuned from: allenai/OLMo-7B
Repository: https://huggingface.co/allan-wandia/nexa-olmo-sci7b
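
A minimal loading sketch, assuming the repository above hosts a LoRA adapter rather than merged weights, and that transformers, peft, bitsandbytes, and accelerate are installed. The prompt template, the NF4 quantization type, the bfloat16 compute dtype, and the generation settings are illustrative assumptions; the card does not specify them. Depending on the installed transformers version, the base checkpoint may also need the ai2-olmo package or trust_remote_code=True.

```python
BASE_MODEL_ID = "allenai/OLMo-7B"            # base model named in this card
ADAPTER_ID = "allan-wandia/nexa-olmo-sci7b"  # repository as listed in this card


def build_prompt(task: str, topic: str) -> str:
    """Hypothetical prompt template; the card does not define one."""
    return f"Task: {task}\nTopic: {topic}\nResponse:"


def load_model():
    """Load the base model in 4-bit and attach the adapter (assumes a PEFT adapter repo)."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import PeftModel

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # assumption: not stated in the card
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not stated in the card
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, quantization_config=quant_config, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 200) -> str:
    """Generate a continuation for the prompt with default decoding settings."""
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads several GB of weights and needs a CUDA GPU):
#   tokenizer, model = load_model()
#   prompt = build_prompt("hypothesis generation", "exoplanet atmospheres")
#   print(generate(tokenizer, model, prompt))
```

Keeping the imports inside load_model means the prompt helper can be reused or tested on a machine without a GPU.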

Training Details

Training Data:

  • Size: 100 million tokens
  • Source: Curated scientific literature (Bio, Physics, QST, Astro)

Hyperparameters:

  • Sequence length: 1024
  • Batch size: 1
  • Gradient Accumulation Steps: 64
  • Effective Batch Size: 64
  • Learning rate: 2e-05
  • Epochs: 2
  • LoRA: Enabled (PEFT)
  • Quantization: 4-bit
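
The arithmetic implied by these hyperparameters can be sketched as follows. The step counts assume that the 100 million tokens are packed into full 1024-token sequences and that training ran on a single device, neither of which the card states. TrainingSetup and its field names are illustrative; the keys returned by to_trainer_kwargs follow the argument names of transformers.TrainingArguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as listed in the card (class and field names are hypothetical)."""
    total_tokens: int = 100_000_000
    sequence_length: int = 1024
    micro_batch_size: int = 1
    grad_accum_steps: int = 64
    learning_rate: float = 2e-5
    epochs: int = 2

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer update (single device assumed).
        return self.micro_batch_size * self.grad_accum_steps

    @property
    def tokens_per_optimizer_step(self) -> int:
        return self.effective_batch_size * self.sequence_length

    @property
    def optimizer_steps(self) -> int:
        # Assumes fully packed sequences; partial sequences and batches are dropped.
        sequences_per_epoch = self.total_tokens // self.sequence_length
        steps_per_epoch = sequences_per_epoch // self.effective_batch_size
        return steps_per_epoch * self.epochs

    def to_trainer_kwargs(self) -> dict:
        """Map the values onto transformers.TrainingArguments keyword names."""
        return {
            "per_device_train_batch_size": self.micro_batch_size,
            "gradient_accumulation_steps": self.grad_accum_steps,
            "learning_rate": self.learning_rate,
            "num_train_epochs": self.epochs,
        }


setup = TrainingSetup()
print(setup.effective_batch_size)       # 64
print(setup.tokens_per_optimizer_step)  # 65536
print(setup.optimizer_steps)            # 3050
```

Under these assumptions each optimizer update sees 65,536 tokens, and the two epochs amount to roughly 3,000 updates.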

Results:
Qualitatively robust performance on scientific prose tasks; the novelty of the generated text varies with the diversity of the prompts.
