OGAI-8x7B-8bit-32k: 8-bit Quantized Oil & Gas AI Model with Extended Context
Model Description
OGAI-8x7B-8bit-32k is an 8-bit quantized version of the OGAI-8x7B model with a 32K token context window. This quantized model retains most of the capabilities of the original model while significantly reducing memory requirements, making it ideal for deployment in environments with memory constraints.
The model is based on a LoRA fine-tuned Mixtral-8x7B model, specifically engineered for oil and gas applications with a focus on drilling processes. The quantization to 8-bit precision offers a balanced approach between model size reduction and maintaining high-quality outputs for domain-specific tasks.
- Developed by: GainEnergy AI Team
- Model type: 8-bit Quantized Causal Language Model (Instruction Following)
- Language: English
- License: MIT
- Finetuned from model: GainEnergy/ogai-8x7b
- Quantization method: 8-bit (Int8)
- Context length: 32,768 tokens
Quantization Details
This model was quantized from the full-precision OGAI-8x7B using 8-bit quantization with the following configuration:
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,                     # store weights in Int8 via bitsandbytes
    llm_int8_enable_fp32_cpu_offload=True  # let modules that exceed GPU memory fall back to CPU in FP32
)
The 8-bit quantization roughly halves the model's memory footprint compared to FP16 (about a 4x reduction versus FP32), while preserving approximately 95-98% of the original model's performance on oil and gas engineering tasks.
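As a back-of-the-envelope check (the ~46.7B total parameter count for Mixtral-8x7B is an assumption; embeddings and quantization overhead shift the exact figures):

# Rough footprint estimate: bytes per parameter times parameter count
params = 46.7e9                  # approx. total parameters of Mixtral-8x7B (assumption)
fp16_gb = params * 2 / 1024**3   # FP16: 2 bytes per parameter -> ~87 GB
int8_gb = params * 1 / 1024**3   # Int8: 1 byte per parameter  -> ~43 GB
print(f"FP16: ~{fp16_gb:.0f} GB, Int8: ~{int8_gb:.0f} GB")

This is also why the llm_int8_enable_fp32_cpu_offload flag above matters in practice: layers that do not fit in VRAM can spill to system RAM.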
Key Capabilities
- Drilling Calculations & Optimization: Computes complex well trajectories, mud weight calculations, hydraulics, and casing designs (see the worked example after this list).
- Engineering Knowledge Integration: Retains knowledge from oil & gas technical literature, drilling reports, and proprietary engineering datasets.
- Intelligent Document Processing: Supports knowledge retrieval for drilling workflows, regulatory compliance, and field operation manuals.
- High-Context Reasoning: The extended 32K token context window allows the model to retain context across long drilling plans, technical discussions, and simulation outputs.
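To make the first capability concrete, much of the underlying arithmetic is standard drilling math, for example the hydrostatic-pressure relation P (psi) = 0.052 × mud weight (ppg) × true vertical depth (ft). A minimal reference implementation, useful for spot-checking model outputs (names are illustrative):

def hydrostatic_pressure_psi(mud_weight_ppg: float, tvd_ft: float) -> float:
    """Hydrostatic pressure in psi from mud weight (ppg) and TVD (ft)."""
    return 0.052 * mud_weight_ppg * tvd_ft

# 12.5 ppg mud at 10,000 ft TVD
print(hydrostatic_pressure_psi(12.5, 10_000))  # 6500.0 psi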
Usage
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
# Configure quantization
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)
# Load tokenizer and model
model_id = "GainEnergy/ogai-8x7b-8bit-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config
)
# Example prompt for drilling engineering
prompt = "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
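If the tokenizer ships a chat template (an assumption; the plain-text prompt above works regardless), the instruction format can be applied automatically:

# Build the prompt through the tokenizer's chat template, if one is defined
messages = [
    {"role": "user", "content": "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))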
Utilizing the Extended Context Window
To make the most of the 32K context window, you can input longer documents for analysis:
# Load long document (e.g., drilling report, technical specifications)
with open("long_drilling_report.txt", "r") as f:
    long_document = f.read()
# Append a question at the end
prompt = f"{long_document}\n\nBased on the above document, what are the key risk factors identified for this drilling operation?"
# Process with appropriate truncation to fit within context
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32000  # leave headroom below 32,768 for generation
).to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
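Before relying on truncation, it helps to know how much of the document actually fits; a quick check using the tokenizer already loaded above:

# Count tokens in the raw document before building the prompt
n_tokens = len(tokenizer(long_document)["input_ids"])
print(f"Document: {n_tokens} tokens (window: 32,768)")
if n_tokens > 32000:
    print("Document exceeds the budget and will be truncated; consider chunking it.")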
Hardware Requirements
Due to the quantization, this model requires less GPU memory than the full-precision version:
- Minimum: CUDA-capable GPU with 16GB VRAM (relying on the FP32 CPU offload shown in the configuration above)
- Recommended: CUDA-capable GPU with 24GB+ VRAM for comfortable usage with the 32K context window
- System RAM: 32GB+
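A quick pre-flight check against these numbers (plain torch APIs, nothing model-specific):

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA device found; 8-bit bitsandbytes loading requires a CUDA GPU.")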
Limitations
- Performance tradeoff: While 8-bit quantization preserves most capabilities, there may be slight reductions in accuracy for complex numerical computations compared to the full-precision model.
- Domain specificity: The model is focused on oil and gas drilling engineering and may not perform well for other domains.
- Expert validation: Outputs should be validated by domain experts before application in real-world engineering scenarios.
- Knowledge cutoff: The model's knowledge is limited to data available up to 2025.
Comparison with Other Variants
| Model Variant | Precision | Context Length | Memory Requirements | Performance Retention | Ideal Use Case |
|---|---|---|---|---|---|
| OGAI-8x7B | Full (16-bit) | 32K | 64GB+ VRAM | 100% (baseline) | High-precision engineering calculations |
| OGAI-8x7B-8bit-32k | 8-bit (Int8) | 32K | 16-24GB VRAM | ~95-98% | Balanced approach for deployment |
| OGAI-8x7B-4bit | 4-bit (NF4) | 32K | 8-16GB VRAM | ~90-95% | Highly constrained environments |
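For reference, the 4-bit row corresponds to an NF4 bitsandbytes configuration along these lines (a generic sketch of the quantization settings, not the packaging of the separate OGAI-8x7B-4bit checkpoint; the compute dtype is an assumption):

import torch
from transformers import BitsAndBytesConfig

quantization_config_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16   # assumption; torch.float16 also works
)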
Citation
@misc{ogai8x7b2025,
  title={OGAI-8x7B: An AI Model for Oil \& Gas Drilling Engineering},
  author={GainEnergy AI Team},
  year={2025},
  howpublished={Hugging Face Models}
}
Acknowledgments
This model builds upon the OGAI-8x7B base model and extends its capabilities through quantization and context length expansion. Special thanks to Mistral AI for the Mixtral architecture that powers this model.