GainEnergy/OGAI-24B API

This repository contains the code to deploy the GainEnergy/OGAI-24B model as an API service on Hugging Face Spaces.

Features

REST API built with FastAPI
Streaming responses via Server-Sent Events (SSE)
Configurable generation parameters
Docker support for easy deployment
Chat-formatted responses following the message role format (system, user, assistant)

Deployment on Hugging Face Spaces

1. Create a new Space on Hugging Face:
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Choose a name for your Space
   - Select "Docker" as Space SDK
   - Select "GPU" as Hardware (A10G or better recommended)
2. Upload the following files to your Space:
   - app.py
   - requirements.txt
   - Dockerfile
   - .gitattributes (optional, for handling large files)
3. The Space will automatically build and deploy the Docker image.

Environment Variables

You can configure the following environment variables in your Hugging Face Space:

MODEL_ID: The model ID to load (default: "GainEnergy/OGAI-24B")
DEFAULT_MAX_LENGTH: Default maximum generation length, in tokens (default: 2048)
DEFAULT_TEMPERATURE: Default temperature for generation (default: 0.7)
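
As a rough illustration of how these variables might be consumed, the sketch below reads the three documented names and falls back to the documented defaults. The function name load_settings and the dictionary keys are hypothetical, and the actual app.py may read its configuration differently.

```python
import os


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults.

    `load_settings` and the returned keys are illustrative names only,
    not taken from app.py.
    """
    env = os.environ if env is None else env
    return {
        "model_id": env.get("MODEL_ID", "GainEnergy/OGAI-24B"),
        # Environment values are strings, so convert them explicitly.
        "default_max_length": int(env.get("DEFAULT_MAX_LENGTH", "2048")),
        "default_temperature": float(env.get("DEFAULT_TEMPERATURE", "0.7")),
    }
```

Passing a plain dictionary instead of os.environ makes the parsing easy to check without touching the real environment.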

API Usage

Generate Text

Endpoint: POST /generate

Request Body:

{
  "messages": [
    {
      "role": "system",
      "content": "You are an assistant specialized in oil and gas engineering."
    },
    {
      "role": "user",
      "content": "Explain the principles of reservoir simulation."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "top_p": 0.95,
  "top_k": 50,
  "stream": false
}

Response:

{
  "generated_text": "Reservoir simulation is a computational method used to predict the flow of fluids (oil, gas, and water) through porous media over time..."
}
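
A minimal client sketch for this endpoint, using only the Python standard library. The helper names build_generate_request and generate, the timeout value, and the base URL in the usage comment are assumptions for illustration; only the /generate path, the request fields, and the generated_text response key come from the examples above.

```python
import json
import urllib.request


def build_generate_request(base_url, messages, **params):
    """Build a POST request for /generate (hypothetical helper)."""
    # Extra keyword arguments (temperature, max_tokens, top_p, top_k)
    # are passed through unchanged as request body fields.
    payload = {"messages": messages, "stream": False, **params}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, messages, timeout=300, **params):
    """Send the request and return the "generated_text" field."""
    req = build_generate_request(base_url, messages, **params)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["generated_text"]


# Example call, assuming a server is reachable at this address:
# text = generate(
#     "http://localhost:7860",
#     [{"role": "user", "content": "Explain the principles of reservoir simulation."}],
#     temperature=0.7,
#     max_tokens=2048,
# )
```

The long default timeout is a guess: generation with a model of this size can take a while, so a short timeout may cut off valid responses.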

Stream Generated Text

Endpoint: POST /generate_stream

This endpoint supports Server-Sent Events (SSE) for streaming responses. Set "stream": true in your request body.
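
The sketch below shows one way a client might consume the stream with the standard library. It assumes the server emits standard "data:" lines separated by blank lines; the content of each event (plain text or JSON) is not specified here, so the payload is yielded as a raw string. The names iter_sse_data and stream_generate are hypothetical.

```python
import json
import urllib.request


def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of byte lines."""
    buffer = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data lines join with "\n".
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored.
    if buffer:
        # Lenient choice: flush a final event that lacks a trailing blank line.
        yield "\n".join(buffer)


def stream_generate(base_url, messages, timeout=300, **params):
    """POST to /generate_stream and yield event payloads as they arrive."""
    payload = {"messages": messages, "stream": True, **params}
    req = urllib.request.Request(
        url=base_url.rstrip("/") + "/generate_stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Iterating the response yields one byte line at a time.
        yield from iter_sse_data(resp)
```

Keeping the parser separate from the network call means it can be exercised against a list of byte strings before pointing it at a live Space.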

Local Development

Clone the repository:

git clone https://huggingface.co/spaces/your-username/your-space-name
cd your-space-name

Install dependencies:

pip install -r requirements.txt

Run the application:

python app.py

The API will be available at http://localhost:7860

License

This project is licensed under the Apache License 2.0; see the original model card for details.