BharatAI RS_1: Transformer-Based Language Model

Overview

BharatAI RS_1 is a transformer-based language model designed for text generation. This repository contains the necessary components to train, fine-tune, and perform inference with BharatAI.

Installation

Before running the model, install the required dependencies:

pip install torch transformers datasets sentencepiece evaluate accelerate zstandard

File Structure

  • tokenizer.py - Defines the SentencePiece tokenizer.
  • model.py - Contains the BharatAI transformer architecture.
  • train.py - Script for training the model.
  • inference.py - Script for generating text using the trained model.
  • model.bin - Pre-generated model file.
  • tokenizer.model - Pre-generated tokenizer file.

Tokenizer

The tokenizer is based on SentencePiece and has been pre-generated. If you wish to train a new tokenizer, use:

import sentencepiece as spm
spm.SentencePieceTrainer.train(input='data.txt', model_prefix='tokenizer', vocab_size=1000)

Model Architecture

The BharatAI RS_1 model consists of multiple transformer blocks with self-attention mechanisms. It includes:

  • Multi-head self-attention
  • Feedforward layers
  • Layer normalization
  • Positional embeddings
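The components listed above could fit together roughly as follows. This is a minimal sketch, not the contents of model.py: the class names Block and TinyLM, the pre-norm layout, the GELU activation, the 4x feedforward width, and the learned positional embeddings are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """One pre-norm transformer block: causal self-attention, then a feedforward layer."""

    def __init__(self, n_embd, n_head, dropout=0.2):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),  # assumed 4x expansion
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        T = x.size(1)
        # True marks positions a token may NOT attend to (everything to its right).
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                    # residual connection around attention
        return x + self.ffwd(self.ln2(x))   # residual connection around feedforward


class TinyLM(nn.Module):
    """Token + learned positional embeddings, a stack of blocks, and an output head."""

    def __init__(self, vocab_size, block_size, n_embd, n_head, n_layer, dropout=0.2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Sequential(*[Block(n_embd, n_head, dropout) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        # idx: (batch, time) integer token ids, with time <= block_size
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        return self.head(self.ln_f(self.blocks(x)))  # (batch, time, vocab_size)
```

Calling the model on a batch of token ids returns one logit vector per position, which is what both a training loss and a sampling loop need.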

Model Hyperparameters

The model uses the following default hyperparameters:

batch_size = 64
block_size = 256
max_iters = 250
learning_rate = 3e-4
eval_iters = 150
n_embd = 768
n_head = 12
n_layer = 12
dropout = 0.2

These can be adjusted in train.py or model.py as needed.
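As a rough sanity check on these defaults, the sketch below derives a few quantities from them. The Config class and helper names are hypothetical, and the parameter estimate assumes a GPT-style block (4x feedforward width, biases and layer-norm weights ignored) and vocab_size=1000 from the tokenizer example; none of this is confirmed by model.py.

```python
from dataclasses import dataclass


@dataclass
class Config:
    batch_size: int = 64
    block_size: int = 256
    max_iters: int = 250
    n_embd: int = 768
    n_head: int = 12
    n_layer: int = 12
    vocab_size: int = 1000  # assumption: value used in the tokenizer training example


def head_size(cfg):
    """Width of each attention head; n_embd must divide evenly by n_head."""
    if cfg.n_embd % cfg.n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return cfg.n_embd // cfg.n_head


def tokens_per_iter(cfg):
    """Number of tokens the model sees in one training iteration."""
    return cfg.batch_size * cfg.block_size


def approx_params(cfg):
    """Rough weight count: 4*d^2 for attention plus 8*d^2 for feedforward per block."""
    per_block = 12 * cfg.n_embd ** 2
    embeddings = (cfg.vocab_size + cfg.block_size) * cfg.n_embd
    return cfg.n_layer * per_block + embeddings
```

With the defaults this gives a head size of 64 and 16,384 tokens per iteration, or about 4.1 million tokens over 250 iterations, against an estimated 86 million parameters under the stated assumptions.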

Training the Model

Important: The model is untrained by default

Users must train the model before using it for text generation. To train the model, run:

python train.py

This script loads the dataset, tokenizes text, and trains the transformer model from scratch.
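The core of such a training script is usually a batching function and an optimization step. The sketch below shows one common way to write them; it is not taken from train.py, and the function names get_batch and train_step, as well as the assumption that the model returns logits of shape (batch, time, vocab), are illustrative.

```python
import torch
import torch.nn.functional as F


def get_batch(data, batch_size, block_size):
    """Sample random windows from a 1-D tensor of token ids.

    Targets are the inputs shifted by one position, so the model learns
    to predict the next token at every step.
    """
    starts = torch.randint(0, len(data) - block_size, (batch_size,)).tolist()
    x = torch.stack([data[s : s + block_size] for s in starts])
    y = torch.stack([data[s + 1 : s + block_size + 1] for s in starts])
    return x, y


def train_step(model, optimizer, x, y):
    """Run one forward/backward pass and return the loss as a float."""
    logits = model(x)  # assumed shape: (batch, time, vocab)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A full loop would call get_batch and train_step max_iters times with an optimizer such as torch.optim.AdamW(model.parameters(), lr=3e-4), evaluating on held-out data at intervals.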

Pre-Generated Model & Tokenizer

  • A pre-generated model (model.bin) and tokenizer (tokenizer.model) are included in the repository.
  • If you wish to use them, simply load them without retraining:
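import torch
# Assumes model.bin stores the whole pickled model (not just a state_dict), so the classes in model.py must be importable.
# PyTorch 2.6+ defaults to weights_only=True, which rejects such files, hence the explicit flag.
model = torch.load("model.bin", map_location="cpu", weights_only=False)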

Generating Text

After training the model, or after loading the pre-generated one, generate text with:

python inference.py --input "Your prompt here"
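Text generation with a model of this kind is typically an autoregressive sampling loop. The sketch below illustrates the idea; it is an assumption about the general technique rather than the actual code in inference.py, and the generate function and its temperature parameter are hypothetical.

```python
import torch


@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size, temperature=1.0):
    """Append max_new_tokens sampled token ids to idx, a (batch, time) tensor."""
    for _ in range(max_new_tokens):
        context = idx[:, -block_size:]                 # crop to the model's context window
        logits = model(context)[:, -1, :] / temperature  # logits for the last position only
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample one id per sequence
        idx = torch.cat([idx, next_id], dim=1)
    return idx
```

The prompt would first be encoded to token ids with the tokenizer, and the returned ids decoded back to text; lower temperature values make the sampling more conservative.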

Notes

  • The model is untrained by default, so train it before running inference.
  • Adjust the hyperparameters in train.py (for example learning_rate, max_iters, or dropout) to tune training for your dataset and hardware.