IntellIte Chat

IntellIte Chat is a lightweight conversational AI model (~45M parameters) designed for warm, engaging dialogue and basic reasoning. Part of the IntellIte series, it delivers efficient performance on modest hardware, complete with streaming data loading, episodic memory buffers, and RAG-based knowledge augmentation.


βš™οΈ Key Features

  • Small & Efficient: ~45M parameters, ideal for edge devices and academic projects.
  • Streaming Data: Uses Hugging Face IterableDataset for on-the-fly data without local storage constraints.
  • Memory Buffer: Maintains the last 200 messages for coherent multi-turn conversations (see the sketch after this list).
  • RAG Integration: FAISS-based retrieval for up-to-date knowledge augmentation.
  • Content Safety: Built-in filters to enforce conversational guidelines.
  • Extensible API: Hook into generate_with_plugins() for custom prompts or downstream tasks.
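
For intuition, here is a minimal sketch of a rolling 200-message buffer built on collections.deque. The ChatMemory class and its methods are illustrative stand-ins, not the actual il.py API:

from collections import deque

# Illustrative sketch only; the real buffer in il.py may differ.
class ChatMemory:
    """Keeps the most recent messages as conversational context."""

    def __init__(self, max_messages=200):
        self.buffer = deque(maxlen=max_messages)  # oldest messages drop automatically

    def add(self, role, content):
        self.buffer.append({"role": role, "content": content})

    def as_context(self):
        # Flatten the buffer into a single prompt prefix.
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.buffer)

memory = ChatMemory()
memory.add("user", "Hello, how's it going?")
memory.add("assistant", "Doing well! How can I help?")
print(memory.as_context())

A deque with maxlen gives O(1) appends and automatic eviction, which is all a fixed-size chat history needs.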

πŸ’Ύ Installation

pip install transformers datasets faiss-cpu torch huggingface-hub

πŸš€ Quick Start

from il import generate_with_plugins

# Generate a reply, augmenting the prompt with retrieved passages.
response = generate_with_plugins(
  prompt="Hello, how's it going?",  # user message
  source="wiki",                    # retrieval index to query
  k=3,                              # number of passages to retrieve
  max_new_tokens=100                # cap on generated tokens
)
print(response)
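
The source and k arguments control RAG retrieval. Here is a minimal sketch of FAISS top-k search; the embedding dimensionality, example passages, and random stand-in vectors are all illustrative (a real pipeline would embed text with an encoder):

import numpy as np
import faiss

dim = 384                        # embedding dimensionality (assumed)
index = faiss.IndexFlatL2(dim)   # exact L2 nearest-neighbor search

passages = [
    "Paris is the capital of France.",
    "FAISS performs efficient similarity search.",
    "IntellIte Chat is a ~45M-parameter model.",
]
vecs = np.random.rand(len(passages), dim).astype("float32")  # stand-in embeddings
index.add(vecs)

query = np.random.rand(1, dim).astype("float32")  # stand-in query embedding
distances, ids = index.search(query, 3)           # retrieve the top-3 passages
retrieved = [passages[i] for i in ids[0]]
print(retrieved)

The retrieved passages are then prepended to the prompt before generation.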

πŸ› οΈ Training Pipeline

Run the main training script:

export HF_TOKEN=<your_hf_token>
python il.py --hf_token $HF_TOKEN --seed 42

The script will:

  1. Stream Wikipedia, CodeParrot, and grade-school math datasets (see the streaming sketch after this list).
  2. Apply cosine LR scheduling, weight decay, and label smoothing.
  3. Run simple evaluations (two chat prompts and one code prompt) at the end of each epoch.
  4. Save & push the best model to ProCreations/IntellIte on Hugging Face.
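
A rough illustration of step 1 using the datasets streaming API. The dataset IDs, column handling, and mixing probabilities below are assumptions for illustration, not the exact values in il.py:

from datasets import load_dataset, interleave_datasets

wiki = load_dataset("wikipedia", "20220301.en", split="train", streaming=True)
code = load_dataset("codeparrot/codeparrot-clean", split="train", streaming=True)
math = load_dataset("gsm8k", "main", split="train", streaming=True)

# Normalize each stream to a single "text" column before interleaving.
wiki = wiki.select_columns(["text"])
code = code.rename_column("content", "text").select_columns(["text"])
math = math.map(lambda ex: {"text": ex["question"] + "\n" + ex["answer"]})
math = math.select_columns(["text"])

mixed = interleave_datasets([wiki, code, math], probabilities=[0.6, 0.3, 0.1], seed=42)
print(next(iter(mixed))["text"][:200])  # peek at one streamed example

Streaming keeps memory and disk usage flat regardless of corpus size, at the cost of sequential access.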

πŸ“Š Evaluation & Monitoring

A SimpleEvalCallback runs the designated chat and code prompts at the end of each epoch, logging the outputs for quick sanity checks.
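
A hedged sketch of what such a callback can look like using the standard transformers TrainerCallback hooks; the prompt list and generation settings are illustrative, not the exact il.py implementation:

from transformers import TrainerCallback

class SimpleEvalCallback(TrainerCallback):
    """Generates from fixed prompts at each epoch end for sanity checking."""

    def __init__(self, tokenizer, prompts):
        self.tokenizer = tokenizer
        self.prompts = prompts  # e.g. two chat prompts and one code prompt

    def on_epoch_end(self, args, state, control, model=None, **kwargs):
        model.eval()
        for prompt in self.prompts:
            inputs = self.tokenizer(prompt, return_tensors="pt").to(model.device)
            out = model.generate(**inputs, max_new_tokens=64)
            print(f"[epoch {state.epoch:.0f}] {prompt!r} ->",
                  self.tokenizer.decode(out[0], skip_special_tokens=True))
        model.train()

Pass an instance via Trainer(..., callbacks=[SimpleEvalCallback(tokenizer, prompts)]).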


πŸ”§ Configuration Options

Edit il.py to customize:

  • Batch sizes, learning rate, and scheduler: set via TrainingArguments (see the sketch after this list).
  • Retrieval sources: adjust k and the index sources.
  • Memory buffer: change the buffer size or filter rules.
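
For reference, those knobs map onto standard TrainingArguments fields; the values shown here are placeholders, not the shipped defaults:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="intellite-chat",
    per_device_train_batch_size=16,   # placeholder batch size
    learning_rate=3e-4,               # placeholder learning rate
    lr_scheduler_type="cosine",       # cosine schedule, as in the pipeline
    warmup_ratio=0.03,
    weight_decay=0.01,
    label_smoothing_factor=0.1,
    num_train_epochs=3,
    logging_steps=50,
    push_to_hub=True,                 # push the best model to the Hub
    hub_model_id="ProCreations/IntellIte",
)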

🌱 Fine‑Tuning on Custom Data

  1. Prepare your dataset as a Hugging Face Dataset or IterableDataset.
  2. Interleave it with the base streams and pass the result to the Trainer (see the sketch after this list).
  3. Use --resume_from_checkpoint to continue an interrupted run.
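
A minimal sketch of these steps, assuming your custom data is a JSONL file with a "text" field; the file name, mix weights, and the elided Trainer wiring are illustrative:

from datasets import load_dataset, interleave_datasets
from transformers import Trainer

custom = load_dataset("json", data_files="my_data.jsonl", split="train", streaming=True)
base = load_dataset("wikipedia", "20220301.en", split="train", streaming=True)
base = base.select_columns(["text"])

train_stream = interleave_datasets([custom, base], probabilities=[0.7, 0.3], seed=42)

# Trainer construction elided; reuse the model and args from il.py, e.g.:
# trainer = Trainer(model=model, args=args, train_dataset=train_stream)
# trainer.train(resume_from_checkpoint=True)  # continue an interrupted run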

🀝 Contributing

Contributions welcome! Steps:

  1. Fork the repo.
  2. Create a feature branch.
  3. Submit a PR with clear descriptions and tests.

πŸ“œ License

This project is licensed under the Apache 2.0 License.


❀️ Developed by ProCreations under the IntellIte brand.
