Spaces: zaiffi/Mehfil-e-Sukhan

zaiffi committed on Mar 16

Commit ce55859 (1 parent: ffbd65c)

Add comprehensive README with app configuration and usage instructions

Files changed (1)

README.md +114 -1

@@ -1 +1,114 @@
-# Mehfil-e-Sukhan
+---
+title: Mehfil-e-Sukhan
+emoji: 📜
+colorFrom: "#E64A4A"
+colorTo: "#1C1C1C"
+sdk: streamlit
+sdk_version: "1.43.0"
+app_file: app.py
+pinned: false
+---
+# Mehfil-e-Sukhan: Har Lafz Ek Mehfil
+An AI-powered Roman Urdu poetry generation application using BiLSTM neural networks.
+## Overview
+Mehfil-e-Sukhan ("Poetry Gathering" in Urdu) is an interactive application that generates Roman Urdu poetry based on a starting word or phrase provided by the user. The application uses a Bidirectional LSTM neural network trained on a curated dataset of Roman Urdu poetry.
+## Features
+- **Custom Poetry Generation**: Generate Roman Urdu poetry from any starting word or phrase.
+- **Adjustable Parameters**:
+  - **Number of Words**: Control the length of generated poetry (12-48 words).
+  - **Creativity (Temperature)**: Adjust the randomness in word selection (0.5-2.0).
+  - **Focus (Top-p)**: Restrict sampling to the most probable words whose cumulative probability reaches this value (0.5-1.0).
+- **Elegant Interface**: Dark-themed UI designed specifically for poetry presentation.
+- **Automatic Formatting**: Output is automatically formatted into poetic lines.
+## How to Use
+1. Enter a starting word or phrase in Roman Urdu (e.g., "ishq", "zindagi", "mohabbat").
+2. Adjust the generation parameters:
+   - Number of Words: Select how many words you want in your poem.
+   - Creativity: Higher values (>1.0) produce more varied but potentially less coherent poetry. Lower values (<1.0) produce more predictable output.
+   - Focus: Lower values keep the model on the most probable word combinations. Values closer to 1.0 allow a wider range of word choices.
+3. Click "Generate Poetry" and wait for your custom poem to appear.
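
To make the effect of the Creativity setting concrete, here is a minimal sketch of temperature scaling in plain Python. It is an illustration only, not the code in `app.py`; the function name `softmax_with_temperature` and the example scores are assumptions made for this sketch.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate words (not taken from the model)
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring word, while higher temperatures flatten the distribution so that less likely words are picked more often.
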
+## Technical Details
+- **Model**: Bidirectional LSTM with 3 layers
+- **Tokenization**: SentencePiece with BPE encoding
+- **Vocabulary Size**: 12,000 tokens
+- **Text Generation**: Nucleus (top-p) sampling for balanced creativity and coherence
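
The nucleus (top-p) sampling step can be sketched in plain Python as follows. This is a hedged illustration of the general technique, not the application's actual implementation; the helper names `top_p_filter` and `sample_top_p` and the example probabilities are hypothetical.

```python
import random

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token indices from most to least probable
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the surviving probabilities so they sum to 1
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def sample_top_p(probs, top_p=0.9, rng=random):
    """Draw one token index from the filtered (nucleus) distribution."""
    nucleus = top_p_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

# Hypothetical next-token probabilities for a four-token vocabulary
probs = [0.5, 0.3, 0.15, 0.05]
print(sorted(top_p_filter(probs, top_p=0.5)))  # [0]
print(sorted(top_p_filter(probs, top_p=0.9)))  # [0, 1, 2]
print(sample_top_p(probs, top_p=0.9))
```

A smaller `top_p` shrinks the candidate set to the most probable tokens, while a value of 1.0 leaves the full distribution in play.
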
+## Installation for Local Development
+To run the application locally:
+```bash
+# Clone the repository
+git clone https://github.com/yourusername/Mehfil-e-Sukhan.git
+cd Mehfil-e-Sukhan
+# Create and activate a virtual environment (optional but recommended)
+python -m venv venv
+source venv/bin/activate  # On Linux/macOS
+# On Windows, run this instead:
+# venv\Scripts\activate
+# Install dependencies
+pip install -r requirements.txt
+# Run the application
+streamlit run app.py
+```
+## Requirements
+- Python 3.8+
+- torch==2.6.0
+- sentencepiece==0.2.0
+- huggingface-hub==0.29.3
+- streamlit==1.43.0
+## Project Structure
+```
+Mehfil-e-Sukhan/
+├── app.py              # Main application file
+├── requirements.txt    # Python dependencies
+└── README.md           # This documentation
+```
+The model weights and SentencePiece model are stored on Hugging Face Hub and are downloaded automatically when the application runs.
+## How It Works
+1. **Data Processing**: The model was trained on a curated dataset of Roman Urdu poetry lines.
+2. **Tokenization**: Text was tokenized using SentencePiece's BPE algorithm.
+3. **Model Training**: A Bidirectional LSTM architecture was trained to predict the next token in a sequence.
+4. **Text Generation**: At inference time, nucleus sampling is used to select the next token, balancing creativity and coherence.
+5. **Formatting**: Generated text is automatically formatted into lines with alternating indentation for aesthetic presentation.
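
The formatting step (step 5) might look roughly like the sketch below. The README does not state how many words go on each line or how wide the indentation is, so `words_per_line`, `indent`, and the function name `format_poem` are assumptions chosen for illustration.

```python
def format_poem(text, words_per_line=6, indent="    "):
    """Split generated text into lines, indenting every second line."""
    words = text.split()
    lines = []
    for start in range(0, len(words), words_per_line):
        line = " ".join(words[start:start + words_per_line])
        # Odd-numbered lines (counting from 0) get the indentation
        if (start // words_per_line) % 2 == 1:
            line = indent + line
        lines.append(line)
    return "\n".join(lines)

# Placeholder input built from the example words in this README
print(format_poem("ishq zindagi mohabbat " * 4, words_per_line=4))
```
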
+## Limitations
+- The current model was trained on a relatively small dataset (~1,300 lines), so its output may occasionally repeat patterns.
+- Roman Urdu is not standardized, so the model may struggle with unusual spellings or transliterations.
+- Generation speed depends on available computational resources.
+## License
+This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
+## Contact
+- LinkedIn: [Muhammad Huzaifa Saqib](https://www.linkedin.com/in/muhammad-huzaifa-saqib-90a1a9324/)
+- GitHub: [zaiffishiekh01](https://github.com/zaiffishiekh01)
+- Email: [[email protected]](mailto:[email protected])
+## Acknowledgements
+- "Poetry is the rhythmical creation of beauty in words" - Edgar Allan Poe