VoxPolska GGUF: Next-Gen Polish Voice Generation

πŸ“Œ Model Highlights

  • Context-Aware Voice: Generates speech that captures the nuances and tone of the Polish language.
  • Realistic Speech Output: Produces fluent, expressive speech with natural intonation, suitable for a wide range of use cases.
  • Advanced Speech Synthesis: Delivers human-like voice output tailored to Polish.
  • Optimized for GGUF: Packaged in the GGUF format for fast, efficient local inference.
  • Advanced Deep Learning: Built on modern deep learning techniques for strong performance across applications.

πŸ”§ Technical Details

  • Base Model: Orpheus TTS
  • LoRA (Low-Rank Adaptation): Fine-tuning applied for enhanced performance and efficiency.
  • Sample Rate: 24 kHz audio output ensuring high-fidelity sound.
  • Training Data: Trained on 24,000+ Polish transcript-audio pairs, ensuring natural speech generation.
  • Quantization: Merged 16-bit quantization for balanced performance and memory efficiency.
  • Audio Decoding: Custom layer-wise processing for high-quality audio generation.
  • Repetition Penalty: Set to 1.1 to avoid repetitive phrases and enhance speech naturalness.
  • Gradient Checkpointing: Enabled during fine-tuning for efficient memory usage on constrained hardware.
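The decoding settings above (repetition penalty 1.1, 24 kHz output) match the defaults exposed by the gguf_orpheus.py flags shown later in this card. A minimal sketch of how they might be collected in code (the dict and helper names are illustrative, not part of the repository):

```python
# Illustrative generation defaults for VoxPolska; the keys mirror the
# gguf_orpheus.py flags, but this dict itself is a sketch.
GENERATION_DEFAULTS = {
    "sample_rate": 24_000,        # 24 kHz audio output
    "temperature": 0.6,           # sampling temperature
    "top_p": 0.9,                 # nucleus sampling cutoff
    "repetition_penalty": 1.1,    # discourages repeated phrases
}

def with_overrides(**overrides):
    """Return a copy of the defaults with user overrides applied."""
    params = dict(GENERATION_DEFAULTS)
    params.update(overrides)
    return params
```

For example, `with_overrides(temperature=0.8)` keeps every other default while raising the sampling temperature.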

🎧 Example Usage

  1. Using with LM Studio

    You need to have Python 3.8+ installed on your computer.

    Steps

    1. Install and launch LM Studio
    2. Download the GGUF file
      • Download the 4-bit version
      • Download the 5-bit version
      • Download the 8-bit version
      • If you would like to load the model from LM Studio directly, type salihfurkaan/voxpolska-v1-gguf and choose your preferred version.
    3. Load the GGUF file
      • Skip this step if you loaded the model from LM Studio directly. Otherwise, follow these steps:
      • Click on "My Models". You will see the models directory; go to that path.
      • In the "models" folder, create a new folder named "salihfurkaan" and open it.
      • In "salihfurkaan", create a new folder named "VoxPolska-V1-GGUF".
      • Place your GGUF file(s) in "VoxPolska-V1-GGUF".
    4. Start the local server
      • Open the "Developer" tab in the LM Studio sidebar.
      • Press CTRL + L and load the model.
      • Press CTRL + R to start the local server.
    5. Clone orpheus-tts-local repository and install the dependencies
      git clone https://github.com/isaiahbjork/orpheus-tts-local.git
      cd orpheus-tts-local
      python3 -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
      pip install -r requirements.txt
      
    6. Set your Hugging Face token in the orpheus-tts-local scripts
      • Add the code below to the Python file and save it:
        import os
        os.environ["HF_TOKEN"] = "your huggingface token here"
        
      • You can create a token in your Hugging Face account settings (Settings → Access Tokens).
    7. Run the model
      • Run the bash command below:
        python gguf_orpheus.py --text "Your Polish text here" --output output.wav
        

    You can find output.wav in the orpheus-tts-local folder.
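Since the model outputs 24 kHz audio, you can sanity-check a generated file with Python's standard wave module. A small helper sketch:

```python
import wave

def wav_sample_rate(path):
    """Return the sample rate (Hz) of a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()

# For VoxPolska output you would expect:
# wav_sample_rate("output.wav") == 24000
```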

    Available Flags

  • --text: The text to convert to speech (required)
  • --voice: The voice to use (default is "tara")
  • --output: Output WAV file path (default: auto-generated filename)
  • --temperature: Temperature for generation (default: 0.6)
  • --top_p: Top-p sampling parameter (default: 0.9)
  • --repetition_penalty: Repetition penalty (default: 1.1)
  • --backend: Specify the backend (default: "lmstudio", also supports "ollama")
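Putting the flags together, here is a hedged Python helper that assembles a gguf_orpheus.py invocation. The flag names and defaults come from the list above; the helper function itself is illustrative, not part of the repository:

```python
def build_tts_command(text, voice="tara", output=None,
                      temperature=0.6, top_p=0.9,
                      repetition_penalty=1.1, backend="lmstudio"):
    """Assemble the argument list for a gguf_orpheus.py run."""
    cmd = ["python", "gguf_orpheus.py", "--text", text,
           "--voice", voice,
           "--temperature", str(temperature),
           "--top_p", str(top_p),
           "--repetition_penalty", str(repetition_penalty),
           "--backend", backend]
    if output is not None:
        cmd += ["--output", output]
    return cmd

# e.g. subprocess.run(build_tts_command("Dzień dobry!", output="output.wav"))
```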
  2. Using llama.cpp
    You need to have CMake installed on your computer.

    1. Install llama.cpp:
    • Use the command below to install and build llama.cpp
      git clone https://github.com/ggerganov/llama.cpp
      cd llama.cpp
      cmake -B build
      cmake --build build --config Release
    
    2. Download the GGUF file
    3. Start the server
      • Use the command below
        ./build/bin/llama-server -m path/to/gguf/file --port 8080
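Before pointing orpheus-tts-local at the server, you can verify it is reachable; the llama.cpp server exposes a /health endpoint. A small probe sketch, assuming the default port used above:

```python
import urllib.request
import urllib.error

def server_healthy(base_url="http://127.0.0.1:8080", timeout=2.0):
    """Return True if the llama.cpp server answers its /health endpoint."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```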
        
    4. Clone orpheus-tts-local repository and install the dependencies
      git clone https://github.com/isaiahbjork/orpheus-tts-local.git
      cd orpheus-tts-local
      python3 -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
      pip install -r requirements.txt
      
    5. Set your Hugging Face token in the orpheus-tts-local scripts
      • Add the code below to the Python file and save it:
        import os
        os.environ["HF_TOKEN"] = "your huggingface token here"
        
      • You can create a token in your Hugging Face account settings (Settings → Access Tokens).
    6. Run the model
      • Run the bash command below:
        python gguf_orpheus.py --text "Your Polish text here" --output output.wav
        

    You can find output.wav in the orpheus-tts-local folder.

πŸ“« Contact and Support

For questions, suggestions, and feedback, please open an issue on Hugging Face. You can also reach out via LinkedIn.

Model Misuse

Do not use this model for impersonation without consent, misinformation or deception (including fake news or fraudulent calls), or any illegal or harmful activity. By using this model, you agree to follow all applicable laws and ethical guidelines.

Citation

@misc{voxpolska2025,
  title={salihfurkaan/VoxPolska-V1-GGUF},
  author={Salih Furkan Erik},
  year={2025},
  url={https://huggingface.co/salihfurkaan/VoxPolska-GGUF/}
}
Model Details

Model size: 3.3B parameters. Architecture: llama. Available quantizations: 4-bit, 5-bit, 8-bit.