A newer version of the Gradio SDK is available:
5.49.1
โ Voice Library Enhancement Complete
๐ฏ Problem Solved
The Voice Library UI was missing advanced TTS parameters (Min-P, Top-P, Repetition Penalty) that were available in the backend but not exposed to users.
๐ ๏ธ Changes Made
1. Enhanced Voice Profile Storage โ๏ธ
- Updated
save_voice_profile()function to accept and store:- Min-P (default: 0.05) - Minimum probability threshold
- Top-P (default: 1.0) - Nucleus sampling threshold
- Repetition Penalty (default: 1.2) - Token repetition control
- Incremented version to v2.1 for backward compatibility
- Enhanced status messages to show advanced settings
2. Enhanced Voice Profile Loading ๐ฅ
- Updated
load_voice_profile()function to return new parameters - Added backward compatibility - old voice profiles get sensible defaults
- Enhanced status messages to show profile version
3. New Voice Library UI Controls ๐๏ธ
Added "Advanced Voice Parameters" section in Voice Library tab:
๐๏ธ Advanced Voice Parameters
โโโ Min-P (0.01-0.5) - "Minimum probability threshold for token selection (lower = more diverse)"
โโโ Top-P (0.1-1.0) - "Nucleus sampling threshold (lower = more focused)"
โโโ Repetition Penalty (1.0-2.0) - "Penalty for repeating tokens (higher = less repetition)"
4. Enhanced TTS Generation ๐ต
- Updated core
generate()function to accept new parameters - Updated
generate_with_cpu_fallback()function for fallback mode - Updated
generate_with_retry()function for robust generation - All TTS calls now use voice-specific advanced parameters
5. Enhanced Voice Configuration ๐
- Updated
get_voice_config()function to include new parameters - All audiobook generation now uses saved voice settings
- Backward compatibility maintained for existing voices
6. UI Integration ๐
- Save Button: Now includes all 3 new parameters in voice profiles
- Load Button: Populates all UI sliders with saved values
- Test Button: Uses advanced parameters for voice testing
๐ฎ User Experience
Before โ
- Only basic parameters: Exaggeration, CFG/Pace, Temperature
- Advanced TTS controls were hidden and inaccessible
- All voices used default Min-P/Top-P/Rep-Penalty values
After โ
- Full control over TTS generation parameters
- Professional voice tuning with industry-standard controls
- Per-voice customization - each voice can have unique settings
- Backward compatibility - existing voices continue working
- Enhanced voice testing with all parameters
๐ Technical Benefits
Voice Quality Control ๐ญ
- Min-P: Fine-tune creativity vs consistency
- Top-P: Control focus vs diversity in voice generation
- Repetition Penalty: Eliminate unwanted voice repetitions
Professional Workflow ๐ฏ
- Voice artists can now fine-tune voices like professional TTS systems
- Each character voice can have unique personality parameters
- Better control over audiobook consistency and quality
Future-Proof Architecture ๐
- Versioned voice profiles (v2.1) support new features
- Clean parameter passing through all generation functions
- Ready for additional TTS parameters in future updates
๐งช Testing Recommendations
- Create New Voice: Test all advanced parameters
- Load Old Voice: Verify backward compatibility
- Generate Audio: Confirm parameters affect output quality
- Multi-Voice: Test advanced parameters in character dialogue
- Volume + Advanced: Test combined normalization + advanced settings
โจ What Users See Now
When saving a voice, users get confirmation like:
โ
Voice profile 'Deep Male Narrator' saved successfully!
๐ Audio normalized from -12.3 dB to -18.0 dB
๐๏ธ Advanced settings: Min-P=0.03, Top-P=0.9, Rep. Penalty=1.3
When loading a voice profile, version info is shown:
โ
Loaded voice profile: Deep Male Narrator (v2.1)
The Voice Library now provides complete professional-grade TTS control! ๐