Spaces: AnhP / RVC-GUI


AnhP committed 1 day ago

Commit 1e4a2ab (verified) · 1 Parent(s): 34528cb

Upload 170 files

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50)

LICENSE +21 -0
README.md +470 -0
assets/binary/decrypt.bin +3 -0
assets/binary/world.bin +3 -0
assets/f0/.gitattributes +0 -0
assets/ico.png +3 -0
assets/languages/en-US.json +663 -0
assets/languages/vi-VN.json +663 -0
assets/logs/mute/energy/mute.wav.npy +3 -0
assets/logs/mute/f0/mute.wav.npy +3 -0
assets/logs/mute/f0_voiced/mute.wav.npy +3 -0
assets/logs/mute/sliced_audios/mute32000.wav +3 -0
assets/logs/mute/sliced_audios/mute40000.wav +3 -0
assets/logs/mute/sliced_audios/mute48000.wav +3 -0
assets/logs/mute/sliced_audios_16k/mute.wav +0 -0
assets/logs/mute/v1_extracted/mute.npy +3 -0
assets/logs/mute/v1_extracted/mute_chinese.npy +3 -0
assets/logs/mute/v1_extracted/mute_japanese.npy +3 -0
assets/logs/mute/v1_extracted/mute_korean.npy +3 -0
assets/logs/mute/v1_extracted/mute_portuguese.npy +3 -0
assets/logs/mute/v1_extracted/mute_spin.npy +3 -0
assets/logs/mute/v1_extracted/mute_vietnamese.npy +3 -0
assets/logs/mute/v2_extracted/mute.npy +3 -0
assets/logs/mute/v2_extracted/mute_chinese.npy +3 -0
assets/logs/mute/v2_extracted/mute_japanese.npy +3 -0
assets/logs/mute/v2_extracted/mute_korean.npy +3 -0
assets/logs/mute/v2_extracted/mute_portuguese.npy +3 -0
assets/logs/mute/v2_extracted/mute_spin.npy +3 -0
assets/logs/mute/v2_extracted/mute_vietnamese.npy +3 -0
assets/models/embedders/.gitattributes +0 -0
assets/models/predictors/.gitattributes +0 -0
assets/models/pretrained_custom/.gitattributes +0 -0
assets/models/pretrained_v1/.gitattributes +0 -0
assets/models/pretrained_v2/.gitattributes +0 -0
assets/models/speaker_diarization/assets/gpt2.tiktoken +0 -0
assets/models/speaker_diarization/assets/mel_filters.npz +3 -0
assets/models/speaker_diarization/assets/multilingual.tiktoken +0 -0
assets/models/speaker_diarization/models/.gitattributes +0 -0
assets/models/uvr5/.gitattributes +0 -0
assets/presets/.gitattributes +0 -0
assets/weights/.gitattributes +0 -0
audios/.gitattributes +0 -0
dataset/.gitattributes +0 -0
main/app/app.py +87 -0
main/app/core/downloads.py +187 -0
main/app/core/editing.py +96 -0
main/app/core/f0_extract.py +54 -0
main/app/core/inference.py +387 -0
main/app/core/model_utils.py +162 -0
main/app/core/presets.py +165 -0

LICENSE ADDED

	@@ -0,0 +1,21 @@

+MIT License
+Copyright (c) 2025 Phạm Huỳnh Anh
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md ADDED

	@@ -0,0 +1,470 @@

+<div align="center">
+<img alt="LOGO" src="assets/ico.png" width="300" height="300" />
+# Vietnamese RVC BY ANH
+A simple, high-quality, high-performance voice conversion tool.
+[![Vietnamese RVC](https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/PhamHuynhAnh16/Vietnamese-RVC)
+[![Open In Colab](https://img.shields.io/badge/Colab-F9AB00?style=for-the-badge&logo=googlecolab&color=525252)](https://colab.research.google.com/github/PhamHuynhAnh16/Vietnamese-RVC-ipynb/blob/main/Vietnamese-RVC.ipynb)
+[![Licence](https://img.shields.io/badge/LICENSE-MIT-green?style=for-the-badge)](https://github.com/PhamHuynhAnh16/Vietnamese-RVC/blob/main/LICENSE)
+</div>
+<div align="center">
+[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/AnhP/RVC-GUI)
+[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Models-blue)](https://huggingface.co/AnhP/Vietnamese-RVC-Project)
+</div>
+# Description
+This project is a simple, easy-to-use voice conversion tool. It aims to deliver high-quality voice conversion with optimized performance, letting users change voices smoothly and naturally.
+# Project features
+- Music separation (MDX-Net/Demucs)
+- Voice conversion (file conversion / batch conversion / conversion with Whisper / text-to-speech conversion)
+- Apply effects to audio
+- Create training data (from URLs)
+- Train models (v1/v2, high-quality vocoders, energy training)
+- Model fusion
+- Read model information
+- Export models to ONNX
+- Download from the available model repository
+- Search for models on the web
+- Pitch extraction
+- Audio conversion inference with ONNX models
+- ONNX RVC models also support index files for inference
+**Pitch extraction methods: `pm-ac, pm-cc, pm-shs, dio, mangio-crepe-tiny, mangio-crepe-small, mangio-crepe-medium, mangio-crepe-large, mangio-crepe-full, crepe-tiny, crepe-small, crepe-medium, crepe-large, crepe-full, fcpe, fcpe-legacy, rmvpe, rmvpe-legacy, harvest, yin, pyin, swipe, piptrack, fcn`**
+**Embedding extraction models: `contentvec_base, hubert_base, vietnamese_hubert_base, japanese_hubert_base, korean_hubert_base, chinese_hubert_base, portuguese_hubert_base, spin`**
+- **All pitch extraction models have an ONNX-accelerated version, except methods that run through a wrapper.**
+- **Pitch extraction methods can be combined with each other to produce a different result, for example: `hybrid[rmvpe+harvest]`.**
+- **The embedding extraction models support the following embedding modes: fairseq, onnx, transformers, spin.**
+# Usage guide
+**Will be added if I ever have the free time...**
+# Installation
+Step 1: Install the required prerequisites
+- Install Python from the official site: **[PYTHON](https://www.python.org/ftp/python/3.11.8/python-3.11.8-amd64.exe)** (The project has been tested on Python 3.10.x and 3.11.x)
+- Install FFmpeg and add it to the system PATH: **[FFMPEG](https://github.com/BtbN/FFmpeg-Builds/releases)**
+Step 2: Install the project (use Git or simply download it from GitHub)
+Using Git:
+- git clone https://github.com/PhamHuynhAnh16/Vietnamese-RVC.git
+- cd Vietnamese-RVC
+Installing from GitHub:
+- Go to https://github.com/PhamHuynhAnh16/Vietnamese-RVC
+- Click the green `<> Code` button and choose `Download ZIP`
+- Extract `Vietnamese-RVC-main.zip`
+- Open the Vietnamese-RVC-main folder, click the path bar, type `cmd`, and press Enter
+Step 3: Install the required libraries:
+Run the commands:
+```
+python -m venv env
+env\\Scripts\\activate
+```
+For CPU:
+```
+python -m pip install -r requirements.txt
+```
+For CUDA (you can replace cu118 with the newer cu128 build if your GPU supports it):
+```
+python -m pip install torch torchaudio torchvision --index-url https://download.pytorch.org/whl/cu118
+python -m pip install -r requirements.txt
+```
+For AMD:
+```
+python -m pip install torch==2.6.0 torchaudio==2.6.0 torchvision
+python -m pip install https://github.com/artyom-beilis/pytorch_dlprim/releases/download/0.2.0/pytorch_ocl-0.2.0+torch2.6-cp311-none-win_amd64.whl
+python -m pip install onnxruntime-directml
+python -m pip install -r requirements.txt
+```
+Notes for AMD:
+- Only install the AMD setup on Python 3.11, because DLPRIM has no build for Python 3.10.
+- Demucs can overload the GPU and run out of memory (if you need Demucs, open config.json in main\configs and set the demucs_cpu_mode option to true).
+- DDP does not support multi-GPU training with OpenCL (AMD).
+- Some algorithms must run on the CPU, so the GPU may not be fully utilized.
+# Usage
+**Using Google Colab**
+- Open Google Colab: [Vietnamese-RVC](https://colab.research.google.com/github/PhamHuynhAnh16/Vietnamese-RVC-ipynb/blob/main/Vietnamese-RVC.ipynb)
+- Step 1: Run the Installation cell and wait for it to finish.
+- Step 2: Run the Open interface cell (it prints two links: one is 0.0.0.0.7680 and the other is a clickable gradio link; choose the clickable link and it takes you to the interface).
+**Run the run_app file to open the interface, or the tensorboard file to open the training charts. (Note: do not close the Command Prompt or Terminal)**
+```
+run_app.bat / tensorboard.bat
+```
+**Launch the interface. (Add `--allow_all_disk` to the command to let gradio access external files)**
+```
+env\\Scripts\\python.exe main\\app\\app.py --open
+```
+**If you use TensorBoard to monitor training**
+```
+env\\Scripts\\python.exe main/app/run_tensorboard.py
+```
+**Command-line usage**
+```
+python main\\app\\parser.py --help
+```
+# Simple installation and usage
+**Install a release build from [Vietnamese_RVC](https://github.com/PhamHuynhAnh16/Vietnamese-RVC/releases)**
+- Choose the build that suits your machine and download it.
+- Extract the project.
+- Run run_app.bat to open the interface.
+# Main source code structure:
+<pre>
+Vietnamese-RVC-main
+├── assets
+│   ├── binary
+│   │   ├── decrypt.bin
+│   │   └── world.bin
+│   ├── f0
+│   ├── languages
+│   │   ├── en-US.json
+│   │   └── vi-VN.json
+│   ├── logs
+│   │   └── mute
+│   │       ├── energy
+│   │       │   └── mute.wav.npy
+│   │       ├── f0
+│   │       │   └── mute.wav.npy
+│   │       ├── f0_voiced
+│   │       │   └── mute.wav.npy
+│   │       ├── sliced_audios
+│   │       │   ├── mute32000.wav
+│   │       │   ├── mute40000.wav
+│   │       │   └── mute48000.wav
+│   │       ├── sliced_audios_16k
+│   │       │   └── mute.wav
+│   │       ├── v1_extracted
+│   │       │   ├── mute.npy
+│   │       │   ├── mute_chinese.npy
+│   │       │   ├── mute_japanese.npy
+│   │       │   ├── mute_korean.npy
+│   │       │   ├── mute_portuguese.npy
+│   │       │   ├── mute_vietnamese.npy
+│   │       │   └── mute_spin.npy
+│   │       └── v2_extracted
+│   │           ├── mute.npy
+│   │           ├── mute_chinese.npy
+│   │           ├── mute_japanese.npy
+│   │           ├── mute_korean.npy
+│   │           ├── mute_portuguese.npy
+│   │           ├── mute_vietnamese.npy
+│   │           └── mute_spin.npy
+│   ├── models
+│   │   ├── embedders
+│   │   ├── predictors
+│   │   ├── pretrained_custom
+│   │   ├── pretrained_v1
+│   │   ├── pretrained_v2
+│   │   ├── speaker_diarization
+│   │   │   ├── assets
+│   │   │   │   ├── gpt2.tiktoken
+│   │   │   │   ├── mel_filters.npz
+│   │   │   │   └── multilingual.tiktoken
+│   │   │   └── models
+│   │   └── uvr5
+│   ├── presets
+│   ├── weights
+│   └── ico.png
+├── audios
+├── dataset
+├── main
+│   ├── app
+│   │   ├── core
+│   │   │   ├── downloads.py
+│   │   │   ├── editing.py
+│   │   │   ├── f0_extract.py
+│   │   │   ├── inference.py
+│   │   │   ├── model_utils.py
+│   │   │   ├── presets.py
+│   │   │   ├── process.py
+│   │   │   ├── restart.py
+│   │   │   ├── separate.py
+│   │   │   ├── training.py
+│   │   │   ├── tts.py
+│   │   │   ├── ui.py
+│   │   │   └── utils.py
+│   │   ├── tabs
+│   │   │   ├── downloads
+│   │   │   │   └── downloads.py
+│   │   │   ├── editing
+│   │   │   │   ├── editing.py
+│   │   │   │   └── child
+│   │   │   │       ├── audio_effects.py
+│   │   │   │       └── quirk.py
+│   │   │   ├── extra
+│   │   │   │   ├── extra.py
+│   │   │   │   └── child
+│   │   │   │       ├── convert_model.py
+│   │   │   │       ├── f0_extract.py
+│   │   │   │       ├── fushion.py
+│   │   │   │       ├── read_model.py
+│   │   │   │       ├── report_bugs.py
+│   │   │   │       └── settings.py
+│   │   │   ├── inference
+│   │   │   │   ├── inference.py
+│   │   │   │   └── child
+│   │   │   │       ├── convert.py
+│   │   │   │       ├── convert_tts.py
+│   │   │   │       ├── convert_with_whisper.py
+│   │   │   │       └── separate.py
+│   │   │   └── training
+│   │   │       ├── training.py
+│   │   │       └── child
+│   │   │           ├── create_dataset.py
+│   │   │           └── training.py
+│   │   ├── app.py
+│   │   ├── parser.py
+│   │   ├── run_tensorboard.py
+│   │   └── variables.py
+│   ├── configs
+│   │   ├── config.json
+│   │   ├── config.py
+│   │   ├── v1
+│   │   │   ├── 32000.json
+│   │   │   ├── 40000.json
+│   │   │   └── 48000.json
+│   │   └── v2
+│   │       ├── 32000.json
+│   │       ├── 40000.json
+│   │       └── 48000.json
+│   ├── inference
+│   │   ├── audio_effects.py
+│   │   ├── create_dataset.py
+│   │   ├── create_index.py
+│   │   ├── separator_music.py
+│   │   ├── extracting
+│   │   │   ├── embedding.py
+│   │   │   ├── extract.py
+│   │   │   ├── feature.py
+│   │   │   ├── preparing_files.py
+│   │   │   ├── rms.py
+│   │   │   └── setup_path.py
+│   │   ├── training
+│   │   │   ├── train.py
+│   │   │   ├── data_utils.py
+│   │   │   ├── losses.py
+│   │   │   ├── mel_processing.py
+│   │   │   └── utils.py
+│   │   ├── conversion
+│   │   │   ├── convert.py
+│   │   │   ├── pipeline.py
+│   │   │   └── utils.py
+│   │   └── preprocess
+│   │       ├── preprocess.py
+│   │       └── slicer2.py
+│   ├── library
+│   │   ├── utils.py
+│   │   ├── opencl.py
+│   │   ├── algorithm
+│   │   │   ├── attentions.py
+│   │   │   ├── commons.py
+│   │   │   ├── discriminators.py
+│   │   │   ├── encoders.py
+│   │   │   ├── modules.py
+│   │   │   ├── normalization.py
+│   │   │   ├── onnx_export.py
+│   │   │   ├── residuals.py
+│   │   │   ├── stftpitchshift.py
+│   │   │   └── synthesizers.py
+│   │   ├── architectures
+│   │   │   ├── demucs_separator.py
+│   │   │   ├── fairseq.py
+│   │   │   └── mdx_separator.py
+│   │   ├── generators
+│   │   │   ├── hifigan.py
+│   │   │   ├── mrf_hifigan.py
+│   │   │   ├── nsf_hifigan.py
+│   │   │   └── refinegan.py
+│   │   ├── predictors
+│   │   │   ├── CREPE
+│   │   │   │   ├── CREPE.py
+│   │   │   │   ├── filter.py
+│   │   │   │   └── model.py
+│   │   │   ├── FCN
+│   │   │   │   ├── FCN.py
+│   │   │   │   ├── convert.py
+│   │   │   │   └── model.py
+│   │   │   ├── FCPE
+│   │   │   │   ├── attentions.py
+│   │   │   │   ├── encoder.py
+│   │   │   │   ├── FCPE.py
+│   │   │   │   ├── stft.py
+│   │   │   │   ├── utils.py
+│   │   │   │   └── wav2mel.py
+│   │   │   ├── RMVPE
+│   │   │   │   ├── RMVPE.py
+│   │   │   │   ├── deepunet.py
+│   │   │   │   ├── e2e.py
+│   │   │   │   └── mel.py
+│   │   │   ├── WORLD
+│   │   │   │   ├── WORLD.py
+│   │   │   │   └── SWIPE.py
+│   │   │   └── Generator.py
+│   │   ├── speaker_diarization
+│   │   │   ├── audio.py
+│   │   │   ├── ECAPA_TDNN.py
+│   │   │   ├── embedding.py
+│   │   │   ├── encoder.py
+│   │   │   ├── features.py
+│   │   │   ├── parameter_transfer.py
+│   │   │   ├── segment.py
+│   │   │   ├── speechbrain.py
+│   │   │   └── whisper.py
+│   │   └── uvr5_lib
+│   │       ├── common_separator.py
+│   │       ├── separator.py
+│   │       ├── spec_utils.py
+│   │       └── demucs
+│   │           ├── apply.py
+│   │           ├── demucs.py
+│   │           ├── hdemucs.py
+│   │           ├── htdemucs.py
+│   │           ├── states.py
+│   │           └── utils.py
+│   └── tools
+│       ├── gdown.py
+│       ├── huggingface.py
+│       ├── mediafire.py
+│       ├── meganz.py
+│       ├── noisereduce.py
+│       └── pixeldrain.py
+├── docker-compose-amd.yaml
+├── docker-compose-cpu.yaml
+├── docker-compose-cuda118.yaml
+├── docker-compose-cuda128.yaml
+├── Dockerfile
+├── Dockerfile.amd
+├── Dockerfile.cuda118
+├── Dockerfile.cuda128
+├── LICENSE
+├── README.md
+├── requirements.txt
+├── run_app.bat
+└── tensorboard.bat
+</pre>
+# NOTES
+- **Newer vocoders such as MRF HIFIGAN do not yet have a full set of pretrained models**
+- **The MRF HIFIGAN and REFINEGAN vocoders do not support training without pitch training**
+- **The models in the Vietnamese-RVC repository were collected from AI Hub, HuggingFace, and other repositories. They may carry different licenses**
+# Disclaimer
+- **The Vietnamese-RVC project was developed for research, learning, and personal entertainment. I neither encourage nor take responsibility for any misuse of voice conversion technology for fraud, identity impersonation, or violations of the privacy or copyright of any individual or organization.**
+- **Users are solely responsible for how they use this software and agree to comply with the applicable laws of the country where they live or operate.**
+- **Using the voice of a celebrity, a real person, or a public figure requires permission, or must otherwise not violate the law, ethics, or the rights of the parties involved.**
+- **The author of this project bears no legal liability for any consequences arising from the use of this software.**
+# Terms of use
+- You must ensure that the audio content you upload and convert with this project does not infringe the intellectual property rights of third parties.
+- You may not use this project for any illegal activity, including but not limited to fraud, harassment, or causing harm to others.
+- You are fully responsible for any damage arising from improper use of the product.
+- I will not be responsible for any direct or indirect damage arising from the use of this project.
+# This project is built on the following projects
+|                                                              Work                                                              |         Author          |   License   |
+|--------------------------------------------------------------------------------------------------------------------------------|-------------------------|-------------|
+| **[Applio](https://github.com/IAHispano/Applio/tree/main)**                                                                    | IAHispano               | MIT License |
+| **[Python-audio-separator](https://github.com/nomadkaraoke/python-audio-separator/tree/main)**                                 | Nomad Karaoke           | MIT License |
+| **[Retrieval-based-Voice-Conversion-WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/tree/main)**  | RVC Project             | MIT License |
+| **[RVC-ONNX-INFER-BY-Anh](https://github.com/PhamHuynhAnh16/RVC_Onnx_Infer)**                                                  | Phạm Huỳnh Anh          | MIT License |
+| **[Torch-Onnx-Crepe-By-Anh](https://github.com/PhamHuynhAnh16/TORCH-ONNX-CREPE)**                                              | Phạm Huỳnh Anh          | MIT License |
+| **[Hubert-No-Fairseq](https://github.com/PhamHuynhAnh16/hubert-no-fairseq)**                                                   | Phạm Huỳnh Anh          | MIT License |
+| **[Local-attention](https://github.com/lucidrains/local-attention)**                                                           | Phil Wang               | MIT License |
+| **[TorchFcpe](https://github.com/CNChTu/FCPE/tree/main)**                                                                      | CN_ChiTu                | MIT License |
+| **[FcpeONNX](https://github.com/deiteris/voice-changer/blob/master-custom/server/utils/fcpe_onnx.py)**                         | Yury                    | MIT License |
+| **[ContentVec](https://github.com/auspicious3000/contentvec)**                                                                 | Kaizhi Qian             | MIT License |
+| **[Mediafiredl](https://github.com/Gann4Life/mediafiredl)**                                                                    | Santiago Ariel Mansilla | MIT License |
+| **[Noisereduce](https://github.com/timsainb/noisereduce)**                                                                     | Tim Sainburg            | MIT License |
+| **[World.py-By-Anh](https://github.com/PhamHuynhAnh16/world.py)**                                                              | Phạm Huỳnh Anh          | MIT License |
+| **[Mega.py](https://github.com/3v1n0/mega.py)**                                                                                | Marco Trevisan          | No License  |
+| **[Gdown](https://github.com/wkentaro/gdown)**                                                                                 | Kentaro Wada            | MIT License |
+| **[Whisper](https://github.com/openai/whisper)**                                                                               | OpenAI                  | MIT License |
+| **[PyannoteAudio](https://github.com/pyannote/pyannote-audio)**                                                                | pyannote                | MIT License |
+| **[AudioEditingCode](https://github.com/HilaManor/AudioEditingCode)**                                                          | Hila Manor              | MIT License |
+| **[StftPitchShift](https://github.com/jurihock/stftPitchShift)**                                                               | Jürgen Hock             | MIT License |
+| **[Codename-RVC-Fork-3](https://github.com/codename0og/codename-rvc-fork-3)**                                                  | Codename;0              | MIT License |
+| **[Penn](https://github.com/interactiveaudiolab/penn)**                                                                        | Interactive Audio Lab   | MIT License |
+# Model repository used by the model search tool
+- **[VOICE-MODELS.COM](https://voice-models.com/)**
+# F0 extraction methods in RVC
+This document details the pitch extraction methods used, with the advantages, limitations, power, and reliability of each method based on personal experience.
+| Method             |      Type      |        Advantages         |             Limitations             |       Power        |    Reliability     |
+|--------------------|----------------|---------------------------|-------------------------------------|--------------------|--------------------|
+| pm                 | Praat          | Fast                      | Less accurate                       | Low                | Low                |
+| dio                | PYWORLD        | Well suited to rap        | Less accurate at high frequencies   | Medium             | Medium             |
+| harvest            | PYWORLD        | More accurate than DIO    | Slower processing                   | High               | Very high          |
+| crepe              | Deep Learning  | High accuracy             | Requires a GPU                      | Very high          | Very high          |
+| mangio-crepe       | crepe nofilter | Optimized for RVC         | Sometimes worse than original crepe | Medium to high     | Medium to high     |
+| fcpe               | Deep Learning  | Accurate, real-time       | Needs a powerful GPU                | Fairly high        | Medium             |
+| fcpe-legacy        | Old            | Accurate, real-time       | Older                               | Fairly high        | Medium             |
+| rmvpe              | Deep Learning  | Effective for singing     | Resource-intensive                  | Very high          | Excellent          |
+| rmvpe-legacy       | Old            | Computes with Fmin-max    | Older                               | High               | Fairly high        |
+| yin                | Librosa        | Simple, efficient         | Prone to octave errors              | Medium             | Low                |
+| pyin               | Librosa        | More stable than YIN      | More complex computation            | Fairly high        | Fairly high        |
+| swipe              | WORLD          | High accuracy             | Sensitive to noise                  | High               | Fairly high        |
+| piptrack           | Librosa        | Fast                      | Less accurate                       | Low                | Low                |
+| fcn                | Deep Learning  | Unknown                   | Low F0                              | Unknown            | Unknown            |
+# Bug reports
+- **If you run into an error while using this source code, I sincerely apologize for the poor experience; you can send a bug report using the methods below**
+- **You can report bugs to me through the webhook bug-report system in the interface**
+- **If the bug-report system does not work, you can report bugs to me on Discord `pham_huynh_anh` or via [ISSUE](https://github.com/PhamHuynhAnh16/Vietnamese-RVC/issues)**
+# ☎️ Contact me
+- Discord: **pham_huynh_anh**
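
The README's feature list names hybrid pitch-extraction methods such as `hybrid[rmvpe+harvest]` but does not say how the individual F0 tracks are merged. The sketch below shows one way such a name could be parsed and the tracks combined. `parse_f0_method` and `combine_tracks` are hypothetical helpers, and the per-frame median is an assumption, not necessarily what this repository does.

```python
import re
from statistics import median

# Matches names of the form hybrid[method_a+method_b+...].
HYBRID_PATTERN = re.compile(r"^hybrid\[(.+)\]$")


def parse_f0_method(name: str) -> list[str]:
    """Split 'hybrid[a+b]' into ['a', 'b']; a plain name yields a one-item list."""
    name = name.strip()
    match = HYBRID_PATTERN.match(name)
    if match is None:
        return [name]
    methods = [part.strip() for part in match.group(1).split("+")]
    if any(not part for part in methods):
        raise ValueError(f"empty method in hybrid spec: {name!r}")
    return methods


def combine_tracks(tracks: list[list[float]]) -> list[float]:
    """Merge equally long F0 tracks frame by frame.

    Assumption: the per-frame median is used as the merged value.
    """
    if not tracks:
        raise ValueError("at least one track is required")
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("tracks must have equal length")
    return [median(frame) for frame in zip(*tracks)]
```

With three tracks the median discards a single outlier per frame, which is why it is a common choice when pitch estimators disagree; with two tracks it reduces to their mean.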

assets/binary/decrypt.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:330268cbf6b9317a76510b533e1640ef48ed074a07c013e5b1abc4d48cfd9dce
+size 32
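
The three added lines above are a Git LFS pointer rather than the binary itself: a spec version URL, a SHA-256 object id, and the size of the real file in bytes. A minimal sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper name, not part of this repository, and it expects the raw pointer text without the diff's leading `+`.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, a single space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:330268cbf6b9317a76510b533e1640ef48ed074a07c013e5b1abc4d48cfd9dce
size 32
"""
info = parse_lfs_pointer(pointer)
```

Here `info["size"]` is the byte size of the real `decrypt.bin`, which can be compared against the downloaded file to confirm that LFS content was actually fetched.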

assets/binary/world.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:49520c26e725e1d71a4ee4361fd1e41a12ec67b59912821f5123dce6eb572c16
+size 3481870

assets/f0/.gitattributes ADDED

File without changes

assets/ico.png ADDED

Git LFS Details

SHA256: 3580dfee1d9b4c8ed32870bb798a36d50e6586a0872a1da9bdbe4c3ca425b7f6
Pointer size: 132 Bytes
Size of remote file: 3.95 MB

assets/languages/en-US.json ADDED

	@@ -0,0 +1,663 @@

+{
+    "set_lang": "Display language set to {lang}.",
+    "no_support_gpu": "Unfortunately, no compatible GPU is available to support your training.",
+    "text": "text",
+    "upload_success": "File {name} uploaded successfully.",
+    "download_url": "Download from the link",
+    "download_from_csv": "Download from the CSV model repository",
+    "search_models": "Search models",
+    "upload": "Upload",
+    "option_not_valid": "Invalid option!",
+    "list_model": "Model list",
+    "success": "Completed!",
+    "index": "index",
+    "model": "model",
+    "zip": "compress",
+    "search": "search",
+    "provide_file": "Please provide a valid {filename} file!",
+    "start": "Starting {start}...",
+    "not_found": "Not found {name}.",
+    "found": "Found {results} results!",
+    "download_music": "download music",
+    "download": "download",
+    "provide_url": "Please provide a url.",
+    "provide_name_is_save": "Please provide a model name to save.",
+    "not_support_url": "Your model url is not supported.",
+    "error_occurred": "An error occurred: {e}.",
+    "unable_analyze_model": "Unable to analyze the model!",
+    "download_pretrain": "Downloading pre-trained model...",
+    "provide_pretrain": "Please provide a pre-trained model url {dg}.",
+    "sr_not_same": "The sample rates of the two models are not the same.",
+    "architectures_not_same": "Cannot merge models. The architectures are not the same.",
+    "fushion_model": "model fusion",
+    "model_fushion_info": "The model {name} is fused from {pth_1} and {pth_2} with a ratio of {ratio}.",
+    "not_found_create_time": "Creation time not found.",
+    "format_not_valid": "Invalid format.",
+    "read_info": "Models trained on different applications may produce different information or may not be readable!",
+    "epoch": "epoch.",
+    "step": "step",
+    "sr": "Sample rate",
+    "f0": "pitch training",
+    "version": "version.",
+    "not_f0": "Pitch training not performed",
+    "trained_f0": "Pitch training performed",
+    "model_info": "Model Name: {model_name}\n\n Model Creator: {model_author}\n\nEpoch: {epochs}\n\nSteps: {steps}\n\nVersion: {version}\n\nSample Rate: {sr}\n\nPitch Training: {pitch_guidance}\n\nHash (ID): {model_hash}\n\nCreation Time: {creation_date_str}\n\nVocoder: {vocoder}\n\nEnergy: {rms_extract}\n",
+    "input_not_valid": "Please provide valid input!",
+    "output_not_valid": "Please provide valid output!",
+    "apply_effect": "apply effect",
+    "enter_the_text": "Please enter the text to speech!",
+    "choose_voice": "Please choose a voice!",
+    "convert": "Converting {name}...",
+    "separator_music": "music separation",
+    "notfound": "Not found",
+    "turn_on_use_audio": "Please enable using separated audio to proceed",
+    "turn_off_convert_backup": "Disable backup voice conversion to use the original voice",
+    "turn_off_merge_backup": "Disable merging backup voice to use the original voice",
+    "not_found_original_vocal": "Original vocal not found!",
+    "convert_vocal": "Converting voice...",
+    "convert_success": "Voice conversion completed!",
+    "convert_backup": "Converting backup voice...",
+    "convert_backup_success": "Backup voice conversion completed!",
+    "merge_backup": "Merging main voice with backup voice...",
+    "merge_success": "Merge completed.",
+    "is_folder": "Input is a folder: Converting all audio files in the folder...",
+    "not_found_in_folder": "No audio files found in the folder!",
+    "batch_convert": "Batch conversion in progress...",
+    "batch_convert_success": "Batch conversion successful!",
+    "create": "create",
+    "provide_name": "Please provide a model name.",
+    "not_found_data": "Data not found",
+    "not_found_data_preprocess": "Processed audio data not found, please reprocess.",
+    "not_found_data_extract": "Extracted audio data not found, please re-extract.",
+    "provide_pretrained": "Please provide pre-trained {dg}.",
+    "download_pretrained": "Download pre-trained {dg}{rvc_version} original",
+    "not_found_pretrain": "Pre-trained {dg} not found",
+    "not_use_pretrain": "No pre-trained model will be used",
+    "training": "training",
+    "rick_roll": "Click here if you want to be Rick Roll :) ---> [RickRoll]({rickroll})",
+    "terms_of_use": "**Please do not use the project for any unethical, illegal, or harmful purposes to individuals or organizations...**",
+    "exemption": "**In cases where users do not comply with the terms or violate them, I will not be responsible for any claims, damages, or liabilities, whether in contract, negligence, or other causes arising from, outside of, or related to the software, its use, or other transactions associated with it.**",
+    "separator_tab": "Music Separation",
+    "4_part": "A simple music separation system can separate into 4 parts: Instruments, Vocals, Main vocals, Backup vocals",
+    "clear_audio": "Clean audio",
+    "separator_backing": "Separate backup vocals",
+    "denoise_mdx": "Denoise MDX separation",
+    "use_mdx": "Use MDX",
+    "dereveb_audio": "Remove vocal reverb",
+    "dereveb_backing": "Remove backup reverb",
+    "separator_model": "Music separation model",
+    "separator_backing_model": "Backup separation model",
+    "shift": "Shift",
+    "shift_info": "Higher is better quality but slower and uses more resources",
+    "segments_size": "Segments Size",
+    "segments_size_info": "Higher is better quality but uses more resources",
+    "batch_size": "Batch size",
+    "batch_size_info": "Number of samples processed simultaneously in one training cycle. Higher can cause memory overflow",
+    "mdx_batch_size_info": "Number of samples processed at a time. Batch processing optimizes calculations. Large batches can cause memory overflow; small batches reduce resource efficiency",
+    "overlap": "Overlap",
+    "overlap_info": "Overlap amount between prediction windows",
+    "export_format": "Export format",
+    "export_info": "The export format to export the audio file in",
+    "output_separator": "Separated output",
+    "hop_length_info": "Analyzing the time transfer window when performing transformations is allowed. The detailed value is compact but requires more calculation",
+    "drop_audio": "Drop audio here",
+    "drop_text": "Drop text file here",
+    "use_url": "YouTube link",
+    "url_audio": "Audio link",
+    "downloads": "Downloads",
+    "clean_strength": "Audio cleaning strength",
+    "clean_strength_info": "Strength of the audio cleaner for filtering vocals during export",
+    "input_output": "Audio input, output",
+    "audio_path": "Input audio path",
+    "refresh": "Refresh",
+    "output_folder": "Output audio folder path",
+    "output_folder_info": "Enter the folder path where the audio will be exported",
+    "input_audio": "Audio input",
+    "instruments": "Instruments",
+    "original_vocal": "Original vocal",
+    "main_vocal": "Main vocal",
+    "backing_vocal": "Backup vocal",
+    "convert_audio": "Convert Audio",
+    "convert_info": "Convert audio using a trained voice model",
+    "autotune": "Auto-tune",
+    "use_audio": "Use separated audio",
+    "convert_original": "Convert original voice",
+    "convert_backing": "Convert backup voice",
+    "not_merge_backing": "Do not merge backup voice",
+    "merge_instruments": "Merge instruments",
+    "pitch": "Pitch",
+    "pitch_info": "Recommendation: set to 12 to change a male voice to female, or -12 to change a female voice to male",
+    "model_accordion": "Model and index",
+    "model_name": "Model file",
+    "index_path": "Index file",
+    "index_strength": "Index strength",
+    "index_strength_info": "Higher values increase the influence of the index. However, lower values may reduce artifacts in the audio",
+    "output_path": "Audio output path",
+    "output_path_info": "Enter the output path (leave it as .wav format; it will auto-correct during conversion)",
+    "setting": "General settings",
+    "f0_method": "Extraction method",
+    "f0_method_info": "Method used for pitch (F0) extraction",
+    "f0_method_hybrid": "HYBRID extraction method",
+    "f0_method_hybrid_info": "Combination of two or more different extraction methods",
+    "hubert_model": "Embedding model",
+    "hubert_info": "Pre-trained model to assist embedding",
+    "modelname": "Model name",
+    "modelname_info": "If you have your own model, just upload it and input the name here",
+    "split_audio": "Split audio",
+    "autotune_rate": "Auto-tune rate",
+    "autotune_rate_info": "Level of auto-tuning adjustment",
+    "resample": "Resample",
+    "resample_info": "Resample the output to the final sample rate during post-processing; 0 means no resampling. NOTE: SOME FORMATS DO NOT SUPPORT RATES ABOVE 48000",
+    "filter_radius": "Filter radius",
+    "filter_radius_info": "If greater than three, median filtering is applied. The value represents the filter radius and can reduce breathiness or noise.",
+    "rms_mix_rate": "RMS Mix Rate",
+    "rms_mix_rate_info": "Determines the blend ratio between the RMS energy of the original voice and the converted voice",
+    "protect": "Consonant protection",
+    "protect_info": "Protect distinct consonants and breathing sounds to prevent audio tearing and other artifacts. Increasing this value provides comprehensive protection. Reducing it may reduce protection but also minimize indexing effects",
+    "output_convert": "Converted audio",
+    "main_convert": "Convert main voice",
+    "main_or_backing": "Main voice + Backup voice",
+    "voice_or_instruments": "Voice + Instruments",
+    "convert_text": "Convert Text",
+    "convert_text_markdown": "## Convert Text to Speech",
+    "convert_text_markdown_2": "Convert text to speech and read aloud using the trained voice model",
+    "input_txt": "Input data from a text file (.txt)",
+    "text_to_speech": "Text to read",
+    "voice_speed": "Reading speed",
+    "voice_speed_info": "Speed of the voice",
+    "tts_1": "1. Convert Text to Speech",
+    "tts_2": "2. Convert Speech",
+    "voice": "Voices by country",
+    "output_tts": "Output speech path",
+    "output_tts_convert": "Converted speech output path",
+    "tts_output": "Enter the output path",
+    "output_tts_markdown": "Unconverted and converted audio",
+    "output_text_to_speech": "Generated speech from text-to-speech conversion",
+    "output_file_tts_convert": "Speech converted using the model",
+    "output_audio": "Audio output",
+    "provide_output": "Enter the output path",
+    "audio_effects": "Audio Effects",
+    "apply_audio_effects": "## Add Additional Audio Effects",
+    "audio_effects_edit": "Add effects to audio",
+    "reverb": "Reverb effect",
+    "chorus": "Chorus effect",
+    "delay": "Delay effect",
+    "more_option": "Additional options",
+    "phaser": "Phaser effect",
+    "compressor": "Compressor effect",
+    "apply": "Apply",
+    "reverb_freeze": "Freeze mode",
+    "reverb_freeze_info": "Create a continuous echo effect when this mode is enabled",
+    "room_size": "Room size",
+    "room_size_info": "Adjust the room space to create reverberation",
+    "damping": "Damping",
+    "damping_info": "Adjust the level of absorption to control the amount of reverberation",
+    "wet_level": "Reverb signal level",
+    "wet_level_info": "Adjust the level of the reverb signal effect",
+    "dry_level": "Original signal level",
+    "dry_level_info": "Adjust the level of the signal without effects",
+    "width": "Audio width",
+    "width_info": "Adjust the width of the audio space",
+    "chorus_depth": "Chorus depth",
+    "chorus_depth_info": "Adjust the intensity of the chorus to create a wider sound",
+    "chorus_rate_hz": "Frequency",
+    "chorus_rate_hz_info": "Adjust the oscillation speed of the chorus effect",
+    "chorus_mix": "Mix signals",
+    "chorus_mix_info": "Adjust the mix level between the original and the processed signal",
+    "chorus_centre_delay_ms": "Center delay (ms)",
+    "chorus_centre_delay_ms_info": "The delay time between stereo channels to create the chorus effect",
+    "chorus_feedback": "Feedback",
+    "chorus_feedback_info": "Adjust the amount of the effect signal fed back into the original signal",
+    "delay_seconds": "Delay time",
+    "delay_seconds_info": "Adjust the delay time between the original and the processed signal",
+    "delay_feedback": "Delay feedback",
+    "delay_feedback_info": "Adjust the amount of feedback signal, creating a repeating effect",
+    "delay_mix": "Delay signal mix",
+    "delay_mix_info": "Adjust the mix level between the original and delayed signal",
+    "fade": "Fade effect",
+    "bass_or_treble": "Bass and treble",
+    "limiter": "Threshold limiter",
+    "distortion": "Distortion effect",
+    "gain": "Audio gain",
+    "bitcrush": "Bit reduction effect",
+    "clipping": "Clipping effect",
+    "fade_in": "Fade-in effect (ms)",
+    "fade_in_info": "Time for the audio to gradually increase from 0 to normal level",
+    "fade_out": "Fade-out effect (ms)",
+    "fade_out_info": "Time for the audio to gradually decrease from normal level to 0",
+    "bass_boost": "Bass boost level (dB)",
+    "bass_boost_info": "Amount of bass boost applied to the audio track",
+    "bass_frequency": "Low-pass filter cutoff frequency (Hz)",
+    "bass_frequency_info": "Frequencies above this cutoff are reduced. A lower cutoff makes the bass clearer",
+    "treble_boost": "Treble boost level (dB)",
+    "treble_boost_info": "Amount of treble boost applied to the audio track",
+    "treble_frequency": "High-pass filter cutoff frequency (Hz)",
+    "treble_frequency_info": "Frequencies below this cutoff are filtered out. The higher the cutoff, the more the retained sound is limited to high frequencies.",
+    "limiter_threshold_db": "Limiter threshold",
+    "limiter_threshold_db_info": "Limit the maximum audio level to prevent it from exceeding the threshold",
+    "limiter_release_ms": "Release time",
+    "limiter_release_ms_info": "Time for the audio to return to normal after being limited (milliseconds)",
+    "distortion_info": "Adjust the level of distortion to create a noisy effect",
+    "gain_info": "Adjust the volume level of the signal",
+    "clipping_threshold_db": "Clipping threshold",
+    "clipping_threshold_db_info": "Trim signals exceeding the threshold, creating a distorted sound",
+    "bitcrush_bit_depth": "Bit depth",
+    "bitcrush_bit_depth_info": "Reduce audio quality by decreasing bit depth, creating a distorted effect",
+    "phaser_depth": "Phaser depth",
+    "phaser_depth_info": "Adjust the depth of the effect, impacting its intensity",
+    "phaser_rate_hz": "Frequency",
+    "phaser_rate_hz_info": "Adjust the frequency of the phaser effect",
+    "phaser_mix": "Mix signal",
+    "phaser_mix_info": "Adjust the mix level between the original and processed signals",
+    "phaser_centre_frequency_hz": "Center frequency",
+    "phaser_centre_frequency_hz_info": "The center frequency of the phaser effect, affecting the adjusted frequencies",
+    "phaser_feedback": "Feedback",
+    "phaser_feedback_info": "Adjust the feedback level of the effect, creating a stronger or lighter phaser feel",
+    "compressor_threshold_db": "Compressor threshold",
+    "compressor_threshold_db_info": "The threshold level above which the audio will be compressed",
+    "compressor_ratio": "Compression ratio",
+    "compressor_ratio_info": "Adjust the level of audio compression when exceeding the threshold",
+    "compressor_attack_ms": "Attack time (ms)",
+    "compressor_attack_ms_info": "Time for compression to start taking effect after the audio exceeds the threshold",
+    "compressor_release_ms": "Release time",
+    "compressor_release_ms_info": "Time for the audio to return to normal after being compressed",
+    "create_dataset_url": "Link to audio (use commas for multiple links)",
+    "createdataset": "Create dataset",
+    "create_dataset_markdown": "## Create Training Dataset from YouTube",
+    "create_dataset_markdown_2": "Process and create training datasets using YouTube links",
+    "denoise": "Denoise",
+    "skip": "Skip",
+    "model_ver": "Voice separation version",
+    "model_ver_info": "The model version for separating vocals",
+    "create_dataset_info": "Dataset creation information",
+    "output_data": "Dataset output",
+    "output_data_info": "Output data after creation",
+    "skip_start": "Skip beginning",
+    "skip_start_info": "Skip the initial seconds of the audio; use commas for multiple audios",
+    "skip_end": "Skip end",
+    "skip_end_info": "Skip the final seconds of the audio; use commas for multiple audios",
+    "training_model": "Train Model",
+    "training_markdown": "Train and build a voice model with a set of voice data",
+    "training_model_name": "Name of the model during training (avoid special characters or spaces)",
+    "sample_rate": "Sample rate",
+    "sample_rate_info": "Sample rate of the model",
+    "training_version": "Model version",
+    "training_version_info": "Version of the model during training",
+    "training_pitch": "Pitch Guidance",
+    "upload_dataset": "Upload dataset",
+    "preprocess_effect": "Post processing",
+    "clear_dataset": "Clean dataset",
+    "preprocess_info": "Preprocessing information",
+    "preprocess_button": "1. Processing",
+    "extract_button": "2. Extract",
+    "extract_info": "Data extraction information",
+    "total_epoch": "Total epochs",
+    "total_epoch_info": "Total training epochs",
+    "save_epoch": "Save frequency",
+    "save_epoch_info": "Frequency of saving the model during training so that training can be resumed",
+    "create_index": "Create index",
+    "index_algorithm": "Index algorithm",
+    "index_algorithm_info": "Algorithm for creating the index",
+    "custom_dataset": "Custom dataset folder",
+    "custom_dataset_info": "Custom dataset folder for training data",
+    "overtraining_detector": "Overtraining detector",
+    "overtraining_detector_info": "Check for overtraining during model training",
+    "cleanup_training": "Clean Up",
+    "cleanup_training_info": "Clean up and retrain from scratch",
+    "cache_in_gpu": "Cache in GPU",
+    "cache_in_gpu_info": "Store the model in GPU cache memory",
+    "dataset_folder": "Folder containing dataset",
+    "threshold": "Overtraining threshold",
+    "setting_cpu_gpu": "CPU/GPU settings",
+    "gpu_number": "Number of GPUs used",
+    "gpu_number_info": "Index numbers of the GPUs used in training. (Note: AMD GPUs do not support multi-GPU training)",
+    "save_only_latest": "Save only the latest",
+    "save_only_latest_info": "Save only the latest D and G models",
+    "save_every_weights": "Save all models",
+    "save_every_weights_info": "Save all models after each epoch",
+    "gpu_info": "GPU information",
+    "gpu_info_2": "Information about the GPU used during training",
+    "cpu_core": "Number of CPU cores available",
+    "cpu_core_info": "Number of CPU cores used during training",
+    "not_use_pretrain_2": "Do not use pretraining",
+    "not_use_pretrain_info": "Do not use pre-trained models",
+    "custom_pretrain": "Custom pretraining",
+    "custom_pretrain_info": "Customize pre-training settings",
+    "pretrain_file": "Pre-trained model file {dg}",
+    "train_info": "Training information",
+    "export_model": "5. Export Model",
+    "zip_model": "2. Compress model",
+    "output_zip": "Output file after compression",
+    "model_path": "Model path",
+    "model_ratio": "Model ratio",
+    "model_ratio_info": "Adjusting towards one side will make the model more like that side",
+    "output_model_path": "Model output path",
+    "fushion": "Model Fusion",
+    "fushion_markdown": "## Fuse Two Models",
+    "fushion_markdown_2": "Combine two voice models into a single model",
+    "read_model": "Read Information",
+    "read_model_markdown": "## Read Model Information",
+    "read_model_markdown_2": "Retrieve recorded information within the model",
+    "drop_model": "Drop model here",
+    "readmodel": "Read model",
+    "model_path_info": "Enter the path to the model file",
+    "modelinfo": "Model Information",
+    "download_markdown": "## Download Model",
+    "download_markdown_2": "Download voice models, pre-trained models, and embedding models",
+    "model_download": "Download voice model",
+    "model_url": "Link to the model",
+    "30s": "Please wait about 30 seconds. The system will restart automatically!",
+    "model_download_select": "Choose a model download method",
+    "model_warehouse": "Model repository",
+    "get_model": "Retrieve model",
+    "name_to_search": "Name to search",
+    "search_2": "Search",
+    "select_download_model": "Choose a searched model (Click to select)",
+    "download_pretrained_2": "Download pre-trained model",
+    "pretrained_url": "Pre-trained model link {dg}",
+    "select_pretrain": "Choose pre-trained model",
+    "select_pretrain_info": "Choose a pre-trained model to download",
+    "pretrain_sr": "Model sample rate",
+    "drop_pretrain": "Drop pre-trained model {dg} here",
+    "settings": "Settings",
+    "settings_markdown": "## Additional Settings",
+    "settings_markdown_2": "Customize additional features of the project",
+    "lang": "Language",
+    "lang_restart": "The display language in the project (When changing the language, the system will automatically restart after 30 seconds to update)",
+    "change_lang": "Change Language",
+    "theme": "Theme",
+    "theme_restart": "Theme type displayed in the interface (When changing the theme, the system will automatically restart after 30 seconds to update)",
+    "theme_button": "Change Theme",
+    "change_light_dark": "Switch Light/Dark Mode",
+    "tensorboard_url": "Tensorboard URL",
+    "errors_loading_audio": "Error loading audio",
+    "apply_error": "An error occurred while applying effects: {e}",
+    "indexpath": "Index path",
+    "split_total": "Total parts split",
+    "process_audio_error": "An error occurred while processing the audio",
+    "merge_error": "An error occurred while merging audio",
+    "not_found_convert_file": "Processed file not found",
+    "convert_batch": "Batch conversion...",
+    "found_audio": "Found {audio_files} audio files for conversion.",
+    "not_found_audio": "No audio files found!",
+    "error_convert": "An error occurred during audio conversion: {e}",
+    "convert_batch_success": "Batch conversion completed successfully in {elapsed_time} seconds. Output {output_path}",
+    "convert_audio_success": "File {input_path} converted successfully in {elapsed_time} seconds. Output {output_path}",
+    "read_faiss_index_error": "An error occurred while reading the FAISS index: {e}",
+    "read_model_error": "Failed to load model: {e}",
+    "starting_download": "Starting download",
+    "version_not_valid": "Invalid vocal separation version",
+    "skip<audio": "Cannot skip as skip time is less than audio file length",
+    "skip>audio": "Cannot skip as skip time is greater than audio file length",
+    "=<0": "Skip time is less than or equal to 0 and has been skipped",
+    "skip_warning": "Skip duration ({seconds} seconds) exceeds audio length ({total_duration} seconds). Skipping.",
+    "download_success": "Download completed successfully",
+    "create_dataset_error": "An error occurred while creating the training dataset",
+    "create_dataset_success": "Training dataset creation completed in {elapsed_time} seconds",
+    "skip_start_audio": "Successfully skipped start of audio: {input_file}",
+    "skip_end_audio": "Successfully skipped end of audio: {input_file}",
+    "merge_audio": "Merged all parts containing audio",
+    "separator_process": "Separating vocals: {input}...",
+    "not_found_main_vocal": "Main vocal not found!",
+    "not_found_backing_vocal": "Backup vocal not found!",
+    "not_found_instruments": "Instruments not found",
+    "merge_instruments_process": "Merging vocals with instruments...",
+    "dereverb": "Removing vocal reverb",
+    "dereverb_success": "Successfully removed vocal reverb",
+    "save_index": "Index file saved",
+    "create_index_error": "An error occurred while creating the index",
+    "sr_not_16000": "Sample rate must be 16000",
+    "extract_file_error": "An error occurred while extracting the file",
+    "extract_f0_method": "Starting pitch extraction using {num_processes} cores with method {f0_method}...",
+    "extract_f0": "Pitch Extraction",
+    "extract_f0_success": "Pitch extraction completed in {elapsed_time} seconds.",
+    "NaN": "contains NaN values and will be ignored.",
+    "start_extract_hubert": "Starting Embedding extraction...",
+    "process_error": "An error occurred during processing",
+    "extract_hubert_success": "Embedding extraction completed in {elapsed_time} seconds.",
+    "export_process": "Model path",
+    "extract_error": "An error occurred during data extraction",
+    "extract_success": "Data extraction successful",
+    "start_preprocess": "Starting data preprocessing with {num_processes} cores...",
+    "not_integer": "Voice ID folder must be an integer; instead got",
+    "preprocess_success": "Preprocessing completed in {elapsed_time} seconds.",
+    "preprocess_model_success": "Preprocessing data for the model completed successfully",
+    "turn_on_dereverb": "Reverb removal for backup vocals requires enabling reverb removal",
+    "turn_on_separator_backing": "Backup vocal separation requires enabling vocal separation",
+    "backing_model_ver": "Backup vocal separation model version",
+    "clean_audio_success": "Audio cleaned successfully!",
+    "separator_error": "An error occurred during music separation",
+    "separator_success": "Music separation completed in {elapsed_time} seconds",
+    "separator_process_2": "Processing music separation",
+    "separator_success_2": "Music separation successful!",
+    "separator_process_backing": "Processing backup vocal separation",
+    "separator_process_backing_success": "Backup vocal separation successful!",
+    "process_original": "Processing original vocal reverb removal...",
+    "process_original_success": "Original vocal reverb removal successful!",
+    "process_main": "Processing main vocal reverb removal...",
+    "process_main_success": "Main vocal reverb removal successful!",
+    "process_backing": "Processing backup vocal reverb removal...",
+    "process_backing_success": "Backup vocal reverb removal successful!",
+    "save_every_epoch": "Save model after: ",
+    "total_e": "Total epochs: ",
+    "dorg": "Pre-trained G: {pretrainG} | Pre-trained D: {pretrainD}",
+    "training_f0": "Pitch Guidance",
+    "not_gpu": "No GPU detected, reverting to CPU (not recommended)",
+    "not_found_checkpoint": "Checkpoint file not found: {checkpoint_path}",
+    "save_checkpoint": "Reloaded checkpoint '{checkpoint_path}' (epoch {checkpoint_dict})",
+    "save_model": "Saved model '{checkpoint_path}' (epoch {iteration})",
+    "sr_does_not_match": "Sample rate {sample_rate} does not match target sample rate {sample_rate2}",
+    "time_or_speed_training": "time={current_time} | training speed={elapsed_time_str}",
+    "savemodel": "Saved model '{model_dir}' (epoch {epoch} and step {step})",
+    "model_author": "Credit model to {model_author}",
+    "unregistered": "Model unregistered",
+    "not_author": "Model not credited",
+    "training_author": "Model creator name",
+    "training_author_info": "To credit the model, enter your name here",
+    "extract_model_error": "An error occurred while extracting the model",
+    "start_training": "Starting training",
+    "import_pretrain": "Loaded pre-trained model ({dg}) '{pretrain}'",
+    "not_using_pretrain": "No pre-trained model ({dg}) will be used",
+    "overtraining_find": "Overtraining detected at epoch {epoch} with smoothed generator loss {smoothed_value_gen} and smoothed discriminator loss {smoothed_value_disc}",
+    "best_epoch": "New best epoch {epoch} with smoothed generator loss {smoothed_value_gen} and smoothed discriminator loss {smoothed_value_disc}",
+    "success_training": "Training completed with {epoch} epochs, {global_step} steps, and {loss_gen_all} total generator loss.",
+    "training_info": "Lowest generator loss: {lowest_value_rounded} at epoch {lowest_value_epoch}, step {lowest_value_step}",
+    "model_training_info": "{model_name} | epoch={epoch} | step={global_step} | {epoch_recorder} | lowest value={lowest_value_rounded} (epoch {lowest_value_epoch} and step {lowest_value_step}) | remaining epochs for overtraining: g/total: {remaining_epochs_gen} d/total: {remaining_epochs_disc} | smoothed generator loss={smoothed_value_gen} | smoothed discriminator loss={smoothed_value_disc}",
+    "model_training_info_2": "{model_name} | epoch={epoch} | step={global_step} | {epoch_recorder} | lowest value={lowest_value_rounded} (epoch {lowest_value_epoch} and step {lowest_value_step})",
+    "model_training_info_3": "{model_name} | epoch={epoch} | step={global_step} | {epoch_recorder}",
+    "training_error": "An error occurred while training the model:",
+    "separator_info": "Initializing with output path: {output_dir}, output format: {output_format}",
+    "none_ffmpeg": "FFmpeg is not installed. Please install FFmpeg to use this package.",
+    "running_in_cpu": "Unable to configure hardware acceleration, running in CPU mode",
+    "running_in_cuda": "CUDA available in Torch, setting Torch device to CUDA",
+    "running_in_amd": "AMD available in Torch, setting Torch device to AMD",
+    "onnx_have": "ONNXruntime available {have}, enabling acceleration",
+    "onnx_not_have": "{have} not available in ONNXruntime; acceleration will NOT be enabled",
+    "download_error": "Failed to download file from {url}, response code: {status_code}",
+    "vip_print": "Hey there, if you haven't subscribed, please consider supporting UVR's developer, Anjok07, by subscribing here: https://patreon.com/uvr",
+    "loading_model": "Loading model {model_filename}...",
+    "model_type_not_support": "Unsupported model type: {model_type}",
+    "starting_separator": "Starting separation process for audio file path",
+    "separator_success_3": "Separation process completed.",
+    "separator_duration": "Separation duration",
+    "dims": "Cannot use sin/cos position encoding with odd dimensions (dim={dims})",
+    "activation": "activation must be relu/gelu, not {activation}",
+    "length_or_training_length": "Provided length {length} exceeds training duration {training_length}",
+    "type_not_valid": "Invalid type for",
+    "del_parameter": "Removing non-existent parameter ",
+    "convert_shape": "Converted mix shape: {shape}",
+    "not_success": "Process was not successful: ",
+    "resample_error": "Error during resampling",
+    "shapes": "Shapes",
+    "wav_resolution": "Resolution type",
+    "warnings": "Warning: Extremely aggressive values detected",
+    "warnings_2": "Warning: NaN or infinite values detected in wave input. Shape",
+    "process_file": "Processing file... \n",
+    "save_instruments": "Saving reverse track...",
+    "assert": "Audio files must have the same shape - Mix: {mixshape}, Inst: {instrumentalshape}",
+    "rubberband": "Rubberband CLI cannot be executed. Please ensure Rubberband-CLI is installed.",
+    "rate": "Rate must be strictly positive",
+    "gdown_error": "Could not retrieve the public link for the file. You may need to change its permissions to 'Anyone with the link', or the file may have been accessed too many times.",
+    "gdown_value_error": "A path or ID must be specified",
+    "missing_url": "URL is missing",
+    "mac_not_match": "MAC does not match",
+    "file_not_access": "File is not accessible",
+    "int_resp==-3": "Request failed, retrying",
+    "search_separate": "Searching for separated files...",
+    "found_choice": "Found {choice}",
+    "separator==0": "No separated files found!",
+    "select_separate": "Select separated files",
+    "start_app": "Starting interface...",
+    "provide_audio": "Enter the path to the audio file",
+    "set_torch_mps": "Set Torch device to MPS",
+    "googletts": "Convert text using Google",
+    "pitch_info_2": "Pitch adjustment for text-to-speech converter",
+    "waveform": "Waveform must have the shape (# frames, # channels)",
+    "freq_mask_smooth_hz": "freq_mask_smooth_hz must be at least {hz}Hz",
+    "time_mask_smooth_ms": "time_mask_smooth_ms must be at least {ms}ms",
+    "x": "x must be greater",
+    "xn": "xn must be greater",
+    "not_found_pid": "No processes found!",
+    "end_pid": "Process terminated!",
+    "not_found_separate_model": "No separation model files found!",
+    "not_found_pretrained": "No pretrained model files found!",
+    "not_found_log": "No log files found!",
+    "not_found_predictors": "No predictor model files found!",
+    "not_found_embedders": "No embedder model files found!",
+    "provide_folder": "Please provide a valid folder!",
+    "empty_folder": "The data folder is empty!",
+    "vocoder": "Vocoder",
+    "vocoder_info": "A vocoder analyzes and synthesizes human speech signals for voice transformation.\n\nDefault: This option is HiFi-GAN-NSF, compatible with all RVCs\n\nMRF-HiFi-GAN: Higher fidelity.\n\nRefineGAN: Superior sound quality.",
+    "code_error": "Error: Received status code",
+    "json_error": "Error: Unable to parse response.",
+    "requests_error": "Request failed: {e}",
+    "memory_efficient_training": "Using memory-efficient training",
+    "not_use_pretrain_error_download": "Will not use pretrained models due to missing files",
+    "provide_file_settings": "Please provide a preset settings file!",
+    "load_presets": "Loaded preset file {presets}",
+    "provide_filename_settings": "Please provide a preset file name!",
+    "choose1": "Please select one to export!",
+    "export_settings": "Exported preset file {name}",
+    "use_presets": "Using preset file",
+    "file_preset": "Preset file",
+    "load_file": "Load file",
+    "export_file": "Export preset file",
+    "save_clean": "Save cleanup",
+    "save_autotune": "Save autotune",
+    "save_pitch": "Save pitch",
+    "save_index_2": "Save index impact",
+    "save_resample": "Save resampling",
+    "save_filter": "Save median filter",
+    "save_envelope": "Save sound envelope",
+    "save_protect": "Save sound protection",
+    "save_split": "Save sound split",
+    "filename_to_save": "File name to save",
+    "upload_presets": "Upload preset file",
+    "stop": "Stop process",
+    "stop_separate": "Stop Music Separation",
+    "stop_convert": "Stop Conversion",
+    "stop_create_dataset": "Stop Dataset Creation",
+    "stop_training": "Stop Training",
+    "stop_preprocess": "Stop Data Processing",
+    "stop_extract": "Stop Data Extraction",
+    "not_found_presets": "No preset files found in the folder!",
+    "port": "Port {port} is unavailable! Lowering port by one...",
+    "empty_json": "{file}: Corrupted or empty",
+    "thank": "Thank you for reporting the issue, and apologies for any inconvenience caused!",
+    "error_read_log": "An error occurred while reading log files!",
+    "error_send": "An error occurred while sending the report! Please contact me on Discord: pham_huynh_anh!",
+    "report_bugs": "Report Bugs",
+    "agree_log": "Agree to provide all log files",
+    "error_info": "Error description",
+    "error_info_2": "Provide more information about the error",
+    "report_bug_info": "Report bugs encountered during program usage",
+    "sr_info": "NOTE: SOME FORMATS DO NOT SUPPORT RATES ABOVE 48000",
+    "report_info": "If possible, agree to provide log files to help with debugging.\n\nIf log files are not provided, please describe the error in detail, including when and where it occurred.\n\nIf this reporting system also fails, you can reach out via [ISSUE]({github}) or Discord: `pham_huynh_anh`",
+    "default_setting": "An error occurred during separation, resetting all settings to default...",
+    "dataset_folder1": "Please enter the data folder name",
+    "checkpointing_err": "Pretrained model parameters such as sample rate or architecture do not match the selected model.",
+    "start_onnx_export": "Starting model conversion to ONNX...",
+    "convert_model": "Convert Model",
+    "pytorch2onnx": "Converting PyTorch Model to ONNX Model",
+    "pytorch2onnx_markdown": "Convert an RVC model from PyTorch to ONNX to optimize audio conversion",
+    "error_readfile": "An error occurred while reading the file!",
+    "f0_onnx_mode": "F0 ONNX Mode",
+    "f0_onnx_mode_info": "Extracting pitch using the ONNX model can help improve speed",
+    "formantshift": "Pitch and Formant Shift",
+    "formant_qfrency": "Frequency for Formant Shift",
+    "formant_timbre": "Timbre for Formant Transformation",
+    "time_frames": "Time (Frames)",
+    "Frequency": "Frequency (Hz)",
+    "f0_extractor_tab": "F0 Extraction",
+    "f0_extractor_markdown": "## Pitch Extraction",
+    "f0_extractor_markdown_2": "F0 pitch extraction is intended for use in audio conversion inference",
+    "start_extract": "Starting extraction process...",
+    "extract_done": "Extraction process completed!",
+    "f0_file": "Use pre-extracted F0 file",
+    "upload_f0": "Upload F0 file",
+    "f0_file_2": "F0 File",
+    "clean_f0_file": "Clean up F0 file",
+    "embed_mode": "Embedders Mode",
+    "embed_mode_info": "Extracting embeddings using different models",
+    "close": "The application is shutting down...",
+    "start_whisper": "Starting voice recognition with Whisper...",
+    "whisper_done": "Voice recognition complete!",
+    "process_audio": "Preprocessing audio...",
+    "process_done_start_convert": "Audio processing complete! Proceeding with audio conversion...",
+    "convert_with_whisper": "Convert Audio With Whisper",
+    "convert_with_whisper_info": "Convert audio using a trained speech model, with a Whisper model for speech recognition.\n\nWhisper recognizes the different voices, cuts out each voice's segments, and uses the RVC model to convert those segments.\n\nThe Whisper model may not work properly, which can cause strange output",
+    "num_spk": "Number of voices",
+    "num_spk_info": "Number of voices in the audio",
+    "model_size": "Whisper model size",
+    "model_size_info": "Whisper model size\n\nLarge models can produce strange outputs",
+    "title": "Simple high-quality and high-performance voice and instrument conversion and training tool for Vietnamese people",
+    "fp16_not_support": "CPU, MPS and OCL do not support fp16 well, converting fp16 -> fp32",
+    "precision": "Precision",
+    "precision_info": "Precision of inference and model training\n\nNote: CPU does not support fp16",
+    "update_precision": "Update Precision",
+    "start_update_precision": "Start updating precision",
+    "deterministic": "Deterministic algorithm",
+    "deterministic_info": "When enabled, highly deterministic algorithms are used, ensuring that each run of the same input data will yield the same results.\n\nWhen disabled, more optimal algorithms may be selected but may not be fully deterministic, resulting in different training results between runs.",
+    "benchmark": "Benchmark algorithm",
+    "benchmark_info": "When enabled, it will test and select the most optimized algorithm for the specific hardware and size. This can help speed up training.\n\nWhen disabled, it will not perform this algorithm optimization, which can reduce speed but ensures that each run uses the same algorithm, which is useful if you want to reproduce exactly.",
+    "font": "Font",
+    "font_info": "Interface font\n\nVisit [Google Fonts](https://fonts.google.com) to choose your favorite font.",
+    "change_font": "Change Font",
+    "f0_unlock": "Unlock all",
+    "f0_unlock_info": "Unlock all pitch extraction methods",
+    "srt": "SRT file is empty or corrupt!",
+    "optimizer": "Optimizer",
+    "optimizer_info": "Optimizer used in training. AdamW is the default; RAdam is an alternative",
+    "main_volume": "Main audio file volume",
+    "main_volume_info": "Main audio file volume. Should be between -4 and 0.",
+    "combination_volume": "Combination audio file volume",
+    "combination_volume_info": "Combination audio file volume. Keep the volume of the combination file lower than that of the main audio.",
+    "inference": "Inference",
+    "extra": "Extra",
+    "running_local_url": "Running Interface On Local URL",
+    "running_share_url": "Running Interface On Public URL",
+    "translate": "Translate",
+    "source_lang": "Input language",
+    "target_lang": "Output language",
+    "prompt_warning": "Please enter text to start translating!",
+    "read_error": "An error occurred while reading the text file!",
+    "quirk": "Quirk Effects",
+    "quirk_info": "## Weird Effects for Audio",
+    "quirk_label": "Quirk effects",
+    "quirk_label_info": "Quirk effects that can be used to apply to audio",
+    "quirk_markdown": "Apply quirky effects to your audio to make it sound weird.",
+    "gradio_start": "Interface loaded successfully after",
+    "quirk_choice": {"Random": 0, "Voice Crack": 1, "Horror": 2, "Robot": 3, "Baby": 4, "Depression": 5, "Voice Jerking": 6, "Oldster": 7, "Echo": 8, "Devil": 9, "Distorted Voice": 10, "Online Sales": 11, "Drag": 12, "Uncomfortable": 13, "Noise": 14, "Connectivity Issue": 15, "Disorder": 16},
+    "proposal_pitch": "Automatically propose pitch",
+    "hybrid_calc": "Hybrid calculation for method: {f0_method}...",
+    "proposal_f0": "Proposed pitch: {up_key}",
+    "startautotune": "Start autotune pitch...",
+    "proposal_pitch_threshold": "Proposal Pitch Threshold",
+    "proposal_pitch_threshold_info": "Proposal Pitch Threshold, for male models use 155.0 and female models use 255.0",
+    "rms_start_extract": "Starting audio energy extraction with {num_processes} cores...",
+    "rms_success_extract": "Energy extraction completed in {elapsed_time} seconds.",
+    "train&energy": "Training with energy",
+    "train&energy_info": "Training model with RMS energy",
+    "editing": "Editing",
+    "check_assets_error": "Downloading assets failed {count} times in a row! Please download manually and place in the assets folder: https://huggingface.co/AnhP/Vietnamese-RVC-Project"
+}

assets/languages/vi-VN.json ADDED

	@@ -0,0 +1,663 @@

+{
+    "set_lang": "Đã đặt ngôn ngữ hiển thị là {lang}",
+    "no_support_gpu": "Thật không may, không có GPU tương thích để hỗ trợ việc đào tạo của bạn.",
+    "text": "văn bản",
+    "upload_success": "Đã tải lên tệp {name} hoàn tất.",
+    "download_url": "Tải từ đường dẫn liên kết",
+    "download_from_csv": "Tải từ kho mô hình csv",
+    "search_models": "Tìm kiếm mô hình",
+    "upload": "Tải lên",
+    "option_not_valid": "Tùy chọn không hợp lệ!",
+    "list_model": "Danh sách mô hình",
+    "success": "Hoàn tất!",
+    "index": "chỉ mục",
+    "model": "mô hình",
+    "zip": "nén",
+    "search": "tìm kiếm",
+    "provide_file": "Vui lòng cung cấp tệp {filename} hợp lệ!",
+    "start": "Bắt đầu {start}...",
+    "not_found": "Không tìm thấy {name}",
+    "found": "Đã tìm thấy {results} kết quả!",
+    "download_music": "tải nhạc",
+    "download": "tải xuống",
+    "provide_url": "Vui lòng cung cấp đường dẫn liên kết.",
+    "provide_name_is_save": "Vui lòng cung cấp tên mô hình để lưu.",
+    "not_support_url": "Liên kết mô hình của bạn không được hỗ trợ.",
+    "error_occurred": "Đã xảy ra lỗi: {e}",
+    "unable_analyze_model": "Không phân tích được mô hình!",
+    "download_pretrain": "Tải xuống huấn luyện trước...",
+    "provide_pretrain": "Vui lòng cung cấp đường dẫn mô hình huấn luyện trước {dg}.",
+    "sr_not_same": "Tốc độ lấy mẫu của hai mô hình không giống nhau",
+    "architectures_not_same": "Không thể hợp nhất các mô hình. Các kiến trúc mô hình không giống nhau.",
+    "fushion_model": "dung hợp mô hình",
+    "model_fushion_info": "Mô hình {name} được dung hợp từ {pth_1} và {pth_2} với ratio {ratio}.",
+    "not_found_create_time": "Không tìm thấy thời gian tạo.",
+    "format_not_valid": "Định dạng không hợp lệ.",
+    "read_info": "Các mô hình được huấn luyện trên các ứng dụng khác nhau có thể đem lại các thông tin khác nhau hoặc không thể đọc!",
+    "epoch": "kỷ nguyên.",
+    "step": "bước",
+    "sr": "Tốc độ lấy mẫu",
+    "f0": "huấn luyện cao độ",
+    "version": "phiên bản.",
+    "not_f0": "Không được huấn luyện cao độ",
+    "trained_f0": "Được huấn luyện cao độ",
+    "model_info": "Tên mô hình: {model_name}\n\n Người tạo mô hình: {model_author}\n\nKỷ nguyên: {epochs}\n\nSố bước: {steps}\n\nPhiên bản của mô hình: {version}\n\nTốc độ lấy mẫu: {sr}\n\nHuấn luyện cao độ: {pitch_guidance}\n\nHash (ID): {model_hash}\n\nThời gian tạo: {creation_date_str}\n\nBộ mã hóa: {vocoder}\n\nNăng lượng: {rms_extract}\n",
+    "input_not_valid": "Vui lòng nhập đầu vào hợp lệ!",
+    "output_not_valid": "Vui lòng nhập đầu ra hợp lệ!",
+    "apply_effect": "áp dụng hiệu ứng",
+    "enter_the_text": "Vui lòng nhập văn bản để chuyển!",
+    "choose_voice": "Vui lòng chọn giọng!",
+    "convert": "Chuyển đổi {name}...",
+    "separator_music": "tách nhạc",
+    "notfound": "Không tìm thấy",
+    "turn_on_use_audio": "Vui lòng bật sử dụng âm thanh vừa tách để sử dụng",
+    "turn_off_convert_backup": "Tắt chuyển đổi giọng bè để có thể sử dụng giọng gốc",
+    "turn_off_merge_backup": "Tắt không kết hợp giọng bè để có thể sử dụng giọng gốc",
+    "not_found_original_vocal": "Không tìm thấy giọng gốc!",
+    "convert_vocal": "Đang chuyển đổi giọng nói...",
+    "convert_success": "Đã hoàn tất chuyển đổi giọng nói!",
+    "convert_backup": "Đang chuyển đổi giọng bè...",
+    "convert_backup_success": "Đã hoàn tất chuyển đổi giọng bè!",
+    "merge_backup": "Kết hợp giọng với giọng bè...",
+    "merge_success": "Kết hợp hoàn tất.",
+    "is_folder": "Đầu vào là một thư mục: Chuyển đổi tất cả tệp âm thanh trong thư mục...",
+    "not_found_in_folder": "Không tìm thấy tệp âm thanh trong thư mục!",
+    "batch_convert": "Đang chuyển đổi hàng loạt...",
+    "batch_convert_success": "Chuyển đổi hàng loạt hoàn tất!",
+    "create": "tạo",
+    "provide_name": "Vui lòng cung cấp tên mô hình.",
+    "not_found_data": "Không tìm thấy dữ liệu",
+    "not_found_data_preprocess": "Không tìm thấy dữ liệu được xử lý, vui lòng xử lý lại âm thanh",
+    "not_found_data_extract": "Không tìm thấy dữ liệu được trích xuất, vui lòng trích xuất lại âm thanh",
+    "provide_pretrained": "Vui lòng nhập huấn luyện {dg}",
+    "download_pretrained": "Tải xuống huấn luyện trước {dg}{rvc_version} gốc",
+    "not_found_pretrain": "Không tìm thấy huấn luyện trước {dg}",
+    "not_use_pretrain": "Sẽ không có huấn luyện trước được sử dụng",
+    "training": "huấn luyện",
+    "rick_roll": "Bấm vào đây nếu bạn muốn bị Rick Roll:) ---> [RickRoll]({rickroll})",
+    "terms_of_use": "**Vui lòng không sử dụng Dự án với bất kỳ mục đích nào vi phạm đạo đức, pháp luật, hoặc gây tổn hại đến cá nhân, tổ chức...**",
+    "exemption": "**Trong trường hợp người sử dụng không tuân thủ các điều khoản hoặc vi phạm, tôi sẽ không chịu trách nhiệm về bất kỳ khiếu nại, thiệt hại, hay trách nhiệm pháp lý nào, dù là trong hợp đồng, do sơ suất, hay các lý do khác, phát sinh từ, ngoài, hoặc liên quan đến phần mềm, việc sử dụng phần mềm hoặc các giao dịch khác liên quan đến phần mềm.**",
+    "separator_tab": "Tách Nhạc",
+    "4_part": "Một hệ thống tách nhạc đơn giản có thể tách được 4 phần: Nhạc, giọng, giọng chính, giọng bè",
+    "clear_audio": "Làm sạch âm thanh",
+    "separator_backing": "Tách giọng bè",
+    "denoise_mdx": "Khử tách MDX",
+    "use_mdx": "Sử dụng MDX",
+    "dereveb_audio": "Tách vang",
+    "dereveb_backing": "Tách vang bè",
+    "separator_model": "Mô hình tách nhạc",
+    "separator_backing_model": "Mô hình tách bè",
+    "shift": "Số lượng dự đoán",
+    "shift_info": "Càng cao chất lượng càng tốt nhưng lâu hơn và tốn tài nguyên",
+    "segments_size": "Kích Thước Phân Đoạn",
+    "segments_size_info": "Càng cao chất lượng càng tốt nhưng tốn tài nguyên",
+    "batch_size": "Kích thước lô",
+    "batch_size_info": "Số lượng mẫu xử lý đồng thời trong một lần huấn luyện. Cao có thể gây tràn bộ nhớ",
+    "mdx_batch_size_info": "Số lượng mẫu được xử lý cùng một lúc. Việc chia thành các lô giúp tối ưu hóa quá trình tính toán. Lô quá lớn có thể làm tràn bộ nhớ, khi lô quá nhỏ sẽ làm giảm hiệu quả dùng tài nguyên",
+    "overlap": "Chồng chéo",
+    "overlap_info": "Số lượng chồng chéo giữa các cửa sổ dự đoán",
+    "export_format": "Định dạng âm thanh",
+    "export_info": "Định dạng âm thanh khi xuất tệp âm thanh ra",
+    "output_separator": "Âm thanh đã được tách",
+    "hop_length_info": "Khoảng thời gian chuyển cửa sổ phân tích khi thực hiện phép biến đổi. Giá trị nhỏ độ chi tiết cao nhưng cần tính toán nhiều hơn",
+    "drop_audio": "Thả âm thanh vào đây",
+    "drop_text": "Thả tệp văn bản vào đây",
+    "use_url": "Sử dụng đường dẫn youtube",
+    "url_audio": "Đường dẫn liên kết đến âm thanh",
+    "downloads": "Tải Xuống",
+    "clean_strength": "Mức độ làm sạch âm thanh",
+    "clean_strength_info": "Mức độ của bộ làm sạch âm thanh để lọc giọng hát khi xuất",
+    "input_output": "Đầu vào, đầu ra âm thanh",
+    "audio_path": "Đường dẫn đầu vào âm thanh",
+    "refresh": "Tải lại",
+    "output_folder": "Đường dẫn thư mục đầu ra âm thanh",
+    "output_folder_info": "Nhập đường dẫn thư mục âm thanh sẽ xuất ra ở đó",
+    "input_audio": "Đầu vào âm thanh",
+    "instruments": "Nhạc nền",
+    "original_vocal": "Giọng gốc",
+    "main_vocal": "Giọng chính",
+    "backing_vocal": "Giọng bè",
+    "convert_audio": "Chuyển Đổi Âm Thanh",
+    "convert_info": "Chuyển đổi âm thanh bằng mô hình giọng nói đã được huấn luyện",
+    "autotune": "Tự động điều chỉnh",
+    "use_audio": "Sử dụng âm thanh vừa tách",
+    "convert_original": "Chuyển đổi giọng gốc",
+    "convert_backing": "Chuyển đổi giọng bè",
+    "not_merge_backing": "Không kết hợp giọng bè",
+    "merge_instruments": "Kết hợp nhạc nền",
+    "pitch": "Cao độ",
+    "pitch_info": "Khuyến cáo: chỉnh lên 12 để chuyển giọng nam thành nữ và ngược lại",
+    "model_accordion": "Mô hình và chỉ mục",
+    "model_name": "Tệp mô hình",
+    "index_path": "Tệp chỉ mục",
+    "index_strength": "Ảnh hưởng của chỉ mục",
+    "index_strength_info": "Càng cao ảnh hưởng càng lớn. Tuy nhiên, việc chọn giá trị thấp hơn có thể giảm hiện tượng giả trong âm thanh",
+    "output_path": "Đường dẫn đầu ra âm thanh",
+    "output_path_info": "Nhập đường dẫn đầu ra(cứ để định dạng .wav khi chuyển đổi nó tự sửa)",
+    "setting": "Cài đặt chung",
+    "f0_method": "Phương pháp trích xuất",
+    "f0_method_info": "Phương pháp để trích xuất dữ liệu",
+    "f0_method_hybrid": "Phương pháp trích xuất HYBRID",
+    "f0_method_hybrid_info": "Sự kết hợp của hai hoặc nhiều loại trích xuất khác nhau",
+    "hubert_model": "Mô hình nhúng",
+    "hubert_info": "Mô hình được huấn luyện trước để giúp nhúng",
+    "modelname": "Tên của mô hình",
+    "modelname_info": "Nếu bạn có mô hình riêng chỉ cần tải và nhập tên của mô hình vào đây",
+    "split_audio": "Cắt âm thanh",
+    "autotune_rate": "Mức độ điều chỉnh",
+    "autotune_rate_info": "Mức độ điều chỉnh tự động",
+    "resample": "Lấy mẫu lại",
+    "resample_info": "Lấy mẫu lại sau xử lý đến tốc độ lấy mẫu cuối cùng, 0 có nghĩa là không lấy mẫu lại, LƯU Ý: MỘT SỐ ĐỊNH DẠNG KHÔNG HỖ TRỢ TỐC ĐỘ TRÊN 48000",
+    "filter_radius": "Lọc trung vị",
+    "filter_radius_info": "Nếu giá trị lớn hơn ba sẽ áp dụng tính năng lọc trung vị. Giá trị đại diện cho bán kính bộ lọc và có thể làm giảm hơi thở hoặc tắt thở.",
+    "rms_mix_rate": "Tỷ lệ trộn RMS",
+    "rms_mix_rate_info": "Xác định tỷ lệ pha trộn giữa năng lượng RMS của giọng gốc và giọng đã chuyển đổi",
+    "protect": "Bảo vệ phụ âm",
+    "protect_info": "Bảo vệ các phụ âm riêng biệt và âm thanh thở ngăn chặn việc rách điện âm và các hiện tượng giả khác. Việc chỉnh tối đa sẽ bảo vệ toàn diện. Việc giảm giá trị này có thể giảm độ bảo vệ, đồng thời có khả năng giảm thiểu hiệu ứng lập chỉ mục",
+    "output_convert": "Âm thanh đã được chuyển đổi",
+    "main_convert": "Chuyển đổi giọng chính",
+    "main_or_backing": "Giọng chính + Giọng bè",
+    "voice_or_instruments": "Giọng + Nhạc nền",
+    "convert_text": "Chuyển Đổi Văn Bản",
+    "convert_text_markdown": "## Chuyển Đổi Văn Bản Thành Giọng Nói",
+    "convert_text_markdown_2": "Chuyển văn bản thành giọng nói và đọc lại bằng mô hình giọng nói được huấn luyện",
+    "input_txt": "Nhập dữ liệu từ tệp văn bản txt",
+    "text_to_speech": "Văn bản cần đọc",
+    "voice_speed": "Tốc độ đọc",
+    "voice_speed_info": "Tốc độ đọc của giọng nói",
+    "tts_1": "1. Chuyển Đổi Văn Bản",
+    "tts_2": "2. Chuyển Đổi Giọng Nói",
+    "voice": "Giọng nói của các nước",
+    "output_tts": "Đường dẫn đầu ra giọng nói",
+    "output_tts_convert": "Đường dẫn đầu ra giọng chuyển đổi",
+    "tts_output": "Nhập đường dẫn đầu ra",
+    "output_tts_markdown": "Âm thanh chưa được chuyển đổi và âm thanh đã được chuyển đổi",
+    "output_text_to_speech": "Giọng được tạo bởi chuyển đổi văn bản thành giọng nói",
+    "output_file_tts_convert": "Giọng được chuyển đổi bởi mô hình",
+    "output_audio": "Đầu ra âm thanh",
+    "provide_output": "Nhập đường dẫn đầu ra",
+    "audio_effects": "Hiệu Ứng Âm Thanh",
+    "apply_audio_effects": "## Áp Dụng Thêm Hiệu Ứng Cho Âm Thanh",
+    "audio_effects_edit": "Chỉnh sửa thêm hiệu ứng cho âm thanh",
+    "reverb": "Hiệu ứng vọng âm",
+    "chorus": "Hiệu ứng hòa âm",
+    "delay": "Hiệu ứng độ trễ",
+    "more_option": "Tùy chọn thêm",
+    "phaser": "Hiệu ứng xoay pha",
+    "compressor": "Hiệu ứng nén",
+    "apply": "Áp dụng",
+    "reverb_freeze": "Chế độ đóng băng",
+    "reverb_freeze_info": "Tạo hiệu ứng vang liên tục khi bật chế độ này",
+    "room_size": "Kích thước phòng",
+    "room_size_info": "Điều chỉnh không gian của phòng để tạo độ vang",
+    "damping": "Giảm âm",
+    "damping_info": "Điều chỉnh độ hút âm, kiểm soát mức độ vang",
+    "wet_level": "Mức độ tín hiệu vang",
+    "wet_level_info": "Điều chỉnh mức độ của tín hiệu có hiệu ứng vọng âm",
+    "dry_level": "Mức độ tín hiệu gốc",
+    "dry_level_info": "Điều chỉnh mức độ của tín hiệu không có hiệu ứng",
+    "width": "Chiều rộng âm thanh",
+    "width_info": "Điều chỉnh độ rộng của không gian âm thanh",
+    "chorus_depth": "Độ sâu hòa âm",
+    "chorus_depth_info": "Điều chỉnh cường độ hòa âm, tạo ra cảm giác rộng cho âm thanh",
+    "chorus_rate_hz": "Tần số",
+    "chorus_rate_hz_info": "Điều chỉnh tốc độ dao động của hòa âm",
+    "chorus_mix": "Trộn tín hiệu",
+    "chorus_mix_info": "Điều chỉnh mức độ trộn giữa âm gốc và âm có hiệu ứng",
+    "chorus_centre_delay_ms": "Độ trễ trung tâm (mili giây)",
+    "chorus_centre_delay_ms_info": "Khoảng thời gian trễ giữa các kênh stereo để tạo hiệu ứng hòa âm",
+    "chorus_feedback": "Phản hồi",
+    "chorus_feedback_info": "Điều chỉnh lượng tín hiệu hiệu ứng được quay lại vào tín hiệu gốc",
+    "delay_seconds": "Thời gian trễ",
+    "delay_seconds_info": "Điều chỉnh khoảng thời gian trễ giữa âm gốc và âm có hiệu ứng",
+    "delay_feedback": "Phản hồi độ trễ",
+    "delay_feedback_info": "Điều chỉnh lượng tín hiệu được quay lại, tạo hiệu ứng lặp lại",
+    "delay_mix": "Trộn tín hiệu độ trễ",
+    "delay_mix_info": "Điều chỉnh mức độ trộn giữa âm gốc và âm trễ",
+    "fade": "Hiệu ứng mờ dần",
+    "bass_or_treble": "Âm trầm và âm cao",
+    "limiter": "Giới hạn ngưỡng",
+    "distortion": "Hiệu ứng nhiễu âm",
+    "gain": "Cường độ âm",
+    "bitcrush": "Hiệu ứng giảm bits",
+    "clipping": "Hiệu ứng méo âm",
+    "fade_in": "Hiệu ứng mờ dần vào (mili giây)",
+    "fade_in_info": "Thời gian mà âm thanh sẽ tăng dần từ mức 0 đến mức bình thường",
+    "fade_out": "Hiệu ứng mờ dần ra (mili giây)",
+    "fade_out_info": "Thời gian mà âm thanh sẽ giảm dần từ bình thường xuống mức 0",
+    "bass_boost": "Độ khuếch đại âm trầm (dB)",
+    "bass_boost_info": "Mức độ tăng cường âm trầm trong đoạn âm thanh",
+    "bass_frequency": "Tần số cắt của bộ lọc thông thấp (Hz)",
+    "bass_frequency_info": "Tần số bị giảm. Tần số thấp sẽ làm âm trầm rõ hơn",
+    "treble_boost": "Độ khuếch đại âm cao (dB)",
+    "treble_boost_info": "Mức độ tăng cường âm cao trong đoạn âm thanh",
+    "treble_frequency": "Tần số cắt của bộ lọc thông cao (Hz)",
+    "treble_frequency_info": "Tần số sẽ lọc bỏ. Tần số càng cao thì giữ lại âm càng cao",
+    "limiter_threshold_db": "Ngưỡng giới hạn",
+    "limiter_threshold_db_info": "Giới hạn mức độ âm thanh tối đa, ngăn không cho vượt quá ngưỡng",
+    "limiter_release_ms": "Thời gian thả",
+    "limiter_release_ms_info": "Khoảng thời gian để âm thanh trở lại sau khi bị giới hạn (Mili Giây)",
+    "distortion_info": "Điều chỉnh mức độ nhiễu âm, tạo hiệu ứng méo tiếng",
+    "gain_info": "Tăng giảm âm lượng của tín hiệu",
+    "clipping_threshold_db": "Ngưỡng cắt",
+    "clipping_threshold_db_info": "Cắt bớt tín hiệu vượt quá ngưỡng, tạo âm thanh méo",
+    "bitcrush_bit_depth": "Độ sâu bit",
+    "bitcrush_bit_depth_info": "Giảm chất lượng âm thanh bằng cách giảm số bit, tạo hiệu ứng âm thanh bị méo",
+    "phaser_depth": "Độ sâu",
+    "phaser_depth_info": "Điều chỉnh độ sâu của hiệu ứng, ảnh hưởng đến cường độ của hiệu ứng xoay pha",
+    "phaser_rate_hz": "Tần số",
+    "phaser_rate_hz_info": "Điều chỉnh tốc độ của hiệu ứng xoay pha",
+    "phaser_mix": "Trộn tín hiệu",
+    "phaser_mix_info": "Điều chỉnh mức độ trộn giữa tín hiệu gốc và tín hiệu đã qua xử lý",
+    "phaser_centre_frequency_hz": "Tần số trung tâm",
+    "phaser_centre_frequency_hz_info": "Tần số trung tâm của hiệu ứng xoay pha, ảnh hưởng đến tần số bị điều chỉnh",
+    "phaser_feedback": "Phản hồi",
+    "phaser_feedback_info": "Điều chỉnh lượng phản hồi tín hiệu, tạo cảm giác xoay pha mạnh hoặc nhẹ",
+    "compressor_threshold_db": "Ngưỡng nén",
+    "compressor_threshold_db_info": "Ngưỡng mức âm thanh sẽ bị nén khi vượt qua ngưỡng này",
+    "compressor_ratio": "Tỉ lệ nén",
+    "compressor_ratio_info": "Điều chỉnh mức độ nén âm thanh khi vượt qua ngưỡng",
+    "compressor_attack_ms": "Thời gian tấn công (mili giây)",
+    "compressor_attack_ms_info": "Khoảng thời gian nén bắt đầu tác dụng sau khi âm thanh vượt ngưỡng",
+    "compressor_release_ms": "Thời gian thả",
+    "compressor_release_ms_info": "Thời gian để âm thanh trở lại trạng thái bình thường sau khi bị nén",
+    "create_dataset_url": "Đường dẫn liên kết đến âm thanh(sử dụng dấu , để sử dụng nhiều liên kết)",
+    "createdataset": "Tạo dữ liệu",
+    "create_dataset_markdown": "## Tạo Dữ Liệu Huấn Luyện Từ Youtube",
+    "create_dataset_markdown_2": "Xử lý và tạo tập tin dữ liệu huấn luyện bằng đường dẫn youtube",
+    "denoise": "Khử tách mô hình",
+    "skip": "Bỏ qua giây",
+    "model_ver": "Phiên bản tách giọng",
+    "model_ver_info": "Phiên bản của mô hình tách nhạc để tách giọng",
+    "create_dataset_info": "Thông tin tạo dữ liệu",
+    "output_data": "Đầu ra dữ liệu",
+    "output_data_info": "Đầu ra dữ liệu sau khi tạo xong dữ liệu",
+    "skip_start": "Bỏ qua phần đầu",
+    "skip_start_info": "Bỏ qua số giây đầu của âm thanh, dùng dấu , để sử dụng cho nhiều âm thanh",
+    "skip_end": "Bỏ qua phần cuối",
+    "skip_end_info": "Bỏ qua số giây cuối của âm thanh, dùng dấu , để sử dụng cho nhiều âm thanh",
+    "training_model": "Huấn Luyện Mô Hình",
+    "training_markdown": "Huấn luyện và đào tạo mô hình giọng nói bằng một lượng dữ liệu giọng nói",
+    "training_model_name": "Tên của mô hình khi huấn luyện(không sử dụng ký tự đặc biệt hay dấu cách)",
+    "sample_rate": "Tỉ lệ lấy mẫu",
+    "sample_rate_info": "Tỉ lệ lấy mẫu của mô hình",
+    "training_version": "Phiên bản mô hình",
+    "training_version_info": "Phiên bản mô hình khi huấn luyện",
+    "training_pitch": "Huấn luyện cao độ",
+    "upload_dataset": "Tải lên dữ liệu huấn luyện",
+    "preprocess_effect": "Xử lý hậu kỳ",
+    "clear_dataset": "Làm sạch dữ liệu",
+    "preprocess_info": "Thông tin phần xử lý trước",
+    "preprocess_button": "1. Xử lý dữ liệu",
+    "extract_button": "2. Trích xuất dữ liệu",
+    "extract_info": "Thông tin phần trích xuất dữ liệu",
+    "total_epoch": "Tổng số kỷ nguyên",
+    "total_epoch_info": "Tổng số kỷ nguyên huấn luyện đào tạo",
+    "save_epoch": "Tần suất lưu",
+    "save_epoch_info": "Tần suất lưu mô hình khi huấn luyện, giúp việc huấn luyện lại mô hình",
+    "create_index": "Tạo chỉ mục",
+    "index_algorithm": "Thuật toán chỉ mục",
+    "index_algorithm_info": "Thuật toán tạo chỉ mục",
+    "custom_dataset": "Tùy chọn thư mục",
+    "custom_dataset_info": "Tùy chọn thư mục dữ liệu huấn luyện",
+    "overtraining_detector": "Kiểm tra quá sức",
+    "overtraining_detector_info": "Kiểm tra huấn luyện mô hình quá sức",
+    "cleanup_training": "Làm sạch huấn luyện",
+    "cleanup_training_info": "Dọn dẹp và huấn luyện lại từ đầu",
+    "cache_in_gpu": "Lưu mô hình vào đệm",
+    "cache_in_gpu_info": "Lưu mô hình vào bộ nhớ đệm gpu",
+    "dataset_folder": "Thư mục chứa dữ liệu",
+    "threshold": "Ngưỡng huấn luyện quá sức",
+    "setting_cpu_gpu": "Tùy chọn CPU/GPU",
+    "gpu_number": "Số gpu được sử dụng",
+    "gpu_number_info": "Số thứ tự của GPU được sử dụng trong huấn luyện. (Lưu ý: GPU AMD không được hỗ trợ huấn luyện đa GPU)",
+    "save_only_latest": "Chỉ lưu mới nhất",
+    "save_only_latest_info": "Chỉ lưu mô hình D và G mới nhất",
+    "save_every_weights": "Lưu mọi mô hình",
+    "save_every_weights_info": "Lưu mọi mô hình sau mỗi lượt kỷ nguyên",
+    "gpu_info": "Thông tin của GPU",
+    "gpu_info_2": "Thông tin của GPU được sử dụng trong huấn luyện",
+    "cpu_core": "Số lõi xử lý có thể sử dụng",
+    "cpu_core_info": "Số lõi được sử dụng trong việc huấn luyện",
+    "not_use_pretrain_2": "Không dùng huấn luyện",
+    "not_use_pretrain_info": "Không dùng huấn luyện trước",
+    "custom_pretrain": "Tùy chỉnh huấn luyện",
+    "custom_pretrain_info": "Tùy chỉnh huấn luyện trước",
+    "pretrain_file": "Tệp mô hình huấn luyện trước {dg}",
+    "train_info": "Thông tin phần huấn luyện",
+    "export_model": "5. Xuất Mô hình",
+    "zip_model": "2. Nén mô hình",
+    "output_zip": "Đầu ra tệp khi nén",
+    "model_path": "Đường dẫn mô hình",
+    "model_ratio": "Tỉ lệ mô hình",
+    "model_ratio_info": "Chỉnh hướng về bên nào sẽ làm cho mô hình giống với bên đó",
+    "output_model_path": "Đầu ra mô hình",
+    "fushion": "Dung Hợp Mô Hình",
+    "fushion_markdown": "## Dung Hợp Hai Mô Hình Với Nhau",
+    "fushion_markdown_2": "Dung hợp hai mô hình giọng nói lại với nhau để tạo thành một mô hình duy nhất",
+    "read_model": "Đọc Thông Tin",
+    "read_model_markdown": "## Đọc Thông Tin Của Mô Hình",
+    "read_model_markdown_2": "Đọc các thông tin được ghi trong mô hình",
+    "drop_model": "Thả mô hình vào đây",
+    "readmodel": "Đọc mô hình",
+    "model_path_info": "Nhập đường dẫn đến tệp mô hình",
+    "modelinfo": "Thông Tin Mô Hình",
+    "download_markdown": "## Tải Xuống Mô Hình",
+    "download_markdown_2": "Tải xuống mô hình giọng nói, mô hình huấn luyện trước, mô hình nhúng",
+    "model_download": "Tải xuống mô hình giọng nói",
+    "model_url": "Đường dẫn liên kết đến mô hình",
+    "30s": "Vui lòng đợi khoảng 30 giây. Hệ thống sẽ tự khởi động lại!",
+    "model_download_select": "Chọn cách tải mô hình",
+    "model_warehouse": "Kho mô hình",
+    "get_model": "Nhận mô hình",
+    "name_to_search": "Tên để tìm kiếm",
+    "search_2": "Tìm kiếm",
+    "select_download_model": "Chọn mô hình đã được tìm kiếm(Bấm vào để chọn)",
+    "download_pretrained_2": "Tải xuống mô hình huấn luyện trước",
+    "pretrained_url": "Đường dẫn liên kết đến mô hình huấn luyện trước {dg}",
+    "select_pretrain": "Chọn mô hình huấn luyện trước",
+    "select_pretrain_info": "Chọn mô hình huấn luyện trước để cài đặt về",
+    "pretrain_sr": "Tốc độ lấy mẫu của mô hình",
+    "drop_pretrain": "Thả mô hình huấn luyện trước {dg} vào đây",
+    "settings": "Tùy Chỉnh",
+    "settings_markdown": "## Tùy Chỉnh Thêm",
+    "settings_markdown_2": "Tùy chỉnh thêm một số tính năng của dự án",
+    "lang": "Ngôn ngữ",
+    "lang_restart": "Ngôn ngữ được hiển thị trong dự án(Khi đổi ngôn ngữ hệ thống sẽ tự khởi động lại sau 30 giây để cập nhật)",
+    "change_lang": "Đổi Ngôn Ngữ",
+    "theme": "Chủ đề",
+    "theme_restart": "Loại Chủ đề của giao diện được hiển thị(Khi đổi chủ đề hệ thống sẽ tự khởi động lại sau 30 giây để cập nhật)",
+    "theme_button": "Đổi Chủ Đề",
+    "change_light_dark": "Đổi Chế Độ Sáng/Tối",
+    "tensorboard_url": "Đường dẫn biểu đồ",
+    "errors_loading_audio": "Lỗi khi tải âm thanh",
+    "apply_error": "Đã xảy ra lỗi khi áp dụng hiệu ứng: {e}",
+    "indexpath": "Đường dẫn chỉ mục",
+    "split_total": "Tổng số phần đã cắt",
+    "process_audio_error": "Đã xảy ra lỗi khi xử lý âm thanh",
+    "merge_error": "Đã xảy ra lỗi khi ghép âm thanh",
+    "not_found_convert_file": "Không tìm thấy tệp đã xử lý",
+    "convert_batch": "Chuyển đổi hàng loạt...",
+    "found_audio": "Tìm thấy {audio_files} tệp âm thanh cho việc chuyển đổi.",
+    "not_found_audio": "Không tìm thấy tệp âm thanh!",
+    "error_convert": "Đã xảy ra lỗi khi chuyển đổi âm thanh: {e}",
+    "convert_batch_success": "Đã chuyển đổi hàng loạt hoàn tất sau {elapsed_time} giây. Đầu ra {output_path}",
+    "convert_audio_success": "Tệp {input_path} được chuyển đổi hoàn tất sau {elapsed_time} giây. Đầu ra {output_path}",
+    "read_faiss_index_error": "Đã xảy ra lỗi khi đọc chỉ mục FAISS: {e}",
+    "read_model_error": "Thất bại khi tải mô hình: {e}",
+    "starting_download": "Bắt đầu tải xuống",
+    "version_not_valid": "Phiên bản tách giọng không hợp lệ",
+    "skip<audio": "Không thể bỏ qua vì số lượng thời gian bỏ qua thấp hơn số lượng tệp âm thanh",
+    "skip>audio": "Không thể bỏ qua vì số lượng thời gian bỏ qua cao hơn số lượng tệp âm thanh",
+    "=<0": "Thời gian bỏ qua bé hơn hoặc bằng 0 nên bỏ qua",
+    "skip_warning": "Thời lượng bỏ qua ({seconds} giây) vượt quá thời lượng âm thanh ({total_duration} giây). Bỏ qua.",
+    "download_success": "Đã tải xuống hoàn tất",
+    "create_dataset_error": "Đã xảy ra lỗi khi tạo dữ liệu huấn luyện",
+    "create_dataset_success": "Quá trình tạo dữ liệu huấn luyện đã hoàn tất sau: {elapsed_time} giây",
+    "skip_start_audio": "Bỏ qua âm thanh đầu hoàn tất: {input_file}",
+    "skip_end_audio": "Bỏ qua âm thanh cuối hoàn tất: {input_file}",
+    "merge_audio": "Đã ghép các phần chứa âm thanh lại",
+    "separator_process": "Đang tách giọng: {input}...",
+    "not_found_main_vocal": "Không tìm thấy giọng chính!",
+    "not_found_backing_vocal": "Không tìm thấy giọng bè!",
+    "not_found_instruments": "Không tìm thấy nhạc nền",
+    "merge_instruments_process": "Kết hợp giọng với nhạc nền...",
+    "dereverb": "Đang tách âm vang",
+    "dereverb_success": "Đã tách âm vang hoàn tất",
+    "save_index": "Đã lưu tệp chỉ mục",
+    "create_index_error": "Đã xảy ra lỗi khi tạo chỉ mục",
+    "sr_not_16000": "Tỉ lệ mẫu phải là 16000",
+    "extract_file_error": "Đã xảy ra lỗi khi giải nén tập tin",
+    "extract_f0_method": "Bắt đầu trích xuất cao độ với {num_processes} lõi với phương pháp trích xuất {f0_method}...",
+    "extract_f0": "Trích Xuất Cao Độ",
+    "extract_f0_success": "Quá trình trích xuất cao độ đã hoàn tất sau {elapsed_time} giây.",
+    "NaN": "chứa giá trị NaN và sẽ bị bỏ qua.",
+    "start_extract_hubert": "Đang bắt đầu trích xuất nhúng...",
+    "process_error": "Đã xảy ra lỗi khi xử lý",
+    "extract_hubert_success": "Quá trình trích xuất nhúng đã hoàn tất trong {elapsed_time} giây.",
+    "export_process": "Đường dẫn của mô hình",
+    "extract_error": "Đã xảy ra lỗi khi trích xuất dữ liệu",
+    "extract_success": "Đã trích xuất hoàn tất mô hình",
+    "start_preprocess": "Đang bắt đầu xử lý dữ liệu với {num_processes} lõi xử lý...",
+    "not_integer": "Thư mục ID giọng nói phải là số nguyên, thay vào đó có",
+    "preprocess_success": "Quá trình xử lý hoàn tất sau {elapsed_time} giây.",
+    "preprocess_model_success": "Đã hoàn tất xử lý trước dữ liệu cho mô hình",
+    "turn_on_dereverb": "Điều kiện cần để sử dụng tách vang giọng bè là phải bật tách vang",
+    "turn_on_separator_backing": "Điều kiện cần để sử dụng tách vang giọng bè là phải bật tách bè",
+    "backing_model_ver": "Phiên bản mô hình của tách bè",
+    "clean_audio_success": "Đã làm sạch âm hoàn tất!",
+    "separator_error": "Đã xảy ra lỗi khi tách nhạc",
+    "separator_success": "Quá trình tách nhạc đã hoàn tất sau: {elapsed_time} giây",
+    "separator_process_2": "Đang xử lý tách nhạc",
+    "separator_success_2": "Đã tách nhạc hoàn tất!",
+    "separator_process_backing": "Đang xử lý tách giọng bè",
+    "separator_process_backing_success": "Đã tách giọng bè hoàn tất!",
+    "process_original": "Đang xử lý tách âm vang giọng gốc...",
+    "process_original_success": "Đã tách âm vang giọng gốc hoàn tất!",
+    "process_main": "Đang xử lý tách âm vang giọng chính...",
+    "process_main_success": "Đã tách âm vang giọng chính hoàn tất!",
+    "process_backing": "Đang xử lý tách âm vang giọng bè...",
+    "process_backing_success": "Đã tách âm vang giọng bè hoàn tất!",
+    "save_every_epoch": "Lưu mô hình sau: ",
+    "total_e": "Tổng số kỷ nguyên huấn luyện: ",
+    "dorg": "Huấn luyện trước G: {pretrainG} | Huấn luyện trước D: {pretrainD}",
+    "training_f0": "Huấn luyện cao độ",
+    "not_gpu": "Không phát hiện thấy GPU, hoàn nguyên về CPU (không khuyến nghị)",
+    "not_found_checkpoint": "Không tìm thấy tệp điểm đã lưu: {checkpoint_path}",
+    "save_checkpoint": "Đã tải lại điểm đã lưu '{checkpoint_path}' (kỷ nguyên {checkpoint_dict})",
+    "save_model": "Đã lưu mô hình '{checkpoint_path}' (kỷ nguyên {iteration})",
+    "sr_does_not_match": "{sample_rate} Tỉ lệ mẫu không khớp với mục tiêu {sample_rate2} Tỉ lệ mẫu",
+    "time_or_speed_training": "thời gian={current_time} | tốc độ huấn luyện={elapsed_time_str}",
+    "savemodel": "Đã lưu mô hình '{model_dir}' (kỷ nguyên {epoch} và bước {step})",
+    "model_author": "Ghi công mô hình cho {model_author}",
+    "unregistered": "Mô hình không được ghi chép",
+    "not_author": "Mô hình không được ghi chép",
+    "training_author": "Tên chủ mô hình",
+    "training_author_info": "Nếu bạn muốn ghi công mô hình hãy nhập tên của bạn vào đây",
+    "extract_model_error": "Đã xảy ra lỗi khi trích xuất mô hình",
+    "start_training": "Bắt đầu huấn luyện",
+    "import_pretrain": "Đã nạp huấn luyện trước ({dg}) '{pretrain}'",
+    "not_using_pretrain": "Sẽ không có huấn luyện trước ({dg}) được sử dụng",
+    "overtraining_find": "Tập luyện quá sức được phát hiện ở kỷ nguyên {epoch} với mất mát g được làm mịn {smoothed_value_gen} và mất mát d được làm mịn {smoothed_value_disc}",
+    "best_epoch": "Kỷ nguyên mới tốt nhất {epoch} với mất mát g được làm mịn {smoothed_value_gen} và mất mát d được làm mịn {smoothed_value_disc}",
+    "success_training": "Đã đào tạo hoàn tất với {epoch} kỷ nguyên, {global_step} các bước và {loss_gen_all} mất mát gen.",
+    "training_info": "Tổn thất gen thấp nhất: {lowest_value_rounded} ở kỷ nguyên {lowest_value_epoch}, bước {lowest_value_step}",
+    "model_training_info": "{model_name} | kỷ nguyên={epoch} | bước={global_step} | {epoch_recorder} | giá trị thấp nhất={lowest_value_rounded} (kỷ nguyên {lowest_value_epoch} và bước {lowest_value_step}) | Số kỷ nguyên còn lại để tập luyện quá sức: g/total: {remaining_epochs_gen} d/total: {remaining_epochs_disc} | làm mịn mất mát gen={smoothed_value_gen} | làm mịn mất mát disc={smoothed_value_disc}",
+    "model_training_info_2": "{model_name} | kỷ nguyên={epoch} | bước={global_step} | {epoch_recorder} | giá trị thấp nhất={lowest_value_rounded} (kỷ nguyên {lowest_value_epoch} và bước {lowest_value_step})",
+    "model_training_info_3": "{model_name} | kỷ nguyên={epoch} | bước={global_step} | {epoch_recorder}",
+    "training_error": "Đã xảy ra lỗi khi huấn luyện mô hình:",
+    "separator_info": "Đang khởi tạo với đường dẫn đầu ra: {output_dir}, định dạng đầu ra: {output_format}",
+    "none_ffmpeg": "FFmpeg chưa được cài đặt. Vui lòng cài đặt FFmpeg để sử dụng gói này.",
+    "running_in_cpu": "Không thể cấu hình khả năng tăng tốc phần cứng, chạy ở chế độ CPU",
+    "running_in_cuda": "CUDA có sẵn trong Torch, cài đặt thiết bị Torch thành CUDA",
+    "running_in_amd": "AMD có sẵn trong Torch, cài đặt thiết bị Torch thành AMD",
+    "onnx_have": "ONNXruntime có sẵn {have}, cho phép tăng tốc",
+    "onnx_not_have": "{have} không có sẵn trong ONNXruntime, do đó khả năng tăng tốc sẽ KHÔNG được bật",
+    "download_error": "Không tải được tệp xuống từ {url}, mã phản hồi: {status_code}",
+    "vip_print": "Này bạn, nếu bạn chưa đăng ký, vui lòng cân nhắc việc hỗ trợ cho nhà phát triển của UVR, Anjok07 bằng cách đăng ký tại đây: https://patreon.com/uvr",
+    "loading_model": "Đang tải mô hình {model_filename}...",
+    "model_type_not_support": "Loại mô hình không được hỗ trợ: {model_type}",
+    "starting_separator": "Bắt đầu quá trình tách cho đường dẫn tập tin âm thanh",
+    "separator_success_3": "Quá trình tách hoàn tất.",
+    "separator_duration": "Thời gian tách",
+    "dims": "Không thể sử dụng mã hóa vị trí sin/cos với thứ nguyên lẻ (có dim={dims})",
+    "activation": "kích hoạt phải là relu/gelu, không phải {activation}",
+    "length_or_training_length": "Độ dài cho trước {length} dài hơn thời lượng huấn luyện {training_length}",
+    "type_not_valid": "Loại không hợp lệ cho",
+    "del_parameter": "Bỏ tham số không tồn tại ",
+    "convert_shape": "Hình dạng hỗn hợp chuyển đổi: {shape}",
+    "not_success": "Quá trình đăng không hoàn tất: ",
+    "resample_error": "Lỗi trong quá trình lấy mẫu lại",
+    "shapes": "Hình dạng",
+    "wav_resolution": "Loại độ phân giải",
+    "warnings": "Cảnh báo: Đã phát hiện các giá trị cực kỳ hung hãn",
+    "warnings_2": "Cảnh báo: Đã phát hiện NaN hoặc giá trị vô hạn trong đầu vào sóng. Hình dạng",
+    "process_file": "Đang xử lý tập tin... \n",
+    "save_instruments": "Lưu bản nhạc ngược...",
+    "assert": "Các tệp âm thanh phải có hình dạng giống nhau - Mix: {mixshape}, Inst: {instrumentalshape}",
+    "rubberband": "Không thể thực thi Rubberband. Vui lòng xác minh rằng Rubberband-cli đã được cài đặt.",
+    "rate": "Tỉ lệ phải lớn hơn 0",
+    "gdown_error": "Không thể truy xuất liên kết công khai của tệp. Bạn có thể cần phải thay đổi quyền thành bất kỳ ai có liên kết hoặc đã có nhiều quyền truy cập.",
+    "gdown_value_error": "Phải chỉ định đường dẫn hoặc id",
+    "missing_url": "Thiếu đường dẫn",
+    "mac_not_match": "MAC không khớp",
+    "file_not_access": "Tệp tin không thể truy cập",
+    "int_resp==-3": "Yêu cầu không hoàn tất, đang thử lại",
+    "search_separate": "Tìm bản tách...",
+    "found_choice": "Tìm thấy {choice}",
+    "separator==0": "Không tìm thấy bản tách nào!",
+    "select_separate": "Chọn bản tách",
+    "start_app": "Khởi động giao diện...",
+    "provide_audio": "Nhập đường dẫn đến tệp âm thanh",
+    "set_torch_mps": "Cài đặt thiết bị Torch thành MPS",
+    "googletts": "Chuyển đổi văn bản bằng google",
+    "pitch_info_2": "Cao độ giọng nói của bộ chuyển đổi văn bản",
+    "waveform": "Dạng sóng phải có hình dạng (# khung, # kênh)",
+    "freq_mask_smooth_hz": "freq_mask_smooth_hz cần ít nhất là {hz}Hz",
+    "time_mask_smooth_ms": "time_mask_smooth_ms cần ít nhất là {ms}ms",
+    "x": "x phải lớn hơn",
+    "xn": "xn phải lớn hơn",
+    "not_found_pid": "Không thấy tiến trình nào!",
+    "end_pid": "Đã kết thúc tiến trình!",
+    "not_found_separate_model": "Không tìm thấy tệp mô hình tách nhạc nào!",
+    "not_found_pretrained": "Không tìm thấy tệp mô hình huấn luyện trước nào!",
+    "not_found_log": "Không tìm thấy tệp nhật ký nào!",
+    "not_found_predictors": "Không tìm thấy tệp mô hình dự đoán nào!",
+    "not_found_embedders": "Không tìm thấy tệp mô hình nhúng nào!",
+    "provide_folder": "Vui lòng cung cấp thư mục hợp lệ!",
+    "empty_folder": "Thư mục dữ liệu trống!",
+    "vocoder": "Bộ mã hóa",
+    "vocoder_info": "Bộ mã hóa giọng nói dùng để phân tích và tổng hợp tín hiệu giọng nói của con người để chuyển đổi giọng nói.\n\nDefault: Tùy chọn này là HiFi-GAN-NSF, tương thích với tất cả các RVC\n\nMRF-HiFi-GAN: Độ trung thực cao hơn.\n\nRefineGAN: Chất lượng âm thanh vượt trội.",
+    "code_error": "Lỗi: Nhận mã trạng thái",
+    "json_error": "Lỗi: Không thể phân tích từ phản hồi.",
+    "requests_error": "Yêu cầu thất bại: {e}",
+    "memory_efficient_training": "Sử dụng hiệu quả bộ nhớ",
+    "not_use_pretrain_error_download": "Sẽ không dùng huấn luyện trước vì không có mô hình",
+    "provide_file_settings": "Vui lòng cung cấp tệp cài đặt trước!",
+    "load_presets": "Đã tải tệp cài đặt trước {presets}",
+    "provide_filename_settings": "Vui lòng cung cấp tên tệp cài đặt trước!",
+    "choose1": "Vui lòng chọn 1 để xuất!",
+    "export_settings": "Đã xuất tệp cài đặt trước {name}",
+    "use_presets": "Sử dụng tệp cài đặt trước",
+    "file_preset": "Tệp cài đặt trước",
+    "load_file": "Tải tệp",
+    "export_file": "Xuất tệp cài đặt trước",
+    "save_clean": "Lưu làm sạch",
+    "save_autotune": "Lưu tự điều chỉnh",
+    "save_pitch": "Lưu cao độ",
+    "save_index_2": "Lưu ảnh hưởng chỉ mục",
+    "save_resample": "Lưu lấy mẫu lại",
+    "save_filter": "Lưu trung vị",
+    "save_envelope": "Lưu đường bao âm",
+    "save_protect": "Lưu bảo vệ âm",
+    "save_split": "Lưu cắt âm",
+    "filename_to_save": "Tên khi lưu tệp",
+    "upload_presets": "Tải lên tệp cài đặt",
+    "stop": "Dừng tiến trình",
+    "stop_separate": "Dừng Tách Nhạc",
+    "stop_convert": "Dừng Chuyển Đổi",
+    "stop_create_dataset": "Dừng Tạo Dữ Liệu",
+    "stop_training": "Dừng Huấn Luyện",
+    "stop_preprocess": "Dừng Xử Lí Dữ Liệu",
+    "stop_extract": "Dừng Trích Xuất Dữ Liệu",
+    "not_found_presets": "Không tìm thấy tệp cài đặt sẵn nào trong thư mục!",
+    "port": "Cổng {port} không thể dùng! Giảm cổng xuống một...",
+    "empty_json": "{file}: Bị lỗi hoặc trống",
+    "thank": "Cảm ơn bạn đã báo cáo lỗi và cũng xin lỗi bạn vì sự bất tiện do lỗi gây ra này!",
+    "error_read_log": "Đã xảy ra lỗi khi đọc các tệp nhật ký!",
+    "error_send": "Đã xảy ra lỗi khi gửi báo cáo! Hãy liên hệ tôi qua Discord: pham_huynh_anh!",
+    "report_bugs": "Báo Cáo Lỗi",
+    "agree_log": "Đồng ý cung cấp tất cả tệp nhật ký",
+    "error_info": "Mô tả lỗi",
+    "error_info_2": "Cung cấp thêm thông tin về lỗi",
+    "report_bug_info": "Báo cáo các lỗi xảy ra khi sử dụng chương trình",
+    "sr_info": "LƯU Ý: MỘT SỐ ĐỊNH DẠNG KHÔNG HỖ TRỢ TRÊN 48000",
+    "report_info": "Nếu được bạn hãy đồng ý cung cấp các tệp nhật ký để hỗ trợ quá trình sửa lỗi\n\nNếu không cung cấp các tệp nhật ký bạn hãy mô tả chi tiết lỗi, lỗi xảy ra khi nào ở đâu\n\nNếu hệ thống báo cáo này bị lỗi nốt thì bạn có thể liên hệ qua [ISSUE]({github}) hoặc discord: `pham_huynh_anh`",
+    "default_setting": "Đã xảy ra lỗi khi sử dụng tách, đặt tất cả cài đặt về mặc định...",
+    "dataset_folder1": "Vui lòng nhập tên thư mục dữ liệu",
+    "checkpointing_err": "Các tham số của mô hình đào tạo trước như tốc độ mẫu hoặc kiến trúc không khớp với mô hình đã chọn.",
+    "start_onnx_export": "Bắt đầu chuyển đổi mô hình sang dạng onnx...",
+    "convert_model": "Chuyển Đổi Mô Hình",
+    "pytorch2onnx": "Chuyển Đổi Mô Hình PYTORCH Sang ONNX",
+    "pytorch2onnx_markdown": "Chuyển đổi mô hình RVC từ dạng pytorch sang onnx để tối ưu cho việc chuyển đổi âm thanh",
+    "error_readfile": "Đã xảy ra lỗi khi đọc tệp!",
+    "f0_onnx_mode": "Chế độ F0 ONNX",
+    "f0_onnx_mode_info": "Trích xuất cao độ bằng mô hình ONNX có thể giúp tăng tốc độ",
+    "formantshift": "Dịch chuyển cao độ và âm sắc",
+    "formant_qfrency": "Tần số cho dịch chuyển định dạng",
+    "formant_timbre": "Âm sắc để chuyển đổi định dạng",
+    "time_frames": "Thời Gian (Khung)",
+    "Frequency": "Tần Số (Hz)",
+    "f0_extractor_tab": "Trích xuất F0",
+    "f0_extractor_markdown": "## Trích Xuất Cao Độ",
+    "f0_extractor_markdown_2": "Trích xuất cao độ F0 nhằm mục đích sử dụng cho suy luận chuyển đổi âm thanh",
+    "start_extract": "Bắt đầu quá trình trích xuất...",
+    "extract_done": "Hoàn tất quá trình trích xuất!",
+    "f0_file": "Sử dụng tệp F0 trích xuất trước",
+    "upload_f0": "Tải lên tệp F0",
+    "f0_file_2": "Tệp F0",
+    "clean_f0_file": "Dọn dẹp tệp F0",
+    "embed_mode": "Chế độ nhúng",
+    "embed_mode_info": "Trích xuất nhúng bằng các mô hình khác nhau",
+    "close": "Ứng dụng đang tắt...",
+    "start_whisper": "Bắt đầu nhận dạng giọng nói bằng Whisper...",
+    "whisper_done": "Đã nhận dạng giọng nói hoàn tất!",
+    "process_audio": "Xử lí trước âm thanh...",
+    "process_done_start_convert": "Hoàn tất xử lí âm thanh! Tiến hành chuyển đổi âm thanh...",
+    "convert_with_whisper": "Chuyển Đổi Âm Thanh Với Whisper",
+    "convert_with_whisper_info": "Chuyển đổi âm thanh bằng mô hình giọng nói đã được huấn luyện kèm với mô hình Whisper để nhận diện giọng nói\n\nWhisper sẽ nhận dạng các giọng nói khác nhau sau đó cắt các giọng riêng ra rồi dùng mô hình RVC để chuyển đổi lại các phân đoạn đó\n\nMô hình Whisper có thể hoạt động không chính xác làm cho đầu ra có thể kì lạ",
+    "num_spk": "Số lượng giọng",
+    "num_spk_info": "Số lượng giọng nói có trong âm thanh",
+    "model_size": "Kích thước mô hình Whisper",
+    "model_size_info": "Kích thước mô hình Whisper\n\nCác mô hình large có thể đưa ra các đầu ra kì lạ",
+    "title": "Công cụ huấn luyện, chuyển đổi nhạc cụ và giọng nói chất lượng và hiệu suất cao đơn giản dành cho người Việt",
+    "fp16_not_support": "CPU, MPS và OCL không hỗ trợ tốt fp16, chuyển đổi fp16 -> fp32",
+    "precision": "Độ chính xác",
+    "precision_info": "Độ chính xác của suy luận và huấn luyện mô hình\n\nLưu ý: CPU không hỗ trợ fp16",
+    "update_precision": "Cập Nhật Độ Chính Xác",
+    "start_update_precision": "Bắt đầu cập nhật độ chính xác",
+    "deterministic": "Thuật toán xác định",
+    "deterministic_info": "Khi bật sẽ sử dụng các thuật toán có tính xác định cao, đảm bảo rằng mỗi lần chạy cùng một dữ liệu đầu vào sẽ cho kết quả giống nhau.\n\nKhi tắt có thể chọn các thuật toán tối ưu hơn nhưng có thể không hoàn toàn xác định, dẫn đến kết quả huấn luyện có sự khác biệt giữa các lần chạy.",
+    "benchmark": "Thuật toán điểm chuẩn",
+    "benchmark_info": "Khi bật sẽ thử nghiệm và chọn thuật toán tối ưu nhất cho phần cứng và kích thước cụ thể. Điều này có thể giúp tăng tốc độ huấn luyện.\n\nKhi tắt sẽ không thực hiện tối ưu thuật toán này, có thể làm giảm tốc độ nhưng đảm bảo rằng mỗi lần chạy sử dụng cùng một thuật toán, điều này hữu ích nếu bạn muốn tái tạo chính xác.",
+    "font": "Phông chữ",
+    "font_info": "Phông chữ của giao diện\n\nTruy cập vào [Google Font](https://fonts.google.com) để lựa phông yêu thích của bạn.",
+    "change_font": "Đổi Phông Chữ",
+    "f0_unlock": "Mở khóa tất cả",
+    "f0_unlock_info": "Mở khóa toàn bộ phương pháp trích xuất cao độ",
+    "srt": "Tệp SRT trống hoặc bị lỗi!",
+    "optimizer": "Trình Tối Ưu Hóa",
+    "optimizer_info": "Trình tối ưu hóa trong huấn luyện, AdamW là mặc định, RAdam là một trình tối ưu khác",
+    "main_volume": "Âm lượng tệp âm thanh chính",
+    "main_volume_info": "Âm lượng tệp âm thanh chính. Nên để từ -4 đến 0.",
+    "combination_volume": "Âm lượng tệp âm thanh kết hợp",
+    "combination_volume_info": "Âm lượng tệp âm thanh kết hợp. Nên để âm lượng của tệp kết hợp nhỏ hơn âm thanh chính.",
+    "inference": "Suy Luận",
+    "extra": "Thêm",
+    "running_local_url": "Giao Diện Đang Chạy Trên Liên Kết Cục Bộ",
+    "running_share_url": "Giao Diện Đang Chạy Trên Liên Kết Công Khai",
+    "translate": "Dịch",
+    "source_lang": "Ngôn ngữ đầu vào",
+    "target_lang": "Ngôn ngữ đầu ra",
+    "prompt_warning": "Vui lòng nhập văn bản để tiến hành dịch!",
+    "read_error": "Quá trình đọc tệp văn bản xảy ra lỗi!",
+    "quirk": "Hiệu Ứng Kỳ Quặc",
+    "quirk_info": "## Những Hiệu Ứng Kỳ Quặc Dành Cho Âm Thanh",
+    "quirk_label": "Các hiệu ứng kỳ quặc",
+    "quirk_label_info": "Các hiệu ứng kỳ quặc có thể sử dụng để áp dụng vào âm thanh",
+    "quirk_markdown": "Áp dụng những hiệu ứng kỳ quặc cho âm thanh của bạn để chúng trở nên kỳ quặc dị dạng.",
+    "gradio_start": "Giao diện đã tải thành công sau",
+    "quirk_choice": {"Ngẫu Nhiên": 0, "Vỡ Âm": 1, "Kinh Dị": 2, "Người Máy": 3, "Em bé": 4, "Trầm": 5, "Giật Giọng": 6,  "Người Già": 7, "Vọng Âm": 8, "Quỷ Dữ": 9, "Méo Giọng": 10, "Bán Hàng Trực Tuyến": 11, "Kéo Lê": 12, "Khó Chịu": 13, "Rè": 14, "Lỗi Mạng": 15, "Rối Loạn": 16},
+    "proposal_pitch": "Tự động đề xuất cao độ",
+    "hybrid_calc": "Tính toán Hybrid cho phương thức: {f0_method}...",
+    "proposal_f0": "Đã đề xuất cao độ: {up_key}",
+    "startautotune": "Bắt đầu tự động điều chỉnh cao độ...",
+    "proposal_pitch_threshold": "Tần số ước tính cao độ",
+    "proposal_pitch_threshold_info": "Tần số ước tính cao độ, đối với mô hình nam sử dụng ở mức 155.0 và mô hình nữ với mức 255.0",
+    "rms_start_extract": "Bắt đầu trích xuất năng lượng âm thanh với {num_processes} lõi...",
+    "rms_success_extract": "Quá trình trích xuất năng lượng đã hoàn tất vào {elapsed_time} giây.",
+    "train&energy": "Huấn luyện năng lượng",
+    "train&energy_info": "Huấn luyện mô hình với năng lượng RMS",
+    "editing": "Chỉnh Sửa",
+    "check_assets_error": "Tải xuống tài nguyên thất bại {count} lần liên tiếp! Hãy tải xuống thủ công và đặt vào thư mục tài nguyên: https://huggingface.co/AnhP/Vietnamese-RVC-Project"
+}
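The language file above stores message templates with named `{placeholder}` fields, which the application fills in with `str.format` (for example `translations["port"].format(port=port)` in `main/app/app.py`). A minimal sketch of how the placeholders in two language files could be cross-checked follows. The helper names `placeholders` and `mismatched_keys` are hypothetical, and the English strings are illustrative stand-ins rather than the real contents of `en-US.json`.

```python
import json
from string import Formatter


def placeholders(template: str) -> set:
    """Return the named fields used in a str.format template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def mismatched_keys(reference: dict, translated: dict) -> list:
    """List keys whose translated template uses different placeholders.

    Non-string values (such as the "quirk_choice" mapping) are skipped.
    """
    bad = []
    for key, template in reference.items():
        if not isinstance(template, str):
            continue
        other = translated.get(key)
        if not isinstance(other, str) or placeholders(template) != placeholders(other):
            bad.append(key)
    return sorted(bad)


# Hypothetical excerpts standing in for en-US.json and vi-VN.json.
en_us = json.loads('{"port": "Port {port} is unavailable!", "found_choice": "Found {choice}"}')
vi_vn = json.loads('{"port": "Cổng {port} không thể dùng!", "found_choice": "Tìm thấy {choice}"}')

if __name__ == "__main__":
    print(mismatched_keys(en_us, vi_vn))
```

A check like this would catch a translated string that drops or renames a field before it causes a `KeyError` at runtime.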

assets/logs/mute/energy/mute.wav.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9d91e585d3f05aec8ae6c3d3143501154baa77c5d540ec376b27a14397bbba13
+size 1332
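The three added lines above are a Git LFS pointer, not the binary itself: a spec version, a SHA-256 object id, and the object size in bytes. A small sketch of reading such a pointer, assuming the three-field layout shown in this diff; `parse_lfs_pointer` is a hypothetical helper, not part of the repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


# Pointer content copied from the diff above, without the leading "+" markers.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9d91e585d3f05aec8ae6c3d3143501154baa77c5d540ec376b27a14397bbba13
size 1332
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```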

assets/logs/mute/f0/mute.wav.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9b9acf9ab7facdb032e1d687fe35182670b0b94566c4b209ae48c239d19956a6
+size 1332

assets/logs/mute/f0_voiced/mute.wav.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:30792849c8e72d67e6691754077f2888b101cb741e9c7f193c91dd9692870c87
+size 2536

assets/logs/mute/sliced_audios/mute32000.wav ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9edcf85ec77e88bd01edf3d887bdc418d3596d573f7ad2694da546f41dae6baf
+size 192078

assets/logs/mute/sliced_audios/mute40000.wav ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:67a816e77b50cb9f016e49e5c01f07e080c4e3b82b7a8ac3e64bcb143f90f31b
+size 240078

assets/logs/mute/sliced_audios/mute48000.wav ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2f2bb4daaa106e351aebb001e5a25de985c0b472f22e8d60676bc924a79056ee
+size 288078

assets/logs/mute/sliced_audios_16k/mute.wav ADDED Viewed

Binary file (96.1 kB). View file

assets/logs/mute/v1_extracted/mute.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:64d5abbac078e19a3f649c0d78a02cb33a71407ded3ddf2db78e6b803d0c0126
+size 152704

assets/logs/mute/v1_extracted/mute_chinese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2be76c020ad3e6702eeb53fff2198b52662992cf04ab0e56da1cd8a5329f701d
+size 152704

assets/logs/mute/v1_extracted/mute_japanese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d34ecf8faaa25f0f1fd861560b9cef42b58554839b250d4ab7ec85e94821ef80
+size 152704

assets/logs/mute/v1_extracted/mute_korean.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cddd2e78b0f95103d411edffebcba1336e7ec9d86b80cab511724630a7dba775
+size 152704

assets/logs/mute/v1_extracted/mute_portuguese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:47fc86ed03de0ef68ee82e37196d8aa00f8d295531f2348aa3d51456301a0f2c
+size 152704

assets/logs/mute/v1_extracted/mute_spin.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8ec393080466213b7eaa951526d632b92b4d3f9adc3f208bee6d8364fcc0ad0b
+size 152704

assets/logs/mute/v1_extracted/mute_vietnamese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fd3349e8ac6a75175cda8dd47c1a22d6aaeeb3af4cf7c7508285bbfa079fbae8
+size 152704

assets/logs/mute/v2_extracted/mute.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:16ef62b957887ac9f0913aa5158f18983afff1ef5a3e4c5fd067ac20fc380d54
+size 457856

assets/logs/mute/v2_extracted/mute_chinese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:493a057ea2d32cf9b66ec65c92d59b412084e24837900158e33ee6349285ff7f
+size 457856

assets/logs/mute/v2_extracted/mute_japanese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e72bd4d0a0be6a5435222d72bd754e87e53a9883c8f7e990fe903f8e1c5c3cdf
+size 457856

assets/logs/mute/v2_extracted/mute_korean.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e5259ee6ce3732527d6b2ad206c4178d15950ae41e6690f2620cb6b0e138ff13
+size 457856

assets/logs/mute/v2_extracted/mute_portuguese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4f6bd1ab41558b0b15d9a15a52dccb776b4d03ccd0a934aa8c0efa1e0f14129b
+size 457856

assets/logs/mute/v2_extracted/mute_spin.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:997d1ba7055e5567344fe2dbf34be29dd5de70a66cf1cfc32ff98326f957618d
+size 457856

assets/logs/mute/v2_extracted/mute_vietnamese.npy ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3a2798cac1618dea9151a2620f4e6e05a0274db74889377bd3c44d27784200e7
+size 457856

assets/models/embedders/.gitattributes ADDED Viewed

File without changes

assets/models/predictors/.gitattributes ADDED Viewed

File without changes

assets/models/pretrained_custom/.gitattributes ADDED Viewed

File without changes

assets/models/pretrained_v1/.gitattributes ADDED Viewed

File without changes

assets/models/pretrained_v2/.gitattributes ADDED Viewed

File without changes

assets/models/speaker_diarization/assets/gpt2.tiktoken ADDED Viewed

The diff for this file is too large to render. See raw diff

assets/models/speaker_diarization/assets/mel_filters.npz ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7450ae70723a5ef9d341e3cee628c7cb0177f36ce42c44b7ed2bf3325f0f6d4c
+size 4271

assets/models/speaker_diarization/assets/multilingual.tiktoken ADDED Viewed

The diff for this file is too large to render. See raw diff

assets/models/speaker_diarization/models/.gitattributes ADDED Viewed

File without changes

assets/models/uvr5/.gitattributes ADDED Viewed

File without changes

assets/presets/.gitattributes ADDED Viewed

File without changes

assets/weights/.gitattributes ADDED Viewed

File without changes

audios/.gitattributes ADDED Viewed

File without changes

dataset/.gitattributes ADDED Viewed

File without changes

main/app/app.py ADDED Viewed

	@@ -0,0 +1,87 @@

+import os
+import io
+import ssl
+import sys
+import time
+import codecs
+import logging
+import warnings
+import gradio as gr
+sys.path.append(os.getcwd())
+start_time = time.time()
+from main.app.tabs.extra.extra import extra_tab
+from main.app.tabs.editing.editing import editing_tab
+from main.app.tabs.training.training import training_tab
+from main.app.tabs.downloads.downloads import download_tab
+from main.app.tabs.inference.inference import inference_tab
+from main.app.variables import logger, config, translations, theme, font, configs, language, allow_disk
+ssl._create_default_https_context = ssl._create_unverified_context
+warnings.filterwarnings("ignore")
+for logger_name in ["httpx", "gradio", "uvicorn", "httpcore", "urllib3"]:
+    logging.getLogger(logger_name).setLevel(logging.ERROR)
+with gr.Blocks(title="📱 Vietnamese-RVC GUI BY ANH", theme=theme, css="<style> @import url('{fonts}'); * {{font-family: 'Courgette', cursive !important;}} body, html {{font-family: 'Courgette', cursive !important;}} h1, h2, h3, h4, h5, h6, p, button, input, textarea, label, span, div, select {{font-family: 'Courgette', cursive !important;}} </style>".format(fonts=font or "https://fonts.googleapis.com/css2?family=Courgette&display=swap")) as app:
+    gr.HTML("<h1 style='text-align: center;'>🎵VIETNAMESE RVC BY ANH🎵</h1>")
+    gr.HTML(f"<h3 style='text-align: center;'>{translations['title']}</h3>")
+    with gr.Tabs():
+        inference_tab()
+        editing_tab()
+        training_tab()
+        download_tab()
+        extra_tab(app)
+    with gr.Row():
+        gr.Markdown(translations["rick_roll"].format(rickroll=codecs.decode('uggcf://jjj.lbhghor.pbz/jngpu?i=qDj4j9JtKpD', 'rot13')))
+    with gr.Row():
+        gr.Markdown(translations["terms_of_use"])
+    with gr.Row():
+        gr.Markdown(translations["exemption"])
+    logger.info(config.device)
+    logger.info(translations["start_app"])
+    logger.info(translations["set_lang"].format(lang=language))
+    port = configs.get("app_port", 7860)
+    server_name = configs.get("server_name", "0.0.0.0")
+    share = "--share" in sys.argv
+    original_stdout = sys.stdout
+    sys.stdout = io.StringIO()
+    for i in range(configs.get("num_of_restart", 5)):
+        try:
+            _, _, share_url = app.queue().launch(
+                favicon_path=configs["ico_path"],
+                server_name=server_name,
+                server_port=port,
+                show_error=configs.get("app_show_error", False),
+                inbrowser="--open" in sys.argv,
+                share=share,
+                allowed_paths=allow_disk,
+                prevent_thread_lock=True,
+                quiet=True
+            )
+            break
+        except OSError:
+            logger.debug(translations["port"].format(port=port))
+            port -= 1
+        except Exception as e:
+            logger.error(translations["error_occurred"].format(e=e))
+            sys.exit(1)
+    sys.stdout = original_stdout
+    logger.info(f"{translations['running_local_url']}: {server_name}:{port}")
+    if share: logger.info(f"{translations['running_share_url']}: {share_url}")
+    logger.info(f"{translations['gradio_start']}: {(time.time() - start_time):.2f}s")
+    while 1:
+        time.sleep(5)
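The launch loop above retries on `OSError` and decrements the port each time. If every attempt fails, it still falls through to the "running" log lines. The sketch below shows the same count-down idea with an explicit failure at the end. It probes ports with a plain socket instead of calling Gradio, so it is an illustration under that assumption, not the application's method; `find_free_port` is a hypothetical name.

```python
import socket


def find_free_port(start: int, attempts: int = 5, host: str = "127.0.0.1") -> int:
    """Try `start`, then start-1, ... and return the first port that binds.

    Raises RuntimeError when none of the `attempts` candidates is free.
    The port could still be taken between this probe and the real launch.
    """
    for port in range(start, start - attempts, -1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port in use or not permitted, try the next lower one
            return port
    raise RuntimeError(f"no free port in {attempts} attempts starting at {start}")


if __name__ == "__main__":
    print(find_free_port(7860))
```

Raising after the last attempt keeps the caller from logging a URL for a server that never started.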

main/app/core/downloads.py ADDED Viewed

	@@ -0,0 +1,187 @@

+import os
+import re
+import sys
+import json
+import codecs
+import shutil
+import yt_dlp
+import warnings
+import requests
+from bs4 import BeautifulSoup
+sys.path.append(os.getcwd())
+from main.tools import huggingface, gdown, meganz, mediafire, pixeldrain
+from main.app.core.ui import gr_info, gr_warning, gr_error, process_output
+from main.app.variables import logger, translations, model_options, configs
+from main.app.core.process import move_files_from_directory, fetch_pretrained_data, extract_name_model
+def download_url(url):
+    if not url: return gr_warning(translations["provide_url"])
+    if not os.path.exists(configs["audios_path"]): os.makedirs(configs["audios_path"], exist_ok=True)
+    with warnings.catch_warnings():
+        warnings.filterwarnings("ignore")
+        ydl_opts = {
+            "format": "bestaudio/best",
+            "postprocessors": [{
+                "key": "FFmpegExtractAudio",
+                "preferredcodec": "wav",
+                "preferredquality": "192"
+            }],
+            "quiet": True,
+            "no_warnings": True,
+            "noplaylist": True,
+            "verbose": False
+        }
+        gr_info(translations["start"].format(start=translations["download_music"]))
+        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+            audio_output = os.path.join(configs["audios_path"], re.sub(r'\s+', '-', re.sub(r'[^\w\s\u4e00-\u9fff\uac00-\ud7af\u0400-\u04FF\u1100-\u11FF]', '', ydl.extract_info(url, download=False).get('title', 'video')).strip()))
+            if os.path.exists(audio_output): shutil.rmtree(audio_output, ignore_errors=True)
+            ydl_opts['outtmpl'] = audio_output
+        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+            audio_output = process_output(audio_output + ".wav")
+            ydl.download([url])
+        gr_info(translations["success"])
+        return [audio_output, audio_output, translations["success"]]
+def move_file(file, download_dir, model):
+    weights_dir = configs["weights_path"]
+    logs_dir = configs["logs_path"]
+    if not os.path.exists(weights_dir): os.makedirs(weights_dir, exist_ok=True)
+    if not os.path.exists(logs_dir): os.makedirs(logs_dir, exist_ok=True)
+    if file.endswith(".zip"): shutil.unpack_archive(file, download_dir)
+    move_files_from_directory(download_dir, weights_dir, logs_dir, model)
+def download_model(url=None, model=None):
+    if not url: return gr_warning(translations["provide_url"])
+    url = url.replace("/blob/", "/resolve/").replace("?download=true", "").strip()
+    download_dir = "download_model"
+    os.makedirs(download_dir, exist_ok=True)
+    try:
+        gr_info(translations["start"].format(start=translations["download"]))
+        if "huggingface.co" in url: file = huggingface.HF_download_file(url, download_dir)
+        elif "google.com" in url: file = gdown.gdown_download(url, download_dir)
+        elif "mediafire.com" in url: file = mediafire.Mediafire_Download(url, download_dir)
+        elif "pixeldrain.com" in url: file = pixeldrain.pixeldrain(url, download_dir)
+        elif "mega.nz" in url: file = meganz.mega_download_url(url, download_dir)
+        else:
+            gr_warning(translations["not_support_url"])
+            return translations["not_support_url"]
+        if not model:
+            modelname = os.path.basename(file)
+            model = extract_name_model(modelname) if modelname.endswith(".index") else os.path.splitext(modelname)[0]
+            if model is None: model = os.path.splitext(modelname)[0]
+        model = model.replace(".onnx", "").replace(".pth", "").replace(".index", "").replace(".zip", "").replace(" ", "_").replace("(", "").replace(")", "").replace("[", "").replace("]", "").replace("{", "").replace("}", "").replace(",", "").replace('"', "").replace("'", "").replace("|", "").strip()
+        move_file(file, download_dir, model)
+        gr_info(translations["success"])
+        return translations["success"]
+    except Exception as e:
+        gr_error(message=translations["error_occurred"].format(e=e))
+        return translations["error_occurred"].format(e=e)
+    finally:
+        shutil.rmtree(download_dir, ignore_errors=True)
+def download_pretrained_model(choices, model, sample_rate):
+    pretraineds_custom_path = configs["pretrained_custom_path"]
+    if choices == translations["list_model"]:
+        paths = fetch_pretrained_data()[model][sample_rate]
+        if not os.path.exists(pretraineds_custom_path): os.makedirs(pretraineds_custom_path, exist_ok=True)
+        url = codecs.decode("uggcf://uhttvatsnpr.pb/NauC/Ivrganzrfr-EIP-Cebwrpg/erfbyir/znva/cergenvarq_phfgbz/", "rot13") + paths
+        gr_info(translations["download_pretrain"])
+        file = huggingface.HF_download_file(url.replace("/blob/", "/resolve/").replace("?download=true", "").strip(), os.path.join(pretraineds_custom_path, paths))
+        if file.endswith(".zip"):
+            shutil.unpack_archive(file, pretraineds_custom_path)
+            os.remove(file)
+        gr_info(translations["success"])
+        return translations["success"], None
+    elif choices == translations["download_url"]:
+        if not model: return gr_warning(translations["provide_pretrain"].format(dg="D"))
+        if not sample_rate: return gr_warning(translations["provide_pretrain"].format(dg="G"))
+        gr_info(translations["download_pretrain"])
+        for url in [model, sample_rate]:
+            url = url.replace("/blob/", "/resolve/").replace("?download=true", "").strip()
+            if "huggingface.co" in url: huggingface.HF_download_file(url, pretraineds_custom_path)
+            elif "google.com" in url: gdown.gdown_download(url, pretraineds_custom_path)
+            elif "mediafire.com" in url: mediafire.Mediafire_Download(url, pretraineds_custom_path)
+            elif "pixeldrain.com" in url: pixeldrain.pixeldrain(url, pretraineds_custom_path)
+            elif "mega.nz" in url: meganz.mega_download_url(url, pretraineds_custom_path)
+            else:
+                gr_warning(translations["not_support_url"])
+                return translations["not_support_url"], translations["not_support_url"]
+        gr_info(translations["success"])
+        return translations["success"], translations["success"]
+def fetch_models_data(search):
+    all_table_data = []
+    page = 1
+    while 1:
+        try:
+            response = requests.post(url=codecs.decode("uggcf://ibvpr-zbqryf.pbz/srgpu_qngn.cuc", "rot13"), data={"page": page, "search": search})
+            if response.status_code == 200:
+                table_data = response.json().get("table", "")
+                if not table_data.strip(): break
+                all_table_data.append(table_data)
+                page += 1
+            else:
+                logger.debug(f"{translations['code_error']} {response.status_code}")
+                break
+        except json.JSONDecodeError:
+            logger.debug(translations["json_error"])
+            break
+        except requests.RequestException as e:
+            logger.debug(translations["requests_error"].format(e=e))
+            break
+    return all_table_data
+def search_models(name):
+    if not name: return gr_warning(translations["provide_name"])
+    gr_info(translations["start"].format(start=translations["search"]))
+    tables = fetch_models_data(name)
+    if len(tables) == 0:
+        gr_info(translations["not_found"].format(name=name))
+        return [None]*2
+    else:
+        model_options.clear()
+        for table in tables:
+            for row in BeautifulSoup(table, "html.parser").select("tr"):
+                name_tag, url_tag = row.find("a", {"class": "fs-5"}), row.find("a", {"class": "btn btn-sm fw-bold btn-light ms-0 p-1 ps-2 pe-2"})
+                if not (name_tag and url_tag): continue  # skip header rows and rows without a download link, url_tag["href"] would raise on None
+                url = url_tag["href"].replace("https://easyaivoice.com/run?url=", "")
+                if "huggingface" in url: model_options[name_tag.text.replace(".onnx", "").replace(".pth", "").replace(".index", "").replace(".zip", "").replace(" ", "_").replace("(", "").replace(")", "").replace("[", "").replace("]", "").replace(",", "").replace('"', "").replace("'", "").replace("|", "_").replace("-_-", "_").replace("_-_", "_").replace("-", "_").replace("---", "_").replace("___", "_").strip()] = url
+        gr_info(translations["found"].format(results=len(model_options)))
+        return [{"value": "", "choices": model_options, "interactive": True, "visible": True, "__type__": "update"}, {"value": translations["downloads"], "visible": True, "__type__": "update"}]
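Note on the name clean-up in `search_models`: the long chain of `.replace()` calls turns a scraped model title into a dictionary key. A minimal sketch of the same idea follows. The helper name `sanitize_model_name` is hypothetical and not part of this commit, and the output is only approximately equivalent to the original chain (it also trims leading and trailing underscores).

```python
import re

# Extensions the original chain strips from the scraped title.
_EXTENSIONS = (".onnx", ".pth", ".index", ".zip")
# Characters the original chain deletes outright.
_DELETE = str.maketrans("", "", "()[],\"'")

def sanitize_model_name(raw: str) -> str:
    """Hypothetical helper: approximate the .replace() chain in search_models."""
    name = raw
    for ext in _EXTENSIONS:
        name = name.replace(ext, "")
    name = name.translate(_DELETE)          # drop brackets, commas and quotes
    name = re.sub(r"[\s|\-]+", "_", name)   # spaces, pipes and hyphens become "_"
    name = re.sub(r"_+", "_", name)         # collapse runs of underscores
    return name.strip("_")

print(sanitize_model_name("My Model (v2) - RVC.pth"))
```

Collapsing runs with a regex avoids depending on the order of the individual replacements, which is why the sketch does not need separate cases such as `"-_-"` or `"___"`.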

main/app/core/editing.py ADDED

	@@ -0,0 +1,96 @@

+import os
+import sys
+import random
+import librosa
+import subprocess
+import numpy as np
+import soundfile as sf
+sys.path.append(os.getcwd())
+from main.app.core.ui import gr_info, gr_warning, process_output
+from main.app.variables import python, translations, configs, config
+def audio_effects(input_path, output_path, resample, resample_sr, chorus_depth, chorus_rate, chorus_mix, chorus_delay, chorus_feedback, distortion_drive, reverb_room_size, reverb_damping, reverb_wet_level, reverb_dry_level, reverb_width, reverb_freeze_mode, pitch_shift, delay_seconds, delay_feedback, delay_mix, compressor_threshold, compressor_ratio, compressor_attack_ms, compressor_release_ms, limiter_threshold, limiter_release, gain_db, bitcrush_bit_depth, clipping_threshold, phaser_rate_hz, phaser_depth, phaser_centre_frequency_hz, phaser_feedback, phaser_mix, bass_boost_db, bass_boost_frequency, treble_boost_db, treble_boost_frequency, fade_in_duration, fade_out_duration, export_format, chorus, distortion, reverb, delay, compressor, limiter, gain, bitcrush, clipping, phaser, treble_bass_boost, fade_in_out, audio_combination, audio_combination_input, main_vol, combine_vol):
+    if not input_path or not os.path.exists(input_path) or os.path.isdir(input_path):
+        gr_warning(translations["input_not_valid"])
+        return None
+    if not output_path:
+        gr_warning(translations["output_not_valid"])
+        return None
+    if os.path.isdir(output_path): output_path = os.path.join(output_path, f"audio_effects.{export_format}")
+    output_dir = os.path.dirname(output_path) or output_path
+    if not os.path.exists(output_dir): os.makedirs(output_dir, exist_ok=True)
+    output_path = process_output(output_path)
+    gr_info(translations["start"].format(start=translations["apply_effect"]))
+    if config.debug_mode: subprocess.run([python, configs["audio_effects_path"], "--input_path", input_path, "--output_path", output_path, "--resample", str(resample), "--resample_sr", str(resample_sr), "--chorus_depth", str(chorus_depth), "--chorus_rate", str(chorus_rate), "--chorus_mix", str(chorus_mix), "--chorus_delay", str(chorus_delay), "--chorus_feedback", str(chorus_feedback), "--drive_db", str(distortion_drive), "--reverb_room_size", str(reverb_room_size), "--reverb_damping", str(reverb_damping), "--reverb_wet_level", str(reverb_wet_level), "--reverb_dry_level", str(reverb_dry_level), "--reverb_width", str(reverb_width), "--reverb_freeze_mode", str(reverb_freeze_mode), "--pitch_shift", str(pitch_shift), "--delay_seconds", str(delay_seconds), "--delay_feedback", str(delay_feedback), "--delay_mix", str(delay_mix), "--compressor_threshold", str(compressor_threshold), "--compressor_ratio", str(compressor_ratio), "--compressor_attack_ms", str(compressor_attack_ms), "--compressor_release_ms", str(compressor_release_ms), "--limiter_threshold", str(limiter_threshold), "--limiter_release", str(limiter_release), "--gain_db", str(gain_db), "--bitcrush_bit_depth", str(bitcrush_bit_depth), "--clipping_threshold", str(clipping_threshold), "--phaser_rate_hz", str(phaser_rate_hz), "--phaser_depth", str(phaser_depth), "--phaser_centre_frequency_hz", str(phaser_centre_frequency_hz), "--phaser_feedback", str(phaser_feedback), "--phaser_mix", str(phaser_mix), "--bass_boost_db", str(bass_boost_db), "--bass_boost_frequency", str(bass_boost_frequency), "--treble_boost_db", str(treble_boost_db), "--treble_boost_frequency", str(treble_boost_frequency), "--fade_in_duration", str(fade_in_duration), "--fade_out_duration", str(fade_out_duration), "--export_format", export_format, "--chorus", str(chorus), "--distortion", str(distortion), "--reverb", str(reverb), "--pitchshift", str(pitch_shift != 0), "--delay", str(delay), "--compressor", str(compressor), "--limiter", str(limiter), "--gain", str(gain), "--bitcrush", str(bitcrush), "--clipping", str(clipping), "--phaser", str(phaser), "--treble_bass_boost", str(treble_bass_boost), "--fade_in_out", str(fade_in_out), "--audio_combination", str(audio_combination), "--audio_combination_input", audio_combination_input, "--main_volume", str(main_vol), "--combination_volume", str(combine_vol)])
+    else:
+        from main.inference.audio_effects import process_audio
+        process_audio(input_path, output_path, resample, resample_sr, chorus_depth, chorus_rate, chorus_mix, chorus_delay, chorus_feedback, distortion_drive, reverb_room_size, reverb_damping, reverb_wet_level, reverb_dry_level, reverb_width, reverb_freeze_mode, pitch_shift, delay_seconds, delay_feedback, delay_mix, compressor_threshold, compressor_ratio, compressor_attack_ms, compressor_release_ms, limiter_threshold, limiter_release, gain_db, bitcrush_bit_depth, clipping_threshold, phaser_rate_hz, phaser_depth, phaser_centre_frequency_hz, phaser_feedback, phaser_mix, bass_boost_db, bass_boost_frequency, treble_boost_db, treble_boost_frequency, fade_in_duration, fade_out_duration, export_format, chorus, distortion, reverb, pitch_shift != 0, delay, compressor, limiter, gain, bitcrush, clipping, phaser, treble_bass_boost, fade_in_out, audio_combination, audio_combination_input, main_vol, combine_vol)
+    gr_info(translations["success"])
+    return output_path.replace("wav", export_format)
+def vibrato(y, sr, freq=5, depth=0.003):
+    return y[np.clip((np.arange(len(y)) + (depth * np.sin(2 * np.pi * freq * (np.arange(len(y)) / sr))) * sr).astype(int), 0, len(y) - 1)]
+def apply_voice_quirk(audio_path, mode, output_path, export_format):
+    if not audio_path or not os.path.exists(audio_path) or os.path.isdir(audio_path):
+        gr_warning(translations["input_not_valid"])
+        return None
+    if not output_path:
+        gr_warning(translations["output_not_valid"])
+        return None
+    if os.path.isdir(output_path): output_path = os.path.join(output_path, f"audio_quirk.{export_format}")
+    output_dir = os.path.dirname(output_path) or output_path
+    if not os.path.exists(output_dir): os.makedirs(output_dir, exist_ok=True)
+    output_path = process_output(output_path)
+    gr_info(translations["start"].format(start=translations["apply_effect"]))
+    y, sr = librosa.load(audio_path, sr=None)
+    output_path = output_path.replace("wav", export_format)
+    mode = translations["quirk_choice"][mode]
+    if mode == 0: mode = random.randint(1, 16)
+    if mode == 1: y *= np.random.uniform(0.5, 0.8, size=len(y))
+    elif mode == 2: y = librosa.effects.pitch_shift(y=y + np.random.normal(0, 0.01, y.shape), sr=sr, n_steps=np.random.uniform(-3.5, -1.5))
+    elif mode == 3: y = librosa.effects.time_stretch(librosa.effects.pitch_shift(y=y, sr=sr, n_steps=3), rate=1.2)
+    elif mode == 4: y = librosa.effects.time_stretch(librosa.effects.pitch_shift(y=y, sr=sr, n_steps=8), rate=1.3)
+    elif mode == 5: y = librosa.effects.time_stretch(librosa.effects.pitch_shift(y=y, sr=sr, n_steps=-3), rate=0.75)
+    elif mode == 6: y *= np.sin(np.linspace(0, np.pi * 20, len(y))) * 0.5 + 0.5
+    elif mode == 7: y = librosa.effects.time_stretch(vibrato(librosa.effects.pitch_shift(y=y, sr=sr, n_steps=-4), sr, freq=3, depth=0.004), rate=0.85)
+    elif mode == 8: y *= 0.6 + np.pad(y, (sr // 2, 0), mode='constant')[:len(y)] * 0.4
+    elif mode == 9: y = librosa.effects.pitch_shift(y=y, sr=sr, n_steps=2) + np.sin(np.linspace(0, np.pi * 20, len(y))) * 0.02
+    elif mode == 10: y = vibrato(y, sr, freq=8, depth=0.005)
+    elif mode == 11: y = librosa.effects.time_stretch(librosa.effects.pitch_shift(y=y, sr=sr, n_steps=4), rate=1.25)
+    elif mode == 12: y = np.hstack([np.pad(f, (0, int(len(f)*0.3)), mode='edge') for f in librosa.util.frame(y, frame_length=2048, hop_length=512).T])
+    elif mode == 13: y = np.concatenate([y, np.sin(2 * np.pi * np.linspace(0, 1, int(0.05 * sr))) * 0.02])
+    elif mode == 14: y += np.random.normal(0, 0.005, len(y))
+    elif mode == 15:
+        frame = int(sr * 0.2)
+        chunks = [y[i:i + frame] for i in range(0, len(y), frame)]
+        np.random.shuffle(chunks)
+        y = np.concatenate(chunks)
+    elif mode == 16:
+        frame = int(sr * 0.3)
+        for i in range(0, len(y), frame * 2):
+            y[i:i+frame] = y[i:i+frame][::-1]
+    sf.write(output_path, y, sr, format=export_format)
+    gr_info(translations["success"])
+    return output_path
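Note on `vibrato` in `editing.py`: the one-liner resamples the signal at a sinusoidally wobbling read position, so `depth` is the peak time offset in seconds and `freq` is the wobble rate in Hz. The sketch below spells out the same steps on a synthetic tone; the test signal and variable names are illustrative assumptions, not part of the commit.

```python
import numpy as np

def vibrato(y, sr, freq=5.0, depth=0.003):
    """Read y at a sinusoidally modulated position (nearest-lower sample, no interpolation)."""
    n = np.arange(len(y))
    # Offset in samples: depth (seconds) * sr, modulated at `freq` Hz.
    offset = depth * np.sin(2 * np.pi * freq * n / sr) * sr
    # Clamp so the read position never leaves the buffer.
    idx = np.clip((n + offset).astype(int), 0, len(y) - 1)
    return y[idx]

# Illustrative input: one second of a 220 Hz sine at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 220 * t).astype(np.float32)

wobbled = vibrato(tone, sr)              # default 5 Hz wobble
unchanged = vibrato(tone, sr, depth=0)   # zero depth leaves the signal as-is
```

Because the index is truncated to an integer rather than interpolated, larger `depth` values can introduce audible stepping; that is a property of this approach, not something the sketch tries to correct.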

main/app/core/f0_extract.py ADDED

	@@ -0,0 +1,54 @@

+import os
+import sys
+import librosa
+import numpy as np
+import matplotlib.pyplot as plt
+sys.path.append(os.getcwd())
+from main.library.utils import check_assets
+from main.app.core.ui import gr_info, gr_warning
+from main.library.predictors.Generator import Generator
+from main.app.variables import config, translations, configs
+def f0_extract(audio, f0_method, f0_onnx):
+    if not audio or not os.path.exists(audio) or os.path.isdir(audio):
+        gr_warning(translations["input_not_valid"])
+        return [None]*2
+    check_assets(f0_method, None, f0_onnx, None)
+    f0_path = os.path.join(configs["f0_path"], os.path.splitext(os.path.basename(audio))[0])
+    image_path = os.path.join(f0_path, "f0.png")
+    txt_path = os.path.join(f0_path, "f0.txt")
+    gr_info(translations["start_extract"])
+    if not os.path.exists(f0_path): os.makedirs(f0_path, exist_ok=True)
+    y, sr = librosa.load(audio, sr=None)
+    f0_generator = Generator(sr, 160, 50, 1600, is_half=config.is_half, device=config.device, f0_onnx_mode=f0_onnx, del_onnx_model=f0_onnx)
+    _, pitchf = f0_generator.calculator(config.x_pad, f0_method, y, 0, None, 3, False, 0, None, False)
+    F_temp = np.array(pitchf, dtype=np.float32)
+    F_temp[F_temp == 0] = np.nan
+    f0 = 1200 * np.log2(F_temp / librosa.midi_to_hz(0))
+    plt.figure(figsize=(10, 4))
+    plt.plot(f0)
+    plt.title(f0_method)
+    plt.xlabel(translations["time_frames"])
+    plt.ylabel(translations["Frequency"])
+    plt.savefig(image_path)
+    plt.close()
+    with open(txt_path, "w") as f:
+        for i, f0_value in enumerate(f0):
+            f.write(f"{i * sr / 160},{f0_value}\n")
+    gr_info(translations["extract_done"])
+    return [txt_path, image_path]
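Note on the pitch scale in `f0_extract`: the plotted and saved values are not Hz but cents above MIDI note 0 (about 8.18 Hz), with unvoiced frames (0 Hz) mapped to NaN so they leave gaps in the plot. A minimal sketch of that conversion without the librosa dependency, assuming the standard A4 = 440 Hz tuning that `librosa.midi_to_hz` uses; the helper name `hz_to_cents` is hypothetical.

```python
import numpy as np

# Frequency of MIDI note 0 under A4 (MIDI 69) = 440 Hz tuning.
MIDI0_HZ = 440.0 * 2.0 ** (-69 / 12)

def hz_to_cents(f0_hz):
    """Hypothetical helper: Hz -> cents above MIDI note 0, unvoiced (0 Hz) -> NaN."""
    f = np.array(f0_hz, dtype=np.float32)  # np.array copies, so the caller's data is untouched
    f[f == 0] = np.nan
    return 1200.0 * np.log2(f / MIDI0_HZ)

cents = hz_to_cents([0.0, 440.0, 880.0])
print(cents)
```

One semitone is 100 cents, so dividing a value by 100 gives the (fractional) MIDI note number, which makes the saved curve easy to compare against a score.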

main/app/core/inference.py ADDED

	@@ -0,0 +1,387 @@

+import os
+import re
+import sys
+import shutil
+import librosa
+import datetime
+import subprocess
+import numpy as np
+sys.path.append(os.getcwd())
+from main.app.core.ui import gr_info, gr_warning, gr_error, process_output
+from main.app.variables import logger, config, configs, translations, python
+def convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0_method, input_path, output_path, pth_path, index_path, f0_autotune, clean_audio, clean_strength, export_format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, f0_onnx, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold):
+    if config.debug_mode: subprocess.run([python, configs["convert_path"], "--pitch", str(pitch), "--filter_radius", str(filter_radius), "--index_rate", str(index_rate), "--rms_mix_rate", str(rms_mix_rate), "--protect", str(protect), "--hop_length", str(hop_length), "--f0_method", f0_method, "--input_path", input_path, "--output_path", output_path, "--pth_path", pth_path, "--index_path", index_path, "--f0_autotune", str(f0_autotune), "--clean_audio", str(clean_audio), "--clean_strength", str(clean_strength), "--export_format", export_format, "--embedder_model", embedder_model, "--resample_sr", str(resample_sr), "--split_audio", str(split_audio), "--f0_autotune_strength", str(f0_autotune_strength), "--checkpointing", str(checkpointing), "--f0_onnx", str(f0_onnx), "--embedders_mode", embedders_mode, "--formant_shifting", str(formant_shifting), "--formant_qfrency", str(formant_qfrency), "--formant_timbre", str(formant_timbre), "--f0_file", f0_file, "--proposal_pitch", str(proposal_pitch), "--proposal_pitch_threshold", str(proposal_pitch_threshold)])
+    else:
+        from main.inference.conversion.convert import run_convert_script
+        run_convert_script(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0_method, input_path, output_path, pth_path, index_path, f0_autotune, f0_autotune_strength, clean_audio, clean_strength, export_format, embedder_model, resample_sr, split_audio, checkpointing, f0_file, f0_onnx, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, proposal_pitch, proposal_pitch_threshold)
+def convert_audio(clean, autotune, use_audio, use_original, convert_backing, not_merge_backing, merge_instrument, pitch, clean_strength, model, index, index_rate, input, output, format, method, hybrid_method, hop_length, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, input_audio_name, checkpointing, onnx_f0_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, embedders_mode, proposal_pitch, proposal_pitch_threshold):
+    model_path = os.path.join(configs["weights_path"], model) if not os.path.exists(model) else model
+    return_none = [None]*6
+    return_none[5] = {"visible": True, "__type__": "update"}
+    if not use_audio:
+        if merge_instrument or not_merge_backing or convert_backing or use_original:
+            gr_warning(translations["turn_on_use_audio"])
+            return return_none
+    if use_original:
+        if convert_backing:
+            gr_warning(translations["turn_off_convert_backup"])
+            return return_none
+        elif not_merge_backing:
+            gr_warning(translations["turn_off_merge_backup"])
+            return return_none
+    if not model or not os.path.exists(model_path) or os.path.isdir(model_path) or not model.endswith((".pth", ".onnx")):
+        gr_warning(translations["provide_file"].format(filename=translations["model"]))
+        return return_none
+    f0method, embedder_model = (method if method != "hybrid" else hybrid_method), (embedders if embedders != "custom" else custom_embedders)
+    if use_audio:
+        output_audio = os.path.join(configs["audios_path"], input_audio_name)
+        from main.library.utils import pydub_load
+        def get_audio_file(label):
+            matching_files = [f for f in os.listdir(output_audio) if label in f]
+            if not matching_files: return translations["notfound"]
+            return os.path.join(output_audio, matching_files[0])
+        output_path = os.path.join(output_audio, f"Convert_Vocals.{format}")
+        output_backing = os.path.join(output_audio, f"Convert_Backing.{format}")
+        output_merge_backup = os.path.join(output_audio, f"Vocals+Backing.{format}")
+        output_merge_instrument = os.path.join(output_audio, f"Vocals+Instruments.{format}")
+        if not os.path.exists(output_audio): os.makedirs(output_audio, exist_ok=True)
+        output_path = process_output(output_path)
+        if use_original:
+            original_vocal = get_audio_file('Original_Vocals_No_Reverb.')
+            if original_vocal == translations["notfound"]: original_vocal = get_audio_file('Original_Vocals.')
+            if original_vocal == translations["notfound"]:
+                gr_warning(translations["not_found_original_vocal"])
+                return return_none
+            input_path = original_vocal
+        else:
+            main_vocal = get_audio_file('Main_Vocals_No_Reverb.')
+            backing_vocal = get_audio_file('Backing_Vocals_No_Reverb.')
+            if main_vocal == translations["notfound"]: main_vocal = get_audio_file('Main_Vocals.')
+            if not not_merge_backing and backing_vocal == translations["notfound"]: backing_vocal = get_audio_file('Backing_Vocals.')
+            if main_vocal == translations["notfound"]:
+                gr_warning(translations["not_found_main_vocal"])
+                return return_none
+            if not not_merge_backing and backing_vocal == translations["notfound"]:
+                gr_warning(translations["not_found_backing_vocal"])
+                return return_none
+            input_path = main_vocal
+            backing_path = backing_vocal
+        gr_info(translations["convert_vocal"])
+        convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0method, input_path, output_path, model_path, index, autotune, clean, clean_strength, format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold)
+        gr_info(translations["convert_success"])
+        if convert_backing:
+            output_backing = process_output(output_backing)
+            gr_info(translations["convert_backup"])
+            convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0method, backing_path, output_backing, model_path, index, autotune, clean, clean_strength, format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold)
+            gr_info(translations["convert_backup_success"])
+        try:
+            if not not_merge_backing and not use_original:
+                backing_source = output_backing if convert_backing else backing_vocal
+                output_merge_backup = process_output(output_merge_backup)
+                gr_info(translations["merge_backup"])
+                pydub_load(output_path, volume=-4).overlay(pydub_load(backing_source, volume=-6)).export(output_merge_backup, format=format)
+                gr_info(translations["merge_success"])
+            if merge_instrument:
+                vocals = output_merge_backup if not not_merge_backing and not use_original else output_path
+                output_merge_instrument = process_output(output_merge_instrument)
+                gr_info(translations["merge_instruments_process"])
+                instruments = get_audio_file('Instruments.')
+                if instruments == translations["notfound"]:
+                    gr_warning(translations["not_found_instruments"])
+                    output_merge_instrument = None
+                else: pydub_load(instruments, volume=-7).overlay(pydub_load(vocals, volume=-4 if use_original else None)).export(output_merge_instrument, format=format)
+                gr_info(translations["merge_success"])
+        except Exception as e:
+            gr_error(translations["error_occurred"].format(e=e))
+            return return_none
+        return [(None if use_original else output_path), (output_backing if convert_backing else None), (None if not_merge_backing or use_original else output_merge_backup), (output_path if use_original else None), (output_merge_instrument if merge_instrument else None), {"visible": True, "__type__": "update"}]
+    else:
+        if not input or not os.path.exists(input):
+            gr_warning(translations["input_not_valid"])
+            return return_none
+        if not output:
+            gr_warning(translations["output_not_valid"])
+            return return_none
+        output = output.replace("wav", format)
+        if os.path.isdir(input):
+            gr_info(translations["is_folder"])
+            if not [f for f in os.listdir(input) if f.lower().endswith(("wav", "mp3", "flac", "ogg", "opus", "m4a", "mp4", "aac", "alac", "wma", "aiff", "webm", "ac3"))]:
+                gr_warning(translations["not_found_in_folder"])
+                return return_none
+            gr_info(translations["batch_convert"])
+            output_dir = os.path.dirname(output) or output
+            convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0method, input, output_dir, model_path, index, autotune, clean, clean_strength, format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold)
+            gr_info(translations["batch_convert_success"])
+            return return_none
+        else:
+            output_dir = os.path.dirname(output) or output
+            if not os.path.exists(output_dir): os.makedirs(output_dir, exist_ok=True)
+            output = process_output(output)
+            gr_info(translations["convert_vocal"])
+            convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0method, input, output, model_path, index, autotune, clean, clean_strength, format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold)
+            gr_info(translations["convert_success"])
+            return_none[0] = output
+            return return_none
+def convert_selection(clean, autotune, use_audio, use_original, convert_backing, not_merge_backing, merge_instrument, pitch, clean_strength, model, index, index_rate, input, output, format, method, hybrid_method, hop_length, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, embedders_mode, proposal_pitch, proposal_pitch_threshold):
+    if use_audio:
+        gr_info(translations["search_separate"])
+        choice = [f for f in os.listdir(configs["audios_path"]) if os.path.isdir(os.path.join(configs["audios_path"], f))] if config.debug_mode else [f for f in os.listdir(configs["audios_path"]) if os.path.isdir(os.path.join(configs["audios_path"], f)) and any(file.lower().endswith((".wav", ".mp3", ".flac", ".ogg", ".opus", ".m4a", ".mp4", ".aac", ".alac", ".wma", ".aiff", ".webm", ".ac3")) for file in os.listdir(os.path.join(configs["audios_path"], f)))]
+        gr_info(translations["found_choice"].format(choice=len(choice)))
+        if len(choice) == 0:
+            gr_warning(translations["separator==0"])
+            return [{"choices": [], "value": "", "interactive": False, "visible": False, "__type__": "update"}, None, None, None, None, None, {"visible": True, "__type__": "update"}, {"visible": False, "__type__": "update"}]
+        elif len(choice) == 1:
+            convert_output = convert_audio(clean, autotune, use_audio, use_original, convert_backing, not_merge_backing, merge_instrument, pitch, clean_strength, model, index, index_rate, None, None, format, method, hybrid_method, hop_length, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, choice[0], checkpointing, onnx_f0_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, embedders_mode, proposal_pitch, proposal_pitch_threshold)
+            return [{"choices": [], "value": "", "interactive": False, "visible": False, "__type__": "update"}, convert_output[0], convert_output[1], convert_output[2], convert_output[3], convert_output[4], {"visible": True, "__type__": "update"}, {"visible": False, "__type__": "update"}]
+        else: return [{"choices": choice, "value": choice[0], "interactive": True, "visible": True, "__type__": "update"}, None, None, None, None, None, {"visible": False, "__type__": "update"}, {"visible": True, "__type__": "update"}]
+    else:
+        main_convert = convert_audio(clean, autotune, use_audio, use_original, convert_backing, not_merge_backing, merge_instrument, pitch, clean_strength, model, index, index_rate, input, output, format, method, hybrid_method, hop_length, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, None, checkpointing, onnx_f0_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, embedders_mode, proposal_pitch, proposal_pitch_threshold)
+        return [{"choices": [], "value": "", "interactive": False, "visible": False, "__type__": "update"}, main_convert[0], None, None, None, None, {"visible": True, "__type__": "update"}, {"visible": False, "__type__": "update"}]
+def convert_with_whisper(num_spk, model_size, cleaner, clean_strength, autotune, f0_autotune_strength, checkpointing, model_1, model_2, model_index_1, model_index_2, pitch_1, pitch_2, index_strength_1, index_strength_2, export_format, input_audio, output_audio, onnx_f0_mode, method, hybrid_method, hop_length, embed_mode, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, formant_shifting, formant_qfrency_1, formant_timbre_1, formant_qfrency_2, formant_timbre_2, proposal_pitch, proposal_pitch_threshold):
+    from pydub import AudioSegment
+    from sklearn.cluster import AgglomerativeClustering
+    from main.library.speaker_diarization.audio import Audio
+    from main.library.speaker_diarization.segment import Segment
+    from main.library.speaker_diarization.whisper import load_model
+    from main.library.utils import check_spk_diarization, pydub_load
+    from main.library.speaker_diarization.embedding import SpeechBrainPretrainedSpeakerEmbedding
+    check_spk_diarization(model_size)
+    model_pth_1, model_pth_2 = os.path.join(configs["weights_path"], model_1) if not os.path.exists(model_1) else model_1, os.path.join(configs["weights_path"], model_2) if not os.path.exists(model_2) else model_2
+    if (not model_1 or not os.path.exists(model_pth_1) or os.path.isdir(model_pth_1) or not model_pth_1.endswith((".pth", ".onnx"))) and (not model_2 or not os.path.exists(model_pth_2) or os.path.isdir(model_pth_2) or not model_pth_2.endswith((".pth", ".onnx"))):
+        gr_warning(translations["provide_file"].format(filename=translations["model"]))
+        return None
+    if not model_1: model_pth_1 = model_pth_2
+    if not model_2: model_pth_2 = model_pth_1
+    if not input_audio or not os.path.exists(input_audio) or os.path.isdir(input_audio):
+        gr_warning(translations["input_not_valid"])
+        return None
+    if not output_audio:
+        gr_warning(translations["output_not_valid"])
+        return None
+    output_audio = process_output(output_audio)
+    gr_info(translations["start_whisper"])
+    try:
+        audio = Audio()
+        embedding_model = SpeechBrainPretrainedSpeakerEmbedding(embedding=os.path.join(configs["speaker_diarization_path"], "models", "speechbrain"), device=config.device)
+        segments = load_model(model_size, device=config.device).transcribe(input_audio, fp16=configs.get("fp16", False), word_timestamps=True)["segments"]
+        y, sr = librosa.load(input_audio, sr=None)
+        duration = len(y) / sr
+        def segment_embedding(segment):
+            waveform, _ = audio.crop(input_audio, Segment(segment["start"], min(duration, segment["end"])))
+            return embedding_model(waveform.mean(dim=0, keepdim=True)[None] if waveform.shape[0] == 2 else waveform[None])
+        def time(secs):
+            return datetime.timedelta(seconds=round(secs))
+        def merge_audio(files_list, time_stamps, original_file_path, output_path, format):
+            def extract_number(filename):
+                match = re.search(r'_(\d+)', filename)
+                return int(match.group(1)) if match else 0
+            total_duration = len(pydub_load(original_file_path))
+            combined = AudioSegment.empty()
+            current_position = 0
+            for file, (start_i, end_i) in zip(sorted(files_list, key=extract_number), time_stamps):
+                if start_i > current_position: combined += AudioSegment.silent(duration=start_i - current_position)
+                combined += pydub_load(file)
+                current_position = end_i
+            if current_position < total_duration: combined += AudioSegment.silent(duration=total_duration - current_position)
+            combined.export(output_path, format=format)
+            return output_path
+        embeddings = np.zeros(shape=(len(segments), 192))
+        for i, segment in enumerate(segments):
+            embeddings[i] = segment_embedding(segment)
+        labels = AgglomerativeClustering(num_spk).fit(np.nan_to_num(embeddings)).labels_
+        for i in range(len(segments)):
+            segments[i]["speaker"] = 'SPEAKER ' + str(labels[i] + 1)
+        merged_segments, current_text = [], []
+        current_speaker, current_start = None, None
+        for i, segment in enumerate(segments):
+            speaker = segment["speaker"]
+            start_time = segment["start"]
+            text = segment["text"][1:]
+            if speaker == current_speaker:
+                current_text.append(text)
+                end_time = segment["end"]
+            else:
+                if current_speaker is not None: merged_segments.append({"speaker": current_speaker, "start": current_start, "end": end_time, "text": " ".join(current_text)})
+                current_speaker = speaker
+                current_start = start_time
+                current_text = [text]
+                end_time = segment["end"]
+        if current_speaker is not None: merged_segments.append({"speaker": current_speaker, "start": current_start, "end": end_time, "text": " ".join(current_text)})
+        gr_info(translations["whisper_done"])
+        x = ""
+        for segment in merged_segments:
+            x += f"\n{segment['speaker']} {str(time(segment['start']))} - {str(time(segment['end']))}\n"
+            x += segment["text"] + "\n"
+        logger.info(x)
+        gr_info(translations["process_audio"])
+        audio = pydub_load(input_audio)
+        output_folder = "audios_temp"
+        if os.path.exists(output_folder): shutil.rmtree(output_folder, ignore_errors=True)
+        for f in [output_folder, os.path.join(output_folder, "1"), os.path.join(output_folder, "2")]:
+            os.makedirs(f, exist_ok=True)
+        time_stamps, processed_segments = [], []
+        for i, segment in enumerate(merged_segments):
+            start_ms = int(segment["start"] * 1000)
+            end_ms = int(segment["end"] * 1000)
+            index = i + 1
+            segment_filename = os.path.join(output_folder, "1" if i % 2 == 1 else "2", f"segment_{index}.wav")
+            audio[start_ms:end_ms].export(segment_filename, format="wav")
+            processed_segments.append(os.path.join(output_folder, "1" if i % 2 == 1 else "2", f"segment_{index}_output.wav"))
+            time_stamps.append((start_ms, end_ms))
+        f0method, embedder_model = (method if method != "hybrid" else hybrid_method), (embedders if embedders != "custom" else custom_embedders)
+        gr_info(translations["process_done_start_convert"])
+        convert(pitch_1, filter_radius, index_strength_1, rms_mix_rate, protect, hop_length, f0method, os.path.join(output_folder, "1"), output_folder, model_pth_1, model_index_1, autotune, cleaner, clean_strength, "wav", embedder_model, resample_sr, False, f0_autotune_strength, checkpointing, onnx_f0_mode, embed_mode, formant_shifting, formant_qfrency_1, formant_timbre_1, "", proposal_pitch, proposal_pitch_threshold)
+        convert(pitch_2, filter_radius, index_strength_2, rms_mix_rate, protect, hop_length, f0method, os.path.join(output_folder, "2"), output_folder, model_pth_2, model_index_2, autotune, cleaner, clean_strength, "wav", embedder_model, resample_sr, False, f0_autotune_strength, checkpointing, onnx_f0_mode, embed_mode, formant_shifting, formant_qfrency_2, formant_timbre_2, "", proposal_pitch, proposal_pitch_threshold)
+        gr_info(translations["convert_success"])
+        return merge_audio(processed_segments, time_stamps, input_audio, output_audio.replace("wav", export_format), export_format)
+    except Exception as e:
+        gr_error(translations["error_occurred"].format(e=e))
+        import traceback
+        logger.debug(traceback.format_exc())
+        return None
+    finally:
+        if os.path.exists("audios_temp"): shutil.rmtree("audios_temp", ignore_errors=True)
+def convert_tts(clean, autotune, pitch, clean_strength, model, index, index_rate, input, output, format, method, hybrid_method, hop_length, embedders, custom_embedders, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, embedders_mode, proposal_pitch, proposal_pitch_threshold):
+    model_path = os.path.join(configs["weights_path"], model) if not os.path.exists(model) else model
+    if not model_path or not os.path.exists(model_path) or os.path.isdir(model_path) or not model.endswith((".pth", ".onnx")):
+        gr_warning(translations["provide_file"].format(filename=translations["model"]))
+        return None
+    if not input or not os.path.exists(input):
+        gr_warning(translations["input_not_valid"])
+        return None
+    if os.path.isdir(input):
+        input_audio = [f for f in os.listdir(input) if "tts" in f and f.lower().endswith(("wav", "mp3", "flac", "ogg", "opus", "m4a", "mp4", "aac", "alac", "wma", "aiff", "webm", "ac3"))]
+        if not input_audio:
+            gr_warning(translations["not_found_in_folder"])
+            return None
+        input = os.path.join(input, input_audio[0])
+    if not output:
+        gr_warning(translations["output_not_valid"])
+        return None
+    if output.lower().endswith(".wav"): output = output[:-3] + format
+    if os.path.isdir(output): output = os.path.join(output, f"tts.{format}")
+    output_dir = os.path.dirname(output)
+    if not os.path.exists(output_dir): os.makedirs(output_dir, exist_ok=True)
+    output = process_output(output)
+    f0method = method if method != "hybrid" else hybrid_method
+    embedder_model = embedders if embedders != "custom" else custom_embedders
+    gr_info(translations["convert_vocal"])
+    convert(pitch, filter_radius, index_rate, rms_mix_rate, protect, hop_length, f0method, input, output, model_path, index, autotune, clean, clean_strength, format, embedder_model, resample_sr, split_audio, f0_autotune_strength, checkpointing, onnx_f0_mode, embedders_mode, formant_shifting, formant_qfrency, formant_timbre, f0_file, proposal_pitch, proposal_pitch_threshold)
+    gr_info(translations["convert_success"])
+    return output
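
Review note on `convert_tts`: the output name is derived by swapping `wav` for the requested `format`. A plain `str.replace` rewrites every occurrence of `wav` in the path, including directory names. Below is a minimal sketch of changing only the final suffix. `swap_extension` is a hypothetical helper and not part of this commit, and unlike the code above it replaces any existing extension, not only `.wav`.

```python
import os

def swap_extension(path, new_ext):
    """Return `path` with its final extension replaced by `new_ext`.

    Hypothetical helper for illustration; not part of this commit.
    Paths without an extension (for example directories) are returned unchanged.
    """
    root, ext = os.path.splitext(path)
    if not ext:
        return path
    return f"{root}.{new_ext.lstrip('.')}"

# str.replace also rewrites the directory name, splitext does not.
naive = "audios/wav_files/tts.wav".replace("wav", "mp3")
safe = swap_extension("audios/wav_files/tts.wav", "mp3")
```

Here `naive` points into a different directory (`audios/mp3_files/`), while `safe` keeps the directory and changes only the suffix.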

main/app/core/model_utils.py ADDED

	@@ -0,0 +1,162 @@

+import os
+import sys
+import json
+import onnx
+import torch
+import datetime
+from collections import OrderedDict
+sys.path.append(os.getcwd())
+from main.app.core.ui import gr_info, gr_warning, gr_error
+from main.library.algorithm.onnx_export import onnx_exporter
+from main.app.variables import config, logger, translations, configs
+def fushion_model_pth(name, pth_1, pth_2, ratio):
+    if not name.endswith(".pth"): name = name + ".pth"
+    if not pth_1 or not os.path.exists(pth_1) or not pth_1.endswith(".pth"):
+        gr_warning(translations["provide_file"].format(filename=translations["model"] + " 1"))
+        return [translations["provide_file"].format(filename=translations["model"] + " 1"), None]
+    if not pth_2 or not os.path.exists(pth_2) or not pth_2.endswith(".pth"):
+        gr_warning(translations["provide_file"].format(filename=translations["model"] + " 2"))
+        return [translations["provide_file"].format(filename=translations["model"] + " 2"), None]
+    def extract(ckpt):
+        a = ckpt["model"]
+        opt = OrderedDict()
+        opt["weight"] = {}
+        for key in a.keys():
+            if "enc_q" in key: continue
+            opt["weight"][key] = a[key]
+        return opt
+    try:
+        ckpt1 = torch.load(pth_1, map_location="cpu", weights_only=True)
+        ckpt2 = torch.load(pth_2, map_location="cpu", weights_only=True)
+        if ckpt1["sr"] != ckpt2["sr"]:
+            gr_warning(translations["sr_not_same"])
+            return [translations["sr_not_same"], None]
+        cfg = ckpt1["config"]
+        cfg_f0 = ckpt1["f0"]
+        cfg_version = ckpt1["version"]
+        cfg_sr = ckpt1["sr"]
+        vocoder = ckpt1.get("vocoder", "Default")
+        rms_extract = ckpt1.get("energy", False)
+        ckpt1 = extract(ckpt1) if "model" in ckpt1 else ckpt1["weight"]
+        ckpt2 = extract(ckpt2) if "model" in ckpt2 else ckpt2["weight"]
+        if sorted(list(ckpt1.keys())) != sorted(list(ckpt2.keys())):
+            gr_warning(translations["architectures_not_same"])
+            return [translations["architectures_not_same"], None]
+        gr_info(translations["start"].format(start=translations["fushion_model"]))
+        opt = OrderedDict()
+        opt["weight"] = {}
+        for key in ckpt1.keys():
+            if key == "emb_g.weight" and ckpt1[key].shape != ckpt2[key].shape:
+                min_shape0 = min(ckpt1[key].shape[0], ckpt2[key].shape[0])
+                opt["weight"][key] = (ratio * (ckpt1[key][:min_shape0].float()) + (1 - ratio) * (ckpt2[key][:min_shape0].float())).half()
+            else: opt["weight"][key] = (ratio * (ckpt1[key].float()) + (1 - ratio) * (ckpt2[key].float())).half()
+        opt["config"] = cfg
+        opt["sr"] = cfg_sr
+        opt["f0"] = cfg_f0
+        opt["version"] = cfg_version
+        opt["infos"] = translations["model_fushion_info"].format(name=name, pth_1=pth_1, pth_2=pth_2, ratio=ratio)
+        opt["vocoder"] = vocoder
+        opt["energy"] = rms_extract
+        output_model = configs["weights_path"]
+        if not os.path.exists(output_model): os.makedirs(output_model, exist_ok=True)
+        torch.save(opt, os.path.join(output_model, name))
+        gr_info(translations["success"])
+        return [translations["success"], os.path.join(output_model, name)]
+    except Exception as e:
+        gr_error(message=translations["error_occurred"].format(e=e))
+        return [e, None]
+def fushion_model(name, path_1, path_2, ratio):
+    if not name:
+        gr_warning(translations["provide_name_is_save"])
+        return [translations["provide_name_is_save"], None]
+    if path_1.endswith(".pth") and path_2.endswith(".pth"): return fushion_model_pth(name.replace(".onnx", ".pth"), path_1, path_2, ratio)
+    else:
+        gr_warning(translations["format_not_valid"])
+        return [None, None]
+def onnx_export(model_path):
+    if model_path and not model_path.endswith(".pth"): model_path += ".pth"
+    if not model_path or not os.path.exists(model_path) or not model_path.endswith(".pth"): return gr_warning(translations["provide_file"].format(filename=translations["model"]))
+    try:
+        gr_info(translations["start_onnx_export"])
+        output = onnx_exporter(model_path, model_path.replace(".pth", ".onnx"), is_half=config.is_half, device=config.device)
+        gr_info(translations["success"])
+        return output
+    except Exception as e:
+        return gr_error(e)
+def model_info(path):
+    if not path or not os.path.exists(path) or os.path.isdir(path) or not path.endswith((".pth", ".onnx")): return gr_warning(translations["provide_file"].format(filename=translations["model"]))
+    def prettify_date(date_str):
+        if date_str == translations["not_found_create_time"]: return None
+        try:
+            return datetime.datetime.strptime(date_str, "%Y-%m-%dT%H:%M:%S.%f").strftime("%Y-%m-%d %H:%M:%S")
+        except ValueError as e:
+            logger.debug(e)
+            return translations["format_not_valid"]
+    if path.endswith(".pth"): model_data = torch.load(path, map_location=torch.device("cpu"))
+    else:
+        model = onnx.load(path)
+        model_data = {}
+        for prop in model.metadata_props:
+            if prop.key == "model_info":
+                model_data = json.loads(prop.value)
+                break
+    gr_info(translations["read_info"])
+    epochs = model_data.get("epoch", None)
+    if epochs is None:
+        # Fall back to the "info" field and accept values such as "100epoch" or "100e".
+        epochs = model_data.get("info", None)
+        if epochs is None or not str(epochs).replace("epoch", "").replace("e", "").isdigit(): epochs = translations["not_found"].format(name=translations["epoch"])
+    steps = model_data.get("step", translations["not_found"].format(name=translations["step"]))
+    sr = model_data.get("sr", translations["not_found"].format(name=translations["sr"]))
+    f0 = model_data.get("f0", translations["not_found"].format(name=translations["f0"]))
+    version = model_data.get("version", translations["not_found"].format(name=translations["version"]))
+    creation_date = model_data.get("creation_date", translations["not_found_create_time"])
+    model_hash = model_data.get("model_hash", translations["not_found"].format(name="model_hash"))
+    pitch_guidance = translations["trained_f0"] if f0 else translations["not_f0"]
+    creation_date_str = prettify_date(creation_date) if creation_date else translations["not_found_create_time"]
+    model_name = model_data.get("model_name", translations["unregistered"])
+    model_author = model_data.get("author", translations["not_author"])
+    vocoder = model_data.get("vocoder", "Default")
+    rms_extract = model_data.get("energy", False)
+    gr_info(translations["success"])
+    return translations["model_info"].format(model_name=model_name, model_author=model_author, epochs=epochs, steps=steps, version=version, sr=sr, pitch_guidance=pitch_guidance, model_hash=model_hash, creation_date_str=creation_date_str, vocoder=vocoder, rms_extract=rms_extract)
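
Review note on `fushion_model_pth`: the fusion is a per-key linear interpolation, `ratio * w1 + (1 - ratio) * w2`, after checking that both checkpoints expose the same keys. The sketch below shows that arithmetic with plain Python lists standing in for tensors. `blend_weights` is a hypothetical name, and the sketch truncates every key to the shorter length, whereas the code above does so only for `emb_g.weight`.

```python
def blend_weights(weights_1, weights_2, ratio):
    """Linearly interpolate two weight dicts key by key.

    Hypothetical helper for illustration; lists of floats stand in for tensors.
    """
    if sorted(weights_1) != sorted(weights_2):
        raise ValueError("architectures differ: key sets do not match")
    blended = {}
    for key, values_1 in weights_1.items():
        values_2 = weights_2[key]
        # Truncate to the shorter length (the diff does this only for emb_g.weight).
        n = min(len(values_1), len(values_2))
        blended[key] = [
            ratio * a + (1 - ratio) * b
            for a, b in zip(values_1[:n], values_2[:n])
        ]
    return blended

mixed = blend_weights({"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}, 0.5)
```

With `ratio=1.0` the result equals the first model's weights, and with `ratio=0.0` it equals the second's.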

main/app/core/presets.py ADDED

	@@ -0,0 +1,165 @@

+import os
+import sys
+import json
+sys.path.append(os.getcwd())
+from main.app.variables import translations, configs
+from main.app.core.ui import gr_info, gr_warning, change_preset_choices, change_effect_preset_choices
+def load_presets(presets, cleaner, autotune, pitch, clean_strength, index_strength, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, formant_shifting, formant_qfrency, formant_timbre):
+    if not presets: gr_warning(translations["provide_file_settings"])
+    file = {}
+    if presets:
+        with open(os.path.join(configs["presets_path"], presets)) as f:
+            file = json.load(f)
+        gr_info(translations["load_presets"].format(presets=presets))
+    return [file.get("cleaner", cleaner), file.get("autotune", autotune), file.get("pitch", pitch), file.get("clean_strength", clean_strength), file.get("index_strength", index_strength), file.get("resample_sr", resample_sr), file.get("filter_radius", filter_radius), file.get("rms_mix_rate", rms_mix_rate), file.get("protect", protect), file.get("split_audio", split_audio), file.get("f0_autotune_strength", f0_autotune_strength), file.get("formant_shifting", formant_shifting), file.get("formant_qfrency", formant_qfrency), file.get("formant_timbre", formant_timbre)]
+def save_presets(name, cleaner, autotune, pitch, clean_strength, index_strength, resample_sr, filter_radius, rms_mix_rate, protect, split_audio, f0_autotune_strength, cleaner_chbox, autotune_chbox, pitch_chbox, index_strength_chbox, resample_sr_chbox, filter_radius_chbox, rms_mix_rate_chbox, protect_chbox, split_audio_chbox, formant_shifting_chbox, formant_shifting, formant_qfrency, formant_timbre):
+    if not name: return gr_warning(translations["provide_filename_settings"])
+    if not any([cleaner_chbox, autotune_chbox, pitch_chbox, index_strength_chbox, resample_sr_chbox, filter_radius_chbox, rms_mix_rate_chbox, protect_chbox, split_audio_chbox, formant_shifting_chbox]): return gr_warning(translations["choose1"])
+    settings = {}
+    for checkbox, data in [(cleaner_chbox, {"cleaner": cleaner, "clean_strength": clean_strength}), (autotune_chbox, {"autotune": autotune, "f0_autotune_strength": f0_autotune_strength}), (pitch_chbox, {"pitch": pitch}), (index_strength_chbox, {"index_strength": index_strength}), (resample_sr_chbox, {"resample_sr": resample_sr}), (filter_radius_chbox, {"filter_radius": filter_radius}), (rms_mix_rate_chbox, {"rms_mix_rate": rms_mix_rate}), (protect_chbox, {"protect": protect}), (split_audio_chbox, {"split_audio": split_audio}), (formant_shifting_chbox, {"formant_shifting": formant_shifting, "formant_qfrency": formant_qfrency, "formant_timbre": formant_timbre})]:
+        if checkbox: settings.update(data)
+    with open(os.path.join(configs["presets_path"], name + ".conversion.json"), "w") as f:
+        json.dump(settings, f, indent=4)
+    gr_info(translations["export_settings"].format(name=name))
+    return change_preset_choices()
+def audio_effect_load_presets(presets, resample_checkbox, audio_effect_resample_sr, chorus_depth, chorus_rate_hz, chorus_mix, chorus_centre_delay_ms, chorus_feedback, distortion_drive_db, reverb_room_size, reverb_damping, reverb_wet_level, reverb_dry_level, reverb_width, reverb_freeze_mode, pitch_shift_semitones, delay_second, delay_feedback, delay_mix, compressor_threshold_db, compressor_ratio, compressor_attack_ms, compressor_release_ms, limiter_threshold_db, limiter_release_ms, gain_db, bitcrush_bit_depth, clipping_threshold_db, phaser_rate_hz, phaser_depth, phaser_centre_frequency_hz, phaser_feedback, phaser_mix, bass_boost, bass_frequency, treble_boost, treble_frequency, fade_in, fade_out, chorus_check_box, distortion_checkbox, reverb_check_box, delay_check_box, compressor_check_box, limiter, gain_checkbox, bitcrush_checkbox, clipping_checkbox, phaser_check_box, bass_or_treble, fade):
+    if not presets: gr_warning(translations["provide_file_settings"])
+    file = {}
+    if presets:
+        with open(os.path.join(configs["presets_path"], presets)) as f:
+            file = json.load(f)
+        gr_info(translations["load_presets"].format(presets=presets))
+    return [
+        file.get("resample_checkbox", resample_checkbox), file.get("audio_effect_resample_sr", audio_effect_resample_sr),
+        file.get("chorus_depth", chorus_depth), file.get("chorus_rate_hz", chorus_rate_hz),
+        file.get("chorus_mix", chorus_mix), file.get("chorus_centre_delay_ms", chorus_centre_delay_ms),
+        file.get("chorus_feedback", chorus_feedback), file.get("distortion_drive_db", distortion_drive_db),
+        file.get("reverb_room_size", reverb_room_size), file.get("reverb_damping", reverb_damping),
+        file.get("reverb_wet_level", reverb_wet_level), file.get("reverb_dry_level", reverb_dry_level),
+        file.get("reverb_width", reverb_width), file.get("reverb_freeze_mode", reverb_freeze_mode),
+        file.get("pitch_shift_semitones", pitch_shift_semitones), file.get("delay_second", delay_second),
+        file.get("delay_feedback", delay_feedback), file.get("delay_mix", delay_mix),
+        file.get("compressor_threshold_db", compressor_threshold_db), file.get("compressor_ratio", compressor_ratio),
+        file.get("compressor_attack_ms", compressor_attack_ms), file.get("compressor_release_ms", compressor_release_ms),
+        file.get("limiter_threshold_db", limiter_threshold_db), file.get("limiter_release_ms", limiter_release_ms),
+        file.get("gain_db", gain_db), file.get("bitcrush_bit_depth", bitcrush_bit_depth),
+        file.get("clipping_threshold_db", clipping_threshold_db), file.get("phaser_rate_hz", phaser_rate_hz),
+        file.get("phaser_depth", phaser_depth), file.get("phaser_centre_frequency_hz", phaser_centre_frequency_hz),
+        file.get("phaser_feedback", phaser_feedback), file.get("phaser_mix", phaser_mix),
+        file.get("bass_boost", bass_boost), file.get("bass_frequency", bass_frequency),
+        file.get("treble_boost", treble_boost), file.get("treble_frequency", treble_frequency),
+        file.get("fade_in", fade_in), file.get("fade_out", fade_out),
+        file.get("chorus_check_box", chorus_check_box), file.get("distortion_checkbox", distortion_checkbox),
+        file.get("reverb_check_box", reverb_check_box), file.get("delay_check_box", delay_check_box),
+        file.get("compressor_check_box", compressor_check_box), file.get("limiter", limiter),
+        file.get("gain_checkbox", gain_checkbox), file.get("bitcrush_checkbox", bitcrush_checkbox),
+        file.get("clipping_checkbox", clipping_checkbox), file.get("phaser_check_box", phaser_check_box),
+        file.get("bass_or_treble", bass_or_treble), file.get("fade", fade)
+    ]
+def audio_effect_save_presets(name, resample_checkbox, audio_effect_resample_sr, chorus_depth, chorus_rate_hz, chorus_mix, chorus_centre_delay_ms, chorus_feedback, distortion_drive_db, reverb_room_size, reverb_damping, reverb_wet_level, reverb_dry_level, reverb_width, reverb_freeze_mode, pitch_shift_semitones, delay_second, delay_feedback, delay_mix, compressor_threshold_db, compressor_ratio, compressor_attack_ms, compressor_release_ms, limiter_threshold_db, limiter_release_ms, gain_db, bitcrush_bit_depth, clipping_threshold_db, phaser_rate_hz, phaser_depth, phaser_centre_frequency_hz, phaser_feedback, phaser_mix, bass_boost, bass_frequency, treble_boost, treble_frequency, fade_in, fade_out, chorus_check_box, distortion_checkbox, reverb_check_box, delay_check_box, compressor_check_box, limiter, gain_checkbox, bitcrush_checkbox, clipping_checkbox, phaser_check_box, bass_or_treble, fade):
+    if not name: return gr_warning(translations["provide_filename_settings"])
+    if not any([resample_checkbox, chorus_check_box, distortion_checkbox, reverb_check_box, delay_check_box, compressor_check_box, limiter, gain_checkbox, bitcrush_checkbox, clipping_checkbox, phaser_check_box, bass_or_treble, fade, pitch_shift_semitones != 0]): return gr_warning(translations["choose1"])
+    settings = {}
+    for checkbox, data in [
+        (resample_checkbox, {
+            "resample_checkbox": resample_checkbox,
+            "audio_effect_resample_sr": audio_effect_resample_sr
+        }),
+        (chorus_check_box, {
+            "chorus_check_box": chorus_check_box,
+            "chorus_depth": chorus_depth,
+            "chorus_rate_hz": chorus_rate_hz,
+            "chorus_mix": chorus_mix,
+            "chorus_centre_delay_ms": chorus_centre_delay_ms,
+            "chorus_feedback": chorus_feedback
+        }),
+        (distortion_checkbox, {
+            "distortion_checkbox": distortion_checkbox,
+            "distortion_drive_db": distortion_drive_db
+        }),
+        (reverb_check_box, {
+            "reverb_check_box": reverb_check_box,
+            "reverb_room_size": reverb_room_size,
+            "reverb_damping": reverb_damping,
+            "reverb_wet_level": reverb_wet_level,
+            "reverb_dry_level": reverb_dry_level,
+            "reverb_width": reverb_width,
+            "reverb_freeze_mode": reverb_freeze_mode
+        }),
+        (pitch_shift_semitones != 0, {
+            "pitch_shift_semitones": pitch_shift_semitones
+        }),
+        (delay_check_box, {
+            "delay_check_box": delay_check_box,
+            "delay_second": delay_second,
+            "delay_feedback": delay_feedback,
+            "delay_mix": delay_mix
+        }),
+        (compressor_check_box, {
+            "compressor_check_box": compressor_check_box,
+            "compressor_threshold_db": compressor_threshold_db,
+            "compressor_ratio": compressor_ratio,
+            "compressor_attack_ms": compressor_attack_ms,
+            "compressor_release_ms": compressor_release_ms
+        }),
+        (limiter, {
+            "limiter": limiter,
+            "limiter_threshold_db": limiter_threshold_db,
+            "limiter_release_ms": limiter_release_ms
+        }),
+        (gain_checkbox, {
+            "gain_checkbox": gain_checkbox,
+            "gain_db": gain_db
+        }),
+        (bitcrush_checkbox, {
+            "bitcrush_checkbox": bitcrush_checkbox,
+            "bitcrush_bit_depth": bitcrush_bit_depth
+        }),
+        (clipping_checkbox, {
+            "clipping_checkbox": clipping_checkbox,
+            "clipping_threshold_db": clipping_threshold_db
+        }),
+        (phaser_check_box, {
+            "phaser_check_box": phaser_check_box,
+            "phaser_rate_hz": phaser_rate_hz,
+            "phaser_depth": phaser_depth,
+            "phaser_centre_frequency_hz": phaser_centre_frequency_hz,
+            "phaser_feedback": phaser_feedback,
+            "phaser_mix": phaser_mix
+        }),
+        (bass_or_treble, {
+            "bass_or_treble": bass_or_treble,
+            "bass_boost": bass_boost,
+            "bass_frequency": bass_frequency,
+            "treble_boost": treble_boost,
+            "treble_frequency": treble_frequency
+        }),
+        (fade, {
+            "fade": fade,
+            "fade_in": fade_in,
+            "fade_out": fade_out
+        })
+    ]:
+        if checkbox: settings.update(data)
+    with open(os.path.join(configs["presets_path"], name + ".effect.json"), "w") as f:
+        json.dump(settings, f, indent=4)
+    gr_info(translations["export_settings"].format(name=name))
+    return change_effect_preset_choices()
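
Review note on the preset functions: saving writes only the setting groups whose checkbox is ticked, and loading uses `dict.get` so that any key missing from the file keeps its current UI value. A minimal round-trip sketch of that behaviour follows. `save_preset`, `load_preset`, and the file name are hypothetical, and the Gradio notifications and `configs["presets_path"]` lookup are left out.

```python
import json
import os
import tempfile

def save_preset(path, groups):
    """Write only the settings whose checkbox flag is truthy.

    `groups` is a list of (enabled, settings_dict) pairs.
    """
    settings = {}
    for enabled, data in groups:
        if enabled:
            settings.update(data)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=4)
    return settings

def load_preset(path, current):
    """Overlay saved values on `current`, keeping the current value for missing keys."""
    with open(path, encoding="utf-8") as f:
        saved = json.load(f)
    return {key: saved.get(key, value) for key, value in current.items()}

with tempfile.TemporaryDirectory() as tmp:
    preset_path = os.path.join(tmp, "demo.conversion.json")
    # "pitch" is ticked and saved; "protect" is unticked and left out of the file.
    save_preset(preset_path, [(True, {"pitch": 12}), (False, {"protect": 0.1})])
    result = load_preset(preset_path, {"pitch": 0, "protect": 0.5})
```

After the round trip, `pitch` comes from the file and `protect` keeps the value that was already set.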