priteshmistry committed on
Commit
91d5d27
·
verified ·
1 Parent(s): 8073d84

Update README.md

Files changed (1)
  1. README.md +8 -530
README.md CHANGED
@@ -1,530 +1,8 @@
1
- # 📚 ebook2audiobook
2
- CPU/GPU Converter from eBooks to audiobooks with chapters and metadata<br/>
3
- using XTTSv2, Bark, Vits, Fairseq, YourTTS, Tacotron and more. Supports voice cloning and +1110 languages!
4
- > [!IMPORTANT]
5
- **This tool is intended for use with non-DRM, legally acquired eBooks only.** <br>
6
- The authors are not responsible for any misuse of this software or any resulting legal consequences. <br>
7
- Use this tool responsibly and in accordance with all applicable laws.
8
-
9
- [![Discord](https://dcbadge.limes.pink/api/server/https://discord.gg/63Tv3F65k6)](https://discord.gg/63Tv3F65k6)
10
-
11
- ### Thanks for supporting the ebook2audiobook developers!
12
- [![Ko-Fi](https://img.shields.io/badge/Ko--fi-F16061?style=for-the-badge&logo=ko-fi&logoColor=white)](https://ko-fi.com/athomasson2)
13
-
14
- ### Run locally
15
-
16
- [![Quick Start](https://img.shields.io/badge/Quick%20Start-blue?style=for-the-badge)](#launching-gradio-web-interface)
17
-
18
- [![Docker Build](https://github.com/DrewThomasson/ebook2audiobook/actions/workflows/Docker-Build.yml/badge.svg)](https://github.com/DrewThomasson/ebook2audiobook/actions/workflows/Docker-Build.yml) [![Download](https://img.shields.io/badge/Download-Now-blue.svg)](https://github.com/DrewThomasson/ebook2audiobook/releases/latest)
19
-
20
-
21
- <a href="https://github.com/DrewThomasson/ebook2audiobook">
22
- <img src="https://img.shields.io/badge/Platform-mac%20|%20linux%20|%20windows-lightgrey" alt="Platform">
23
- </a><a href="https://hub.docker.com/r/athomasson2/ebook2audiobook">
24
- <img alt="Docker Pull Count" src="https://img.shields.io/docker/pulls/athomasson2/ebook2audiobook.svg"/>
25
- </a>
26
-
27
- ### Run Remotely
28
- [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/ebook2audiobook)
29
- [![Free Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrewThomasson/ebook2audiobook/blob/main/Notebooks/colab_ebook2audiobook.ipynb) [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=flat&logo=kaggle&logoColor=white)](https://github.com/Rihcus/ebook2audiobookXTTS/blob/main/Notebooks/kaggle-ebook2audiobook.ipynb)
30
-
31
- #### GUI Interface
32
- ![demo_web_gui](assets/demo_web_gui.gif)
33
-
34
- <details>
35
- <summary>Click to see images of Web GUI</summary>
36
- <img width="1728" alt="GUI Screen 1" src="assets/gui_1.png">
37
- <img width="1728" alt="GUI Screen 2" src="assets/gui_2.png">
38
- <img width="1728" alt="GUI Screen 3" src="assets/gui_3.png">
39
- </details>
40
-
41
- ## Demos
42
-
43
- **New Default Voice Demo**
44
-
45
- https://github.com/user-attachments/assets/750035dc-e355-46f1-9286-05c1d9e88cea
46
-
47
- <details>
48
- <summary>More Demos</summary>
49
-
50
- **ASMR Voice**
51
-
52
- https://github.com/user-attachments/assets/68eee9a1-6f71-4903-aacd-47397e47e422
53
-
54
- **Rainy Day Voice**
55
-
56
- https://github.com/user-attachments/assets/d25034d9-c77f-43a9-8f14-0d167172b080
57
-
58
- **Scarlett Voice**
59
-
60
- https://github.com/user-attachments/assets/b12009ee-ec0d-45ce-a1ef-b3a52b9f8693
61
-
62
- **David Attenborough Voice**
63
-
64
- https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921
65
-
66
- **Example**
67
-
68
- ![Example](https://github.com/DrewThomasson/VoxNovel/blob/dc5197dff97252fa44c391dc0596902d71278a88/readme_files/example_in_app.jpeg)
69
- </details>
70
-
71
- ## README.md
72
-
73
- ## Table of Contents
74
- - [ebook2audiobook](#-ebook2audiobook)
75
- - [Features](#features)
76
- - [GUI Interface](#gui-interface)
77
- - [Demos](#demos)
78
- - [Supported Languages](#supported-languages)
79
- - [Minimum Requirements](#hardware-requirements)
80
- - [Usage](#launching-gradio-web-interface)
81
- - [Run Locally](#launching-gradio-web-interface)
82
- - [Launching Gradio Web Interface](#launching-gradio-web-interface)
83
- - [Basic Headless Usage](#basic-usage)
84
- - [Headless Custom XTTS Model Usage](#example-of-custom-model-zip-upload)
85
- - [Help command output](#help-command-output)
86
- - [Run Remotely](#run-remotely)
87
- - [Fine Tuned TTS models](#fine-tuned-tts-models)
88
- - [Collection of Fine-Tuned TTS Models](#fine-tuned-tts-collection)
89
- - [Train XTTSv2](#fine-tune-your-own-xttsv2-model)
90
- - [Docker](#docker-gpu-options)
91
- - [GPU options](#docker-gpu-options)
92
- - [Docker Run](#running-the-pre-built-docker-container)
93
- - [Docker Build](#building-the-docker-container)
94
- - [Docker Compose](#docker-compose)
95
- - [Docker headless guide](#docker-headless-guide)
96
- - [Docker container file locations](#docker-container-file-locations)
97
- - [Common Docker issues](#common-docker-issues)
98
- - [Supported eBook Formats](#supported-ebook-formats)
99
- - [Output Formats](#output-formats)
100
- - [Updating to Latest Version](#updating-to-latest-version)
101
- - [Revert to older Version](#reverting-to-older-versions)
102
- - [Common Issues](#common-issues)
103
- - [Special Thanks](#special-thanks)
104
- - [Table of Contents](#table-of-contents)
105
-
106
-
107
- ## Features
108
- - 📚 Splits eBook into chapters for organized audio.
109
- - 🎙️ High-quality text-to-speech with [Coqui XTTSv2](https://huggingface.co/coqui/XTTS-v2) and [Fairseq](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) (and more).
110
- - 🗣️ Optional voice cloning with your own voice file.
111
- - 🌍 Supports +1110 languages (English by default). [List of Supported languages](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
112
- - 🖥️ Designed to run on 4GB RAM.
113
-
114
-
115
- ## Supported Languages
116
- | **Arabic (ar)** | **Chinese (zh)** | **English (en)** | **Spanish (es)** |
117
- |:------------------:|:------------------:|:------------------:|:------------------:|
118
- | **French (fr)** | **German (de)** | **Italian (it)** | **Portuguese (pt)** |
119
- | **Polish (pl)** | **Turkish (tr)** | **Russian (ru)** | **Dutch (nl)** |
120
- | **Czech (cs)** | **Japanese (ja)** | **Hindi (hi)** | **Bengali (bn)** |
121
- | **Hungarian (hu)** | **Korean (ko)** | **Vietnamese (vi)**| **Swedish (sv)** |
122
- | **Persian (fa)** | **Yoruba (yo)** | **Swahili (sw)** | **Indonesian (id)**|
123
- | **Slovak (sk)** | **Croatian (hr)** | **Tamil (ta)** | **Danish (da)** |
124
- - [**+1100 languages and dialects here**](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
125
-
126
-
127
- ## Hardware Requirements
128
- - 4GB RAM minimum, 8GB recommended
129
- - Virtualization enabled if running on Windows (Docker only)
130
- - CPU (Intel, AMD, ARM), GPU (NVIDIA, AMD*, Intel*) (Recommended), MPS (Apple Silicon CPU)
131
- *available very soon
132
-
133
- > [!IMPORTANT]
134
- **Before posting an install or bug issue, search the open and closed issues tabs carefully<br>
135
- to make sure your issue does not already exist.**
136
-
137
-
138
- >[!NOTE]
139
- **eBooks lack any standard structure defining what is a chapter, paragraph, preface, etc.,<br>
140
- so you should first manually remove any text you don't want converted to audio.**
141
-
142
- ### Installation Instructions
143
- 1. **Clone repo**
144
- ```bash
145
- git clone https://github.com/DrewThomasson/ebook2audiobook.git
146
- cd ebook2audiobook
147
- ```
148
-
149
- ### Launching Gradio Web Interface
150
- 1. **Run ebook2audiobook**:
151
- - **Linux/MacOS**
152
- ```bash
153
- ./ebook2audiobook.sh # Run launch script
154
- ```
155
-
156
- - **Mac Launcher**
157
- Double click `Mac Ebook2Audiobook Launcher.command`
158
-
159
-
160
- - **Windows**
161
- ```bash
162
- ebook2audiobook.cmd # Run launch script or double click on it
163
- ```
164
-
165
- - **Windows Launcher**
166
- Double click `ebook2audiobook.cmd`
167
-
168
-
169
- - **Manual Python Install**
170
- ```bash
171
- # (for experts only!)
172
- REQUIRED_PROGRAMS=("calibre" "ffmpeg" "nodejs" "mecab" "espeak-ng" "rust" "sox")
173
- REQUIRED_PYTHON_VERSION="3.12"
174
- pip install -r requirements.txt # Install Python Requirements
175
- python app.py # Run Ebook2Audiobook
176
- ```
177
-
178
- 1. **Open the Web App**: Click the URL provided in the terminal to access the web app and convert eBooks. `http://localhost:7860/`
179
- 2. **For Public Link**:
180
- `python app.py --share` (all OS)
181
- `./ebook2audiobook.sh --share` (Linux/MacOS)
182
- `ebook2audiobook.cmd --share` (Windows)
183
-
184
- > [!IMPORTANT]
185
- **If the script is stopped and run again, you need to refresh your gradio GUI interface<br>
186
- to let the web page reconnect to the new connection socket.**
187
-
188
- ### Basic Usage
189
- - **Linux/MacOS**:
190
- ```bash
191
- ./ebook2audiobook.sh --headless --ebook <path_to_ebook_file> \
192
- --voice [path_to_voice_file] --language [language_code]
193
- ```
194
- - **Windows**
195
- ```bash
196
- ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> ^
197
- --voice [path_to_voice_file] --language [language_code]
198
- ```
199
-
200
- - **[--ebook]**: Path to your eBook file
201
- - **[--voice]**: Voice cloning file path (optional)
202
- - **[--language]**: Language code in ISO-639-3 (e.g. ita for Italian, eng for English, deu for German...).<br>
203
- The default language is eng, as set in ./lib/lang.py, so --language is optional.<br>
204
- Two-letter ISO-639-1 codes are also supported.
205
-
206
-
207
- ### Example of Custom Model Zip Upload
208
- (must be a .zip file containing the mandatory model files. Example for XTTSv2: config.json, model.pth, vocab.json and ref.wav)
209
- - **Linux/MacOS**
210
- ```bash
211
- ./ebook2audiobook.sh --headless --ebook <ebook_file_path> \
212
- --voice <target_voice_file_path> --language <language> --custom_model <custom_model_path>
213
- ```
214
- - **Windows**
215
- ```bash
216
- ebook2audiobook.cmd --headless --ebook <ebook_file_path> ^
217
- --voice <target_voice_file_path> --language <language> --custom_model <custom_model_path>
218
- ```
219
- - **<custom_model_path>**: Path to `model_name.zip` file,
220
- which must contain (according to the tts engine) all the mandatory files<br>
221
- (see ./lib/models.py).
222
-
223
-
224
- ### Detailed Guide with the List of All Parameters
225
- - **Linux/MacOS**
226
- ```bash
227
- ./ebook2audiobook.sh --help
228
- ```
229
- - **Windows**
230
- ```bash
231
- ebook2audiobook.cmd --help
232
- ```
233
- - **Or for all OS**
234
- ```bash
235
- python app.py --help
236
- ```
237
-
238
- <a id="help-command-output"></a>
239
- ```bash
240
- usage: app.py [-h] [--session SESSION] [--share] [--headless] [--ebook EBOOK]
241
- [--ebooks_dir EBOOKS_DIR] [--language LANGUAGE] [--voice VOICE]
242
- [--device {cpu,gpu,mps}]
243
- [--tts_engine {XTTSv2,BARK,VITS,FAIRSEQ,TACOTRON2,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}]
244
- [--custom_model CUSTOM_MODEL] [--fine_tuned FINE_TUNED]
245
- [--output_format OUTPUT_FORMAT] [--temperature TEMPERATURE]
246
- [--length_penalty LENGTH_PENALTY] [--num_beams NUM_BEAMS]
247
- [--repetition_penalty REPETITION_PENALTY] [--top_k TOP_K]
248
- [--top_p TOP_P] [--speed SPEED] [--enable_text_splitting]
249
- [--text_temp TEXT_TEMP] [--waveform_temp WAVEFORM_TEMP]
250
- [--output_dir OUTPUT_DIR] [--version]
251
-
252
- Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the Gradio interface or run the script in headless mode for direct conversion.
253
-
254
- options:
255
- -h, --help show this help message and exit
256
- --session SESSION Session to resume the conversion in case of interruption, crash,
257
- or reuse of custom models and custom cloning voices.
258
-
259
- **** The following options are for all modes:
260
- Optional
261
-
262
- **** The following option are for gradio/gui mode only:
263
- Optional
264
-
265
- --share Enable a public shareable Gradio link.
266
-
267
- **** The following options are for --headless mode only:
268
- --headless Run the script in headless mode
269
- --ebook EBOOK Path to the ebook file for conversion. Cannot be used when --ebooks_dir is present.
270
- --ebooks_dir EBOOKS_DIR
271
- Relative or absolute path of the directory containing the files to convert.
272
- Cannot be used when --ebook is present.
273
- --language LANGUAGE Language of the e-book. Default language is set
274
- in ./lib/lang.py sed as default if not present. All compatible language codes are in ./lib/lang.py
275
-
276
- optional parameters:
277
- --voice VOICE (Optional) Path to the voice cloning file for TTS engine.
278
- Uses the default voice if not present.
279
- --device {cpu,gpu,mps}
280
- (Optional) Pprocessor unit type for the conversion.
281
- Default is set in ./lib/conf.py if not present. Fall back to CPU if GPU not available.
282
- --tts_engine {XTTSv2,BARK,VITS,FAIRSEQ,TACOTRON2,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}
283
- (Optional) Preferred TTS engine (available are: ['XTTSv2', 'BARK', 'VITS', 'FAIRSEQ', 'TACOTRON2', 'YOURTTS', 'xtts', 'bark', 'vits', 'fairseq', 'tacotron', 'yourtts'].
284
- Default depends on the selected language. The tts engine should be compatible with the chosen language
285
- --custom_model CUSTOM_MODEL
286
- (Optional) Path to the custom model zip file cntaining mandatory model files.
287
- Please refer to ./lib/models.py
288
- --fine_tuned FINE_TUNED
289
- (Optional) Fine tuned model path. Default is builtin model.
290
- --output_format OUTPUT_FORMAT
291
- (Optional) Output audio format. Default is set in ./lib/conf.py
292
- --temperature TEMPERATURE
293
- (xtts only, optional) Temperature for the model.
294
- Default to config.json model. Higher temperatures lead to more creative outputs.
295
- --length_penalty LENGTH_PENALTY
296
- (xtts only, optional) A length penalty applied to the autoregressive decoder.
297
- Default to config.json model. Not applied to custom models.
298
- --num_beams NUM_BEAMS
299
- (xtts only, optional) Controls how many alternative sequences the model explores. Must be equal or greater than length penalty.
300
- Default to config.json model.
301
- --repetition_penalty REPETITION_PENALTY
302
- (xtts only, optional) A penalty that prevents the autoregressive decoder from repeating itself.
303
- Default to config.json model.
304
- --top_k TOP_K (xtts only, optional) Top-k sampling.
305
- Lower values mean more likely outputs and increased audio generation speed.
306
- Default to config.json model.
307
- --top_p TOP_P (xtts only, optional) Top-p sampling.
308
- Lower values mean more likely outputs and increased audio generation speed. Default to config.json model.
309
- --speed SPEED (xtts only, optional) Speed factor for the speech generation.
310
- Default to config.json model.
311
- --enable_text_splitting
312
- (xtts only, optional) Enable TTS text splitting. This option is known to not be very efficient.
313
- Default to config.json model.
314
- --text_temp TEXT_TEMP
315
- (bark only, optional) Text Temperature for the model.
316
- Default to 0.85. Higher temperatures lead to more creative outputs.
317
- --waveform_temp WAVEFORM_TEMP
318
- (bark only, optional) Waveform Temperature for the model.
319
- Default to 0.5. Higher temperatures lead to more creative outputs.
320
- --output_dir OUTPUT_DIR
321
- (Optional) Path to the output directory. Default is set in ./lib/conf.py
322
- --version Show the version of the script and exit
323
-
324
- Example usage:
325
- Windows:
326
- Gradio/GUI:
327
- ebook2audiobook.cmd
328
- Headless mode:
329
- ebook2audiobook.cmd --headless --ebook '/path/to/file'
330
- Linux/Mac:
331
- Gradio/GUI:
332
- ./ebook2audiobook.sh
333
- Headless mode:
334
- ./ebook2audiobook.sh --headless --ebook '/path/to/file'
335
-
336
- Tip: to add of silence (1.4 seconds) into your text just use "###" or "[pause]".
337
-
338
- ```
339
-
340
- NOTE: in gradio/gui mode, to cancel a running conversion, just click on the [X] from the ebook upload component.
341
-
342
- TIP: if the audio needs more pauses, add '###' or '[pause]' between the words where you want a longer pause. One [pause] equals 1.4 seconds.
343
-
344
- #### Docker GPU Options
345
-
346
- Available pre-built tags: `latest` (CUDA 11.8)
347
- #### Edit: If the GPU isn't detected, you'll have to build the image yourself -> [Building the Docker Container](#building-the-docker-container)
348
-
349
-
350
-
351
- #### Running the pre-built Docker Container
352
-
353
- - Run with CPU only
354
- ```powershell
355
- docker run --pull always --rm -p 7860:7860 athomasson2/ebook2audiobook
356
- ```
357
- - Run with GPU Speedup (NVIDIA compatible only)
358
- ```powershell
359
- docker run --pull always --rm --gpus all -p 7860:7860 athomasson2/ebook2audiobook
360
- ```
361
-
362
- This command starts the Gradio interface on port 7860 (localhost:7860).
363
- - For more options add the parameter `--help`
364
-
365
-
366
- #### Building the Docker Container
367
- - You can build the docker image with the command:
368
- ```powershell
369
- docker build -t athomasson2/ebook2audiobook .
370
- ```
371
- #### Available Docker Build Arguments
372
-
373
- `--build-arg TORCH_VERSION=cuda118` Available tags: [cuda121, cuda118, cuda128, rocm, xpu, cpu]
374
-
375
- All CUDA version numbers should work, Ex: CUDA 11.6-> cuda116
376
-
377
- `--build-arg SKIP_XTTS_TEST=true` (Saves space by not baking XTTSv2 model into docker image)
378
-
379
-
380
- ## Docker container file locations
381
- All ebook2audiobook paths inside the container have the base dir `/app/`
382
- For example:
383
- `tmp` = `/app/tmp`
384
- `audiobooks` = `/app/audiobooks`
385
-
386
-
387
- ## Docker headless guide
388
-
389
- - Before running this, create a dir named "input-folder" in your current dir;
390
- it will be mounted into the container, and this is where you put your input files for the docker image to see
391
- ```bash
392
- mkdir input-folder && mkdir audiobooks
393
- ```
394
- - In the command below, replace **YOUR_EBOOK_FILE** with the name of your input file
395
- ```bash
396
- docker run --pull always --rm \
397
- -v $(pwd)/input-folder:/app/input_folder \
398
- -v $(pwd)/audiobooks:/app/audiobooks \
399
- athomasson2/ebook2audiobook \
400
- --headless --ebook /app/input_folder/YOUR_EBOOK_FILE
401
- ```
402
- - The output audiobooks will be found in the `audiobooks` folder, located
403
- in the local directory where you ran this docker command
404
-
405
-
406
- ## To see the help output for the program's other parameters, run:
407
-
408
- ```bash
409
- docker run --pull always --rm athomasson2/ebook2audiobook --help
410
-
411
- ```
412
- That will output this
413
- [Help command output](#help-command-output)
414
-
415
-
416
- ### Docker Compose
417
- This project uses Docker Compose to run locally. You can enable or disable GPU support
418
- by setting either `*gpu-enabled` or `*gpu-disabled` in `docker-compose.yml`
419
-
420
-
421
- #### Steps to Run
422
- 1. **Clone the Repository** (if you haven't already):
423
- ```bash
424
- git clone https://github.com/DrewThomasson/ebook2audiobook.git
425
- cd ebook2audiobook
426
- ```
427
- 2. **Set GPU Support (disabled by default)**
428
- To enable GPU support, modify `docker-compose.yml` and change `*gpu-disabled` to `*gpu-enabled`
429
- 3. **Start the service:**
430
- ```bash
431
- # Docker
432
- docker-compose up -d # To update add --build
433
-
434
- # Podman
435
- podman compose -f podman-compose.yml up -d # To update add --build
436
- ```
437
- 4. **Access the service:**
438
- The service will be available at http://localhost:7860.
439
-
440
-
441
- ## Common Docker Issues
442
-
443
- - My NVIDIA GPU isn't being detected? -> [GPU ISSUES Wiki Page](https://github.com/DrewThomasson/ebook2audiobook/wiki/GPU-ISSUES)
444
-
445
- - `python: can't open file '/home/user/app/app.py': [Errno 2] No such file or directory` (Just remove all post arguments as I replaced the `CMD` with `ENTRYPOINT` in the [Dockerfile](Dockerfile))
446
- - Example: `docker run --pull always athomasson2/ebook2audiobook app.py --script_mode full_docker` - > corrected - > `docker run --pull always athomasson2/ebook2audiobook`
447
- - Arguments can be easily added like this now `docker run --pull always athomasson2/ebook2audiobook --share`
448
-
449
- - Docker gets stuck downloading Fine-Tuned models.
450
- (This does not happen for every computer but some appear to run into this issue)
451
- Disabling the progress bar appears to fix the issue,
452
- as discussed [here in #191](https://github.com/DrewThomasson/ebook2audiobook/issues/191)
453
- Example of adding this fix in the `docker run` command
454
- ```bash
455
- docker run --pull always --rm --gpus all -e HF_HUB_DISABLE_PROGRESS_BARS=1 -e HF_HUB_ENABLE_HF_TRANSFER=0 \
456
- -p 7860:7860 athomasson2/ebook2audiobook
457
- ```
458
-
459
-
460
- ## Fine Tuned TTS models
461
- #### Fine Tune your own XTTSv2 model
462
-
463
- [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/xtts-finetune-webui-gpu) [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=flat&logo=kaggle&logoColor=white)](https://github.com/DrewThomasson/ebook2audiobook/blob/v25/Notebooks/finetune/xtts/kaggle-xtts-finetune-webui-gradio-gui.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrewThomasson/ebook2audiobook/blob/v25/Notebooks/finetune/xtts/colab_xtts_finetune_webui.ipynb)
464
-
465
-
466
-
467
-
468
-
469
- #### De-noise training data
470
-
471
- [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/DeepFilterNet2_no_limit) [![GitHub Repo](https://img.shields.io/badge/DeepFilterNet-181717?logo=github)](https://github.com/Rikorose/DeepFilterNet)
472
-
473
-
474
- ### Fine Tuned TTS Collection
475
-
476
- [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Models-yellow?style=flat&logo=huggingface)](https://huggingface.co/drewThomasson/fineTunedTTSModels/tree/main)
477
-
478
- For an XTTSv2 custom model a ref audio clip of the voice reference is mandatory:
479
-
480
-
481
- ## Supported eBook Formats
482
- - `.epub`, `.pdf`, `.mobi`, `.txt`, `.html`, `.rtf`, `.chm`, `.lit`,
483
- `.pdb`, `.fb2`, `.odt`, `.cbr`, `.cbz`, `.prc`, `.lrf`, `.pml`,
484
- `.snb`, `.cbc`, `.rb`, `.tcr`
485
- - **Best results**: `.epub` or `.mobi` for automatic chapter detection
486
-
487
-
488
- ## Output Formats
489
- - Creates a file in one of the formats `['m4b', 'm4a', 'mp4', 'webm', 'mov', 'mp3', 'flac', 'wav', 'ogg', 'aac']` (set in ./lib/conf.py), with metadata and chapters.
490
-
491
- ## Updating to Latest Version
492
- ```bash
493
- git pull # Locally/Compose
494
-
495
- docker pull athomasson2/ebook2audiobook:latest # For pre-built docker images
496
- ```
497
-
498
- ## Reverting to older Versions
499
- Releases can be found -> [here](https://github.com/DrewThomasson/ebook2audiobook/releases)
500
- ```bash
501
- git checkout tags/VERSION_NUM # Locally/Compose -> Example: git checkout tags/v25.7.7
502
-
503
- docker pull athomasson2/ebook2audiobook:VERSION_NUM # For pre-built docker images -> Example: docker pull athomasson2/ebook2audiobook:v25.7.7
504
- ```
505
-
506
- ## Common Issues:
507
- - My NVIDIA GPU isn't being detected? -> [GPU ISSUES Wiki Page](https://github.com/DrewThomasson/ebook2audiobook/wiki/GPU-ISSUES)
508
- - CPU is slow (better on server smp CPU) while NVIDIA GPU can have almost real time conversion.
509
- [Discussion about this](https://github.com/DrewThomasson/ebook2audiobook/discussions/19#discussioncomment-10879846)
510
- For faster multilingual generation I would suggest my other
511
- [project that uses piper-tts](https://github.com/DrewThomasson/ebook2audiobookpiper-tts) instead
512
- (It doesn't have zero-shot voice cloning though, and is Siri quality voices, but it is much faster on cpu).
513
- - "I'm having dependency issues" - Just use Docker; it's fully self-contained and has a headless mode,
514
- add `--help` parameter at the end of the docker run command for more information.
515
- - "I'm getting a truncated audio issue!" - PLEASE MAKE AN ISSUE OF THIS,
516
- we don't speak every language and need advice from users to fine-tune the sentence splitting logic.😊
517
-
518
-
519
- ## What we need help with! 🙌
520
- ## [Full list of things can be found here](https://github.com/DrewThomasson/ebook2audiobook/issues/32)
521
- - Any help from people speaking any of the supported languages to help us improve the models
522
-
523
- ## Do you need to rent a GPU to boost service from us?
524
- - A poll is open here https://github.com/DrewThomasson/ebook2audiobook/discussions/889
525
-
526
- ## Special Thanks
527
- - **Coqui TTS**: [Coqui TTS GitHub](https://github.com/idiap/coqui-ai-TTS)
528
- - **Calibre**: [Calibre Website](https://calibre-ebook.com)
529
- - **FFmpeg**: [FFmpeg Website](https://ffmpeg.org)
530
- - [@shakenbake15 for better chapter saving method](https://github.com/DrewThomasson/ebook2audiobook/issues/8)
 
1
+ ---
2
+ title: ebook2audiobook (Docker)
3
+ emoji: 🎧
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: docker
7
+ pinned: false
8
+ ---
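
The eight added lines are the YAML front matter that Hugging Face Spaces reads from `README.md`; `sdk: docker` marks the Space as Docker-based. As a rough way to check what the new file declares, here is a minimal sketch that extracts those fields. It assumes only flat `key: value` pairs, as in this commit (anything nested would need a real YAML parser), and `parse_front_matter` and `NEW_README` are hypothetical names used for illustration, not part of the repository.

```python
def parse_front_matter(readme_text: str) -> dict[str, str]:
    """Return the flat key/value pairs between the leading '---' fences.

    Assumption: the front matter holds only simple `key: value` lines.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # file does not start with a front matter fence
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing fence reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat as no front matter


# README.md content after this commit (diff '+' markers removed).
NEW_README = """\
---
title: ebook2audiobook (Docker)
emoji: 🎧
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---
"""

metadata = parse_front_matter(NEW_README)
print(metadata["sdk"], "|", metadata["title"])
```

Values come back as plain strings, so `pinned` is the text `false` rather than a boolean; that is a limitation of this sketch, not of the Spaces configuration format.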