# VietVoices RunPod Serverless Deployment

This folder contains the files needed to deploy VietVoices TTS on RunPod Serverless.

## Setup Instructions

### 1. Prerequisites

- RunPod account with API key
- Docker Hub account
- Hugging Face token (for model access)

### 2. Environment Variables

Set these environment variables:

```bash
export RUNPOD_API_KEY="your_runpod_api_key"
export HUGGINGFACEHUB_API_TOKEN="your_hf_token"
```
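
Before building or deploying, it can help to confirm that both variables are actually visible to the process. The following is a minimal sketch, not part of the repository; the helper name `missing_env_vars` is hypothetical.

```python
import os

# The two variable names listed in this README.
REQUIRED_VARS = ("RUNPOD_API_KEY", "HUGGINGFACEHUB_API_TOKEN")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example: report problems without aborting the current shell session.
for name in missing_env_vars():
    print(f"warning: {name} is not set")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.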

### 3. Build and Push Docker Image

```bash
# Make the script executable
chmod +x build_and_push.sh

# Set DOCKER_USERNAME in build_and_push.sh to your Docker Hub username,
# then run:
./build_and_push.sh
```

### 4. Deploy to RunPod

```bash
python deploy.py
```

## API Usage

### Non-Streaming Mode (Default)

**Request Format:**
```json
{
  "input": {
    "ref_audio": "https://s3.amazonaws.com/path/to/audio.wav",
    "gen_text": "Text to convert to speech",
    "speed": 1.0
  }
}
```
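
As an illustration, the payload above could be wrapped in an HTTP request as sketched below. The base URL `https://api.runpod.ai/v2`, the bearer-token header, and the helper name `build_tts_request` are assumptions rather than something this repository defines; verify the URL and your endpoint ID in the RunPod dashboard.

```python
import json
import urllib.request

# Assumed base URL for RunPod Serverless endpoints; verify before use.
RUNPOD_BASE_URL = "https://api.runpod.ai/v2"


def build_tts_request(endpoint_id, api_key, ref_audio, gen_text,
                      speed=1.0, route="runsync"):
    """Build (but do not send) a POST request carrying the documented payload."""
    payload = {
        "input": {
            "ref_audio": ref_audio,
            "gen_text": gen_text,
            "speed": speed,
        }
    }
    return urllib.request.Request(
        url=f"{RUNPOD_BASE_URL}/{endpoint_id}/{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request (commented out because it needs a live endpoint):
# request = build_tts_request("your_endpoint_id", "your_runpod_api_key",
#                             "https://s3.amazonaws.com/path/to/audio.wav",
#                             "Text to convert to speech")
# with urllib.request.urlopen(request, timeout=300) as resp:
#     result = json.loads(resp.read())
```

Separating request construction from sending keeps the payload easy to inspect and test offline.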

**Response Format:**
```json
{
  "audio_base64": "base64_encoded_output_audio",
  "sample_rate": 24000,
  "spectrogram_base64": "base64_encoded_spectrogram",
  "status": "success"
}
```
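
A response in this shape can be decoded with the standard library. This is a sketch under two assumptions: the dictionary passed in has exactly the fields shown above (RunPod may wrap the handler output in an outer envelope, which you would unwrap first), and the decoded bytes are a playable audio file. The helper names are hypothetical.

```python
import base64


def decode_audio(output):
    """Return (audio_bytes, sample_rate) from a successful response."""
    if output.get("status") != "success":
        raise ValueError(f"unexpected status: {output.get('status')!r}")
    audio_bytes = base64.b64decode(output["audio_base64"])
    return audio_bytes, output["sample_rate"]


def save_audio(output, path):
    """Write the decoded audio to `path` and return the byte count."""
    audio_bytes, _ = decode_audio(output)
    with open(path, "wb") as fh:
        fh.write(audio_bytes)
    return len(audio_bytes)
```

For example, `save_audio(result, "output.wav")` would write the generated speech to disk, assuming `result` holds the fields documented above.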

### Streaming Mode

Set `"stream": true` in the input to receive audio chunks as they are generated:

**Request Format:**
```json
{
  "input": {
    "ref_audio": "https://s3.amazonaws.com/path/to/audio.wav",
    "gen_text": "Text to convert to speech",
    "speed": 1.0,
    "stream": true
  }
}
```

**Streaming Response (each chunk):**
```json
{
  "chunk_index": 0,
  "total_chunks": 5,
  "progress": 20.0,
  "audio_chunk_base64": "base64_wav_chunk",
  "sample_rate": 24000,
  "status": "processing",
  "text_batch": "Text portion for this chunk"
}
```
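
Chunks can be put in order and checked for gaps before playback. The sketch below assumes every chunk carries the fields shown above; the progress formula is inferred from the sample (`chunk_index` 0 of 5 gives 20.0) and is not confirmed by the handler source. Both helper names are hypothetical.

```python
import base64


def expected_progress(chunk):
    """Progress percentage implied by the chunk position (inferred formula)."""
    return 100.0 * (chunk["chunk_index"] + 1) / chunk["total_chunks"]


def ordered_audio_chunks(chunks):
    """Sort chunks by index, verify none are missing, and decode the audio."""
    ordered = sorted(chunks, key=lambda c: c["chunk_index"])
    if not ordered:
        return []
    total = ordered[0]["total_chunks"]
    indices = [c["chunk_index"] for c in ordered]
    if indices != list(range(total)):
        raise ValueError(f"expected chunks 0..{total - 1}, got {indices}")
    return [base64.b64decode(c["audio_chunk_base64"]) for c in ordered]
```

Going by the `base64_wav_chunk` placeholder, each chunk is probably a standalone WAV, so play or merge the decoded chunks with a WAV-aware tool rather than concatenating the raw bytes.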

**How to use streaming:**
1. Submit the job to the `/run` endpoint (NOT `/runsync`) with `"stream": true`
2. Read the `job_id` from the response
3. Poll the `/stream/{job_id}` endpoint every 1-2 seconds
4. Process the chunks in the `stream_data["stream"]` array of each poll response
5. Stop polling when `status == "COMPLETED"`

**Important:** Streaming requires the async `/run` endpoint. The `/runsync` endpoint does not support streaming.
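
The polling steps above can be sketched as a small loop. The HTTP call is injected as `fetch` (a hypothetical callable that returns the parsed JSON from `/stream/{job_id}`), so the sketch makes no claim about the exact client code. Treating `FAILED` as a terminal status and accepting chunks wrapped in an `output` key are assumptions to check against real responses.

```python
import time


def collect_stream(fetch, job_id, interval=1.5, max_polls=200, sleep=time.sleep):
    """Poll until the job completes and return all chunks seen, in arrival order."""
    chunks = []
    for _ in range(max_polls):
        stream_data = fetch(job_id)
        for item in stream_data.get("stream", []):
            # Assumption: an entry may wrap the chunk as {"output": {...}}.
            chunks.append(item.get("output", item))
        status = stream_data.get("status")
        if status == "COMPLETED":
            return chunks
        if status == "FAILED":  # assumed terminal status
            raise RuntimeError(f"job {job_id} failed")
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")
```

The `max_polls` cap keeps a stuck job from polling forever; with the default interval it gives up after roughly five minutes.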

### Error Response

```json
{
  "error": "Error message"
}
```
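
Client code can turn this shape into an exception so that failures are not mistaken for audio. This is a minimal sketch; `TTSError` and `raise_for_error` are hypothetical names, and it assumes an error response is a JSON object with an `error` key as shown above.

```python
class TTSError(RuntimeError):
    """Raised when the handler returns an error response."""


def raise_for_error(output):
    """Return `output` unchanged, or raise TTSError if it reports an error."""
    if isinstance(output, dict) and "error" in output:
        raise TTSError(output["error"])
    return output
```

Calling `raise_for_error(result)` before decoding the audio keeps error handling in one place.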

## Cost Optimization

- The endpoint keeps no idle workers (`workers_min: 0`) to minimize costs, so the first request after an idle period incurs a cold start
- Scales up to 1 worker when requests come in
- Uses RTX 3090 GPUs for a good performance-to-cost ratio

## Monitoring

Check your RunPod dashboard for:

- Request logs
- Performance metrics
- Cost tracking
- Error monitoring