KurtDu committed on
Commit 62ae11f · verified · 1 Parent(s): 065f64f

Update templates/index.html

Files changed (1)
  1. templates/index.html +61 -42
templates/index.html CHANGED
@@ -6,132 +6,151 @@
6
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
7
  <title>Speech-to-Speech Model Comparison</title>
8
  <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
 
9
  <style>
10
  body {
11
- background-color: #f4f6f9;
12
  font-family: 'Arial', sans-serif;
13
  }
14
-
15
  .container {
16
- background-color: white;
17
- border-radius: 10px;
18
- box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
19
- padding: 30px;
 
 
20
  }
21
-
22
  h3 {
23
- font-size: 1.5rem;
24
  font-weight: bold;
25
  color: #333;
26
  text-align: center;
27
  margin-bottom: 20px;
28
  }
29
-
30
  p {
31
  color: #555;
32
  font-size: 1rem;
33
- line-height: 1.6;
34
  }
35
-
36
  .btn {
37
  border-radius: 25px;
38
- font-size: 1rem;
39
- padding: 12px 20px;
40
  font-weight: bold;
41
  transition: background-color 0.3s ease, transform 0.2s ease;
42
  }
43
-
44
  .btn-primary {
45
  background-color: #007bff;
46
  border: none;
47
  }
48
-
49
  .btn-primary:hover {
50
  background-color: #0056b3;
51
  transform: scale(1.05);
52
  }
53
  </style>
54
  </head>
55
 
56
  <body>
57
  <div class="container py-5">
58
- <h3>Speech-to-Speech Model Comparison</h3>
59
 
60
  <div id="evaluation-info" class="mb-5">
61
  <p class="text-start">
62
- <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
 
63
  <br><br>
64
  In this evaluation, you will assess the performance of 4 S2S models:
65
- <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
66
- <strong>Mini-Omni</strong>.
67
  The goal is to evaluate how well these models handle various speech tasks across different domains.
68
  <br><br>
 
69
  Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm Control</em>),
70
- you will proceed to the evaluation stage. In each round, you will be presented with an audio input.
71
  For example:
72
  <br><br>
73
-
74
- <!-- Left-aligned Audio Sample and Audio Control -->
75
- <span style="vertical-align: middle; line-height: 1.2; display: inline-block;"><strong>Audio Sample:</strong></span>
76
- <audio controls style="vertical-align: middle;">
77
  <source src="/static/audio/sample/input_audio.wav" type="audio/wav">
78
  </audio>
79
-
80
  <br><br>
81
  The corresponding text is:
82
  <em>"Say the following sentence at my speed first, then say it again very slowly:
83
- 'Artificial intelligence is changing the world in many ways.'" </em>
84
  <small>(Note: the audio plays at 1.5x the normal speed.)</small>
85
  <br><br>
 
 
86
  The responses of different S2S models will be provided, and your task is to choose which response best follows
87
- the instructions. For example<small>(Note: During the evaluation process, you will be provided with responses from only the two models that have the most comparative significance.)</small>:
88
  <br><br>
89
-
90
  <!-- ChatGPT-4o Output -->
91
  <span><strong>ChatGPT-4o:</strong></span>
92
- <audio controls style="vertical-align: middle;">
93
  <source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
94
  </audio>
95
  <p class="text-start" style="margin-left: 20px;">
96
- <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
97
  </p>
98
-
99
  <!-- FunAudioLLM Output -->
100
  <span><strong>FunAudioLLM:</strong></span>
101
- <audio controls style="vertical-align: middle;">
102
  <source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
103
  </audio>
104
  <p class="text-start" style="margin-left: 20px;">
105
- <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
106
  </p>
107
-
108
  <!-- SpeechGPT Output -->
109
  <span><strong>SpeechGPT:</strong></span>
110
- <audio controls style="vertical-align: middle;">
111
  <source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
112
  </audio>
113
  <p class="text-start" style="margin-left: 20px;">
114
- <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Partially followed the instruction, with minor semantic deviation and missing information.
115
  </p>
116
-
117
  <!-- Mini-Omni Output -->
118
  <span><strong>Mini-Omni:</strong></span>
119
- <audio controls style="vertical-align: middle;">
120
  <source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
121
  </audio>
122
  <p class="text-start" style="margin-left: 20px;">
123
- <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Did not follow the instruction, with significant semantic deviation and missing information.
124
  </p>
125
 
126
  <p class="text-start">
127
- After making your choice, you'll proceed to the next round.
128
  </p>
129
- <strong>Click the button below to start the evaluation!</strong>
130
  </p>
131
  </div>
132
 
133
  <div class="text-center">
134
- <a href="http://71.132.14.167:6002/" target="_blank" class="btn btn-primary">Start Evaluation</a>
135
  </div>
136
  </div>
137
  </body>
 
6
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
7
  <title>Speech-to-Speech Model Comparison</title>
8
  <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
9
+ <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
10
  <style>
11
  body {
12
+ background-color: #f0f8ff;
13
  font-family: 'Arial', sans-serif;
14
  }
 
15
  .container {
16
+ background-color: #fff;
17
+ border-radius: 15px;
18
+ box-shadow: 0 6px 15px rgba(0, 0, 0, 0.15);
19
+ padding: 40px;
20
+ max-width: 800px;
21
+ margin: 30px auto;
22
  }
 
23
  h3 {
24
+ font-size: 2rem;
25
  font-weight: bold;
26
  color: #333;
27
  text-align: center;
28
  margin-bottom: 20px;
29
  }
 
30
  p {
31
  color: #555;
32
  font-size: 1rem;
33
+ line-height: 1.8;
34
  }
 
35
  .btn {
36
  border-radius: 25px;
37
+ font-size: 1.1rem;
38
+ padding: 12px 25px;
39
  font-weight: bold;
40
  transition: background-color 0.3s ease, transform 0.2s ease;
41
  }
 
42
  .btn-primary {
43
  background-color: #007bff;
44
  border: none;
45
  }
 
46
  .btn-primary:hover {
47
  background-color: #0056b3;
48
  transform: scale(1.05);
49
  }
50
+ .icon {
51
+ color: #f39c12;
52
+ margin-right: 5px;
53
+ }
54
+ .section-title {
55
+ font-size: 1.2rem;
56
+ font-weight: bold;
57
+ color: #007bff;
58
+ display: flex;
59
+ align-items: center;
60
+ margin-top: 20px;
61
+ }
62
+ .section-title .fas {
63
+ margin-right: 10px;
64
+ }
65
+ audio {
66
+ margin-top: 10px;
67
+ margin-bottom: 15px;
68
+ }
69
  </style>
70
  </head>
71
 
72
  <body>
73
  <div class="container py-5">
74
+ <h3><i class="fas fa-microphone-alt icon"></i>Speech-to-Speech Model Comparison</h3>
75
 
76
  <div id="evaluation-info" class="mb-5">
77
  <p class="text-start">
78
+ <span class="section-title"><i class="fas fa-info-circle"></i>Welcome!</span>
79
+ <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation! 🎤</strong>
80
  <br><br>
81
  In this evaluation, you will assess the performance of 4 S2S models:
82
+ <strong>ChatGPT-4o</strong> 🤖, <strong>FunAudioLLM</strong> 🎧, <strong>SpeechGPT</strong> 🗣️, and
83
+ <strong>Mini-Omni</strong> 🌟.
84
  The goal is to evaluate how well these models handle various speech tasks across different domains.
85
  <br><br>
86
+ <span class="section-title"><i class="fas fa-tasks"></i>How It Works</span>
87
  Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm Control</em>),
88
+ you will proceed to the evaluation stage. In each round, you will be presented with an audio input. 🎵
89
  For example:
90
  <br><br>
91
+
92
+ <strong>Audio Sample:</strong>
93
+ <audio controls>
 
94
  <source src="/static/audio/sample/input_audio.wav" type="audio/wav">
95
  </audio>
96
+
97
  <br><br>
98
  The corresponding text is:
99
  <em>"Say the following sentence at my speed first, then say it again very slowly:
100
+ 'Artificial intelligence is changing the world in many ways.'" </em> 🧠
101
  <small>(Note: the audio plays at 1.5x the normal speed.)</small>
102
  <br><br>
103
+
104
+ <span class="section-title"><i class="fas fa-star"></i>Model Responses</span>
105
  The responses of different S2S models will be provided, and your task is to choose which response best follows
106
+ the instructions. For example:
107
  <br><br>
108
+
109
  <!-- ChatGPT-4o Output -->
110
  <span><strong>ChatGPT-4o:</strong></span>
111
+ <audio controls>
112
  <source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
113
  </audio>
114
  <p class="text-start" style="margin-left: 20px;">
115
+ <strong>Performance:</strong> 🎙️ Speech: Partially followed the instruction on speed. 🧾 Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
116
  </p>
117
+
118
  <!-- FunAudioLLM Output -->
119
  <span><strong>FunAudioLLM:</strong></span>
120
+ <audio controls>
121
  <source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
122
  </audio>
123
  <p class="text-start" style="margin-left: 20px;">
124
+ <strong>Performance:</strong> 🎙️ Speech: Partially followed the instruction on speed. 🧾 Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
125
  </p>
126
+
127
  <!-- SpeechGPT Output -->
128
  <span><strong>SpeechGPT:</strong></span>
129
+ <audio controls>
130
  <source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
131
  </audio>
132
  <p class="text-start" style="margin-left: 20px;">
133
+ <strong>Performance:</strong> 🎙️ Speech: Did not follow the instruction on speed. 🧾 Semantics: Partially followed the instruction, with minor semantic deviation and missing information.
134
  </p>
135
+
136
  <!-- Mini-Omni Output -->
137
  <span><strong>Mini-Omni:</strong></span>
138
+ <audio controls>
139
  <source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
140
  </audio>
141
  <p class="text-start" style="margin-left: 20px;">
142
+ <strong>Performance:</strong> 🎙️ Speech: Did not follow the instruction on speed. 🧾 Semantics: Did not follow the instruction, with significant semantic deviation and missing information.
143
  </p>
144
 
145
  <p class="text-start">
146
+ After making your choice, you'll proceed to the next round. 🔄
147
  </p>
148
+ <strong>Click the button below to start the evaluation! 🚀</strong>
149
  </p>
150
  </div>
151
 
152
  <div class="text-center">
153
+ <a href="http://71.132.14.167:6002/" target="_blank" class="btn btn-primary"><i class="fas fa-play"></i> Start Evaluation</a>
154
  </div>
155
  </div>
156
  </body>
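
Review note: in the new version, the four model blocks (ChatGPT-4o, FunAudioLLM, SpeechGPT, Mini-Omni) repeat the same label and audio markup and differ only in the model name and the audio file. Below is a minimal sketch of generating that markup from a list. It assumes the app is served from Python, which the templates/ and /static/ paths suggest but this commit does not confirm. `SAMPLES` and `render_model_block` are hypothetical names, not part of the repository.

```python
from html import escape

# Model names and audio file names copied from the new side of the diff.
SAMPLES = [
    ("ChatGPT-4o", "4o_audio.wav"),
    ("FunAudioLLM", "FunAudio_audio.wav"),
    ("SpeechGPT", "SpeechGPT.wav"),
    ("Mini-Omni", "mini-omni.wav"),
]


def render_model_block(name, filename):
    """Hypothetical helper: build the label and audio player markup for one model."""
    src = "/static/audio/sample/" + filename
    return (
        f"<!-- {escape(name)} Output -->\n"
        f"<span><strong>{escape(name)}:</strong></span>\n"
        "<audio controls>\n"
        f'  <source src="{escape(src)}" type="audio/wav">\n'
        "</audio>"
    )


# One markup block per model, in the order used on the page.
blocks = [render_model_block(name, filename) for name, filename in SAMPLES]
```

Joining `blocks` with blank lines reproduces the label and player portion of the four sections. The per-model performance paragraphs are left out of the sketch and would need to be passed in the same way.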