Update templates/index.html
templates/index.html +66 -6
templates/index.html
CHANGED
@@ -57,16 +57,76 @@
|
|
57 |
<div class="container py-5">
|
58 |
<h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
|
59 |
|
60 |
-
<div id="evaluation-info" class="mb-
|
61 |
-
<p>
|
62 |
<strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
|
63 |
<br><br>
|
64 |
-
In this evaluation, you will assess the performance of
|
65 |
<strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
|
66 |
-
<strong>Mini-Omni</strong>.
|
|
|
67 |
<br><br>
|
68 |
-
|
69 |
-
|
70 |
</p>
|
71 |
</div>
|
72 |
|
|
|
57 |
<div class="container py-5">
|
58 |
<h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
|
59 |
|
60 |
+
<div id="evaluation-info" class="mb-5">
|
61 |
+
<p class="text-start">
|
62 |
<strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
|
63 |
<br><br>
|
64 |
+
In this evaluation, you will assess the performance of 4 S2S models:
|
65 |
<strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
|
66 |
+
<strong>Mini-Omni</strong>.
|
67 |
+
The goal is to evaluate how well these models handle various speech tasks across different domains.
|
68 |
<br><br>
|
69 |
+
Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm Control</em>),
|
70 |
+
you will proceed to the evaluation stage. In each round, you will be presented with an audio input.
|
71 |
+
For example:
|
72 |
+
<br><br>
|
73 |
+
|
74 |
+
<!-- Left-aligned Audio Sample and Audio Control -->
|
75 |
+
<span style="vertical-align: middle; line-height: 1.2; display: inline-block;"><strong>Audio Sample:</strong></span>
|
76 |
+
<audio controls style="vertical-align: middle;">
|
77 |
+
<source src="/static/audio/sample/input_audio.wav" type="audio/wav">
|
78 |
+
</audio>
|
79 |
+
|
80 |
+
<br><br>
|
81 |
+
The corresponding text is:
|
82 |
+
<em>"Say the following sentence at my speed first, then say it again very slowly:
|
83 |
+
'Artificial intelligence is changing the world in many ways.'"</em>
|
84 |
+
<small>(Note: the audio plays at 1.5x the normal speed.)</small>
|
85 |
+
<br><br>
|
86 |
+
The responses of different S2S models will be provided, and your task is to choose which response best follows
|
87 |
+
the instructions. For example <small>(Note: during the evaluation, you will be given responses from only the two models whose comparison is most informative.)</small>:
|
88 |
+
<br><br>
|
89 |
+
|
90 |
+
<!-- ChatGPT-4o Output -->
|
91 |
+
<span><strong>ChatGPT-4o:</strong></span>
|
92 |
+
<audio controls style="vertical-align: middle;">
|
93 |
+
<source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
|
94 |
+
</audio>
|
95 |
+
<p class="text-start" style="margin-left: 20px;">
|
96 |
+
<strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
|
97 |
+
</p>
|
98 |
+
|
99 |
+
<!-- FunAudioLLM Output -->
|
100 |
+
<span><strong>FunAudioLLM:</strong></span>
|
101 |
+
<audio controls style="vertical-align: middle;">
|
102 |
+
<source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
|
103 |
+
</audio>
|
104 |
+
<p class="text-start" style="margin-left: 20px;">
|
105 |
+
<strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
|
106 |
+
</p>
|
107 |
+
|
108 |
+
<!-- SpeechGPT Output -->
|
109 |
+
<span><strong>SpeechGPT:</strong></span>
|
110 |
+
<audio controls style="vertical-align: middle;">
|
111 |
+
<source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
|
112 |
+
</audio>
|
113 |
+
<p class="text-start" style="margin-left: 20px;">
|
114 |
+
<strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Partially followed the instruction, with minor semantic deviation and missing information.
|
115 |
+
</p>
|
116 |
+
|
117 |
+
<!-- Mini-Omni Output -->
|
118 |
+
<span><strong>Mini-Omni:</strong></span>
|
119 |
+
<audio controls style="vertical-align: middle;">
|
120 |
+
<source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
|
121 |
+
</audio>
|
122 |
+
<p class="text-start" style="margin-left: 20px;">
|
123 |
+
<strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Did not follow the instruction, with significant semantic deviation and missing information.
|
124 |
+
</p>
|
125 |
+
|
126 |
+
<p class="text-start">
|
127 |
+
After making your choice, you'll proceed to the next round.
|
128 |
+
</p>
|
129 |
+
<strong>Please enter your username and start the evaluation!</strong>
|
130 |
</p>
|
131 |
</div>
|
132 |
|
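The four sample blocks added in this commit (ChatGPT-4o, FunAudioLLM, SpeechGPT, Mini-Omni) repeat the same `<span>` plus `<audio>` markup, with only the label and file name changing. As a minimal sketch, and not part of the commit itself, the same markup could be generated from a list. The helper names `escapeHtml`, `renderSample`, and `renderAllSamples` and the `SAMPLE_MODELS` array are hypothetical. The file names and the `/static/audio/sample/` path are copied from the diff above.

```javascript
// Hypothetical helper, not part of the commit: builds the repeated
// "label + audio player" markup for each sample model.
const SAMPLE_MODELS = [
  { name: "ChatGPT-4o", file: "4o_audio.wav" },
  { name: "FunAudioLLM", file: "FunAudio_audio.wav" },
  { name: "SpeechGPT", file: "SpeechGPT.wav" },
  { name: "Mini-Omni", file: "mini-omni.wav" },
];

// Escape characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one model's label and audio control, mirroring the diff's markup.
function renderSample(model, basePath = "/static/audio/sample/") {
  const src = basePath + encodeURIComponent(model.file);
  return [
    `<span><strong>${escapeHtml(model.name)}:</strong></span>`,
    `<audio controls style="vertical-align: middle;">`,
    `  <source src="${src}" type="audio/wav">`,
    `</audio>`,
  ].join("\n");
}

// Render every sample block in order.
function renderAllSamples(models = SAMPLE_MODELS) {
  return models.map((m) => renderSample(m)).join("\n");
}
```

Calling `renderAllSamples()` returns one string containing the four players. The per-model "Performance" paragraphs are left out of this sketch, since their text differs for each model.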