KurtDu committed on
Commit
373bffb
·
verified ·
1 Parent(s): 09c824e

Update templates/index.html

Files changed (1)
  1. templates/index.html +66 -6
templates/index.html CHANGED
@@ -57,16 +57,76 @@
  <div class="container py-5">
  <h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
 
- <div id="evaluation-info" class="mb-4">
- <p>
  <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
  <br><br>
- In this evaluation, you will assess the performance of various S2S models, such as
  <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
- <strong>Mini-Omni</strong>. The goal is to evaluate how well these models handle various speech tasks across different domains.
  <br><br>
- You will listen to audio inputs and evaluate the models' outputs based on their ability to follow instructions.
- Get ready to explore the cutting-edge of speech technology!
  </p>
  </div>
 
  <div class="container py-5">
  <h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
 
+ <div id="evaluation-info" class="mb-5">
+ <p class="text-start">
  <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
  <br><br>
+ In this evaluation, you will assess the performance of 4 S2S models:
  <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
+ <strong>Mini-Omni</strong>.
+ The goal is to evaluate how well these models handle various speech tasks across different domains.
  <br><br>
+ Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm Control</em>),
+ you will proceed to the evaluation stage. In each round, you will be presented with an audio input.
+ For example:
+ <br><br>
+
+ <!-- Left-aligned Audio Sample and Audio Control -->
+ <span style="vertical-align: middle; line-height: 1.2; display: inline-block;"><strong>Audio Sample:</strong></span>
+ <audio controls style="vertical-align: middle;">
+ <source src="/static/audio/sample/input_audio.wav" type="audio/wav">
+ </audio>
+
+ <br><br>
+ The corresponding text is:
+ <em>"Say the following sentence at my speed first, then say it again very slowly:
+ 'Artificial intelligence is changing the world in many ways.'" </em>
+ <small>(Note: the audio plays at 1.5x the normal speed.)</small>
+ <br><br>
+ The responses of different S2S models will be provided, and your task is to choose which response best follows
+ the instructions. For example<small>(Note: During the evaluation process, you will be provided with responses from only the two models that have the most comparative significance.)</small>:
+ <br><br>
+
+ <!-- ChatGPT-4o Output -->
+ <span><strong>ChatGPT-4o:</strong></span>
+ <audio controls style="vertical-align: middle;">
+ <source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
+ </audio>
+ <p class="text-start" style="margin-left: 20px;">
+ <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
+ </p>
+
+ <!-- FunAudioLLM Output -->
+ <span><strong>FunAudioLLM:</strong></span>
+ <audio controls style="vertical-align: middle;">
+ <source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
+ </audio>
+ <p class="text-start" style="margin-left: 20px;">
+ <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
+ </p>
+
+ <!-- SpeechGPT Output -->
+ <span><strong>SpeechGPT:</strong></span>
+ <audio controls style="vertical-align: middle;">
+ <source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
+ </audio>
+ <p class="text-start" style="margin-left: 20px;">
+ <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Partially followed the instruction, with minor semantic deviation and missing information.
+ </p>
+
+ <!-- Mini-Omni Output -->
+ <span><strong>Mini-Omni:</strong></span>
+ <audio controls style="vertical-align: middle;">
+ <source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
+ </audio>
+ <p class="text-start" style="margin-left: 20px;">
+ <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Did not follow the instruction, with significant semantic deviation and missing information.
+ </p>
+
+ <p class="text-start">
+ After making your choice, you'll proceed to the next round.
+ </p>
+ <strong>Please enter your username and start the evaluation!</strong>
  </p>
  </div>