Spaces: ortal1602/ARvsFM

ortal1602 committed on Jun 11

Commit 1959be2 (verified) · 1 parent: 0352417

Update index.html

Files changed (1)

index.html +62 -0

@@ -29,6 +29,68 @@
 </head>
 <body>
+<!-- Hero Section -->
+<div class="container text-center">
+  <img src="figures/ARvsFM.png" alt="AR vs FM" style="width: 100%; border-radius: 20px; box-shadow: 0 4px 16px rgba(0,0,0,0.2); margin-bottom: 20px;">
+  <h1>AR vs FM: A Comparative Study on Audio Modeling Paradigms</h1>
+  <p>
+    <a href="https://scholar.google.com/citations?user=OrTalScholarID" target="_blank">Or Tal</a> ·
+    <a href="https://scholar.google.com/citations?user=FelixKreukID" target="_blank">Felix Kreuk</a> ·
+    <a href="https://scholar.google.com/citations?user=YossiAdiID" target="_blank">Yossi Adi</a>
+  </p>
+</div>
+<!-- Abstract Section -->
+<div class="container">
+  <h2>Abstract</h2>
+  <p>
+    We compare two major paradigms for text-to-music generation, Auto-Regressive (AR) and Flow-Matching (FM), under tightly controlled settings. All models are trained from scratch on the same dataset, with the same representations and backbone architecture. We evaluate fidelity, adherence to control signals, inpainting, inference efficiency, and robustness to training scale. Our results reveal clear trade-offs: AR achieves better fidelity and control accuracy, while FM enables faster inference and smoother inpainting. These findings can inform the choice of paradigm in future music generation research and development.
+  </p>
+</div>
+<!-- Interactive Highlight Slider -->
+<div class="container">
+  <h2>Paper Highlights</h2>
+  <div id="highlight-box" style="text-align: center; padding: 30px; border: 1px solid #ddd; border-radius: 10px; background: #fafafa;">
+    <p id="highlight-text" style="font-size: 1.2rem; font-style: italic;"></p>
+  </div>
+  <div class="text-center mt-3">
+    <button onclick="prevHighlight()" class="btn btn-outline-primary">← Prev</button>
+    <button onclick="nextHighlight()" class="btn btn-outline-primary">Next →</button>
+  </div>
+</div>
+<script>
+  const highlights = [
+    "🎯 AR achieves better text-to-music fidelity and is more robust to frame-rate changes.",
+    "🧠 AR follows temporally aligned control signals (chords, melody, drums) more accurately.",
+    "🪄 FM (supervised) produces the smoothest transitions in inpainting; AR has the lowest FAD but audible seams.",
+    "🚀 FM can be faster at inference, but only by taking fewer steps, which costs quality. AR scales better with batch size.",
+    "🧪 FM achieves near-topline performance with smaller batches; AR improves steadily with more training steps.",
+    "🎧 Both AR and FM lose fidelity when conditioned on strict temporal controls, highlighting a trade-off between control and quality."
+  ];
+  let highlightIndex = 0;
+  const highlightText = document.getElementById('highlight-text');
+  function showHighlight(index) {
+    highlightText.textContent = highlights[index];
+  }
+  function prevHighlight() {
+    highlightIndex = (highlightIndex - 1 + highlights.length) % highlights.length;
+    showHighlight(highlightIndex);
+  }
+  function nextHighlight() {
+    highlightIndex = (highlightIndex + 1) % highlights.length;
+    showHighlight(highlightIndex);
+  }
+  // Show the first highlight on page load
+  showHighlight(highlightIndex);
+</script>
 <!-- Section 1 -->
 <div class="container">
   <h1>Fixed Training Configuration</h1>