Update README.md
README.md
CHANGED
@@ -72,15 +72,14 @@ Extensive evaluations confirm AF3’s effectiveness, setting new benchmarks on o
 
 **This model is for non-commercial research purposes only.**
 
+## Results:
+<center><img src="static/af3_radial-1.png" width="400"></center>
+
 ## Model Architecture:
 Audio Flamingo 3 uses AF-Whisper unified audio encoder, MLP-based audio adaptor, Decoder-only LLM backbone (Qwen2.5-7B), and Streaming TTS module (AF3-Chat). Audio Flamingo 3 can take up to 10 minutes of audio inputs.
 
-<center><img src="static/af3_radial-1.png" width="400"></center>
-
-## Results:
 <center><img src="static/af3_main_diagram-1.png" width="800"></center>
 
-
 ## License / Terms of Use
 The model is released under the [NVIDIA OneWay Noncommercial License](static/NVIDIA_OneWay_Noncommercial_License.docx). Portions of the dataset generation are also subject to the [Qwen Research License](https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE) and OpenAI’s [Terms of Use](https://openai.com/policies/terms-of-use).
 