Update README.md
README.md
CHANGED
@@ -21,12 +21,12 @@ tags:
 <p>
 
 <p align="center">
-Kimi-Audio-7B
+Kimi-Audio-7B-Instruct <a href="https://huggingface.co/moonshotai/Kimi-Audio-7B-Instruct">🤗</a> | 📑 <a href="https://raw.githubusercontent.com/MoonshotAI/Kimi-Audio/main/assets/kimia_report.pdf">Paper</a>
 </p>
 
 ## Introduction
 
-We present Kimi-Audio, an open-source audio foundation model excelling in **audio understanding, generation, and conversation**. This repository hosts the model checkpoints for Kimi-Audio-7B
+We present Kimi-Audio, an open-source audio foundation model excelling in **audio understanding, generation, and conversation**. This repository hosts the model checkpoints for Kimi-Audio-7B-Instruct.
 
 Kimi-Audio is designed as a universal audio foundation model capable of handling a wide variety of audio processing tasks within a single unified framework. Key features include:
 
@@ -41,14 +41,24 @@ For more details, please refer to our [GitHub Repository](https://github.com/Moo
 
 ## Requirements
 
-
-
+We recommend building a Docker image to run inference. After cloning the inference code, build the image with the `docker build` command.
 ```bash
 git clone https://github.com/MoonshotAI/Kimi-Audio
 cd Kimi-Audio
-
+docker build -t kimi-audio:v0.1 .
+```
+Alternatively, you can use our pre-built image:
+```bash
+docker pull moonshotai/kimi-audio:v0.1
 ```
 
+Or you can install the requirements directly:
+```bash
+pip install -r requirements.txt
+```
+
+Refer to the Dockerfile if you run into environment issues.
+
 ## Quickstart
 
 This example demonstrates basic usage for generating text from audio (ASR) and generating both text and speech in a conversational turn using the `Kimi-Audio-7B-Instruct` model.
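
The Requirements change above adds a `pip install -r requirements.txt` path next to the Docker one and points to the Dockerfile for environment issues. As a first check before digging into the Dockerfile, the sketch below lists packages named in `requirements.txt` that are not installed in the current Python environment. It is not part of the Kimi-Audio repository: the helper names `parse_requirement_names` and `find_missing` are hypothetical, and it assumes the file uses ordinary pip requirement lines. Direct URL and VCS lines are skipped rather than resolved, and version constraints are not verified.

```python
import re
from importlib import metadata
from pathlib import Path

# Matches the distribution name at the start of a requirement line,
# e.g. "torch" in "torch>=2.1" or "librosa" in "librosa[display]==0.10".
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")
# Matches lines that begin with a URL scheme, e.g. "git+https://...".
_URL_RE = re.compile(r"^[A-Za-z0-9+.-]+://")


def parse_requirement_names(text):
    """Return the distribution names found in requirements-style text."""
    names = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # blank lines and pip options such as -r or --index-url
        if _URL_RE.match(line):
            continue  # bare URL / VCS requirement: no name to look up
        match = _NAME_RE.match(line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: the script is run from the root of the cloned checkout.
    req_file = Path("requirements.txt")
    if req_file.exists():
        missing = find_missing(parse_requirement_names(req_file.read_text()))
        print("missing:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found; run this from the Kimi-Audio checkout")
```

An empty "missing" list only means every named package is present; it does not confirm that the installed versions satisfy the pinned constraints or that system libraries are available.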