ChuxiJ committed on
Commit
2d83c51
·
verified ·
1 Parent(s): f3a05ca

Update README.md

  1. README.md +159 -3
README.md CHANGED
@@ -1,3 +1,159 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ tags:
4
+ - music
5
+ - text2music
6
+ pipeline_tag: text-to-audio
7
+ language:
8
+ - en
9
+ - zh
10
+ - de
11
+ - fr
12
+ - es
13
+ - it
14
+ - pt
15
+ - pl
16
+ - tr
17
+ - ru
18
+ - cs
19
+ - nl
20
+ - ar
21
+ - ja
22
+ - hu
23
+ - ko
24
+ - hi
25
+ library_name: diffusers
26
+ ---
27
+
28
+ # 🎤 Chinese Rap LoRA for ACE-Step (Rap Machine)
29
+
30
+ This is a hybrid rap voice model. We meticulously curated Chinese rap/hip-hop datasets for training, with rigorous data cleaning and recaptioning. The results demonstrate:
31
+
32
+ - Improved Chinese pronunciation accuracy
33
+ - Enhanced stylistic adherence to hip-hop and electronic genres
34
+ - Greater diversity in hip-hop vocal expressions
35
+
36
+ ## Usage Guide
37
+
38
+ 1. Generate higher-quality Chinese songs
39
+ 2. Create superior hip-hop tracks
40
+ 3. Blend with other genres to:
41
+ - Produce music with better vocal quality and detail
42
+ - Add experimental flavors (e.g., underground, street culture)
43
+ 4. Fine-tune using these parameters:
44
+
45
+ **Vocal Controls**
46
+ **`vocal_timbre`**
47
+ - Examples: Bright, dark, warm, cold, breathy, nasal, gritty, smooth, husky, metallic, whispery, resonant, airy, smoky, sultry, light, clear, high-pitched, raspy, powerful, ethereal, flute-like, hollow, velvety, shrill, hoarse, mellow, thin, thick, reedy, silvery, twangy.
48
+ - Describes inherent vocal qualities.
49
+
50
+ **`techniques`** (List)
51
+ - Rap styles: `mumble rap`, `chopper rap`, `melodic rap`, `lyrical rap`, `trap flow`, `double-time rap`
52
+ - Vocal FX: `auto-tune`, `reverb`, `delay`, `distortion`
53
+ - Delivery: `whispered`, `shouted`, `spoken word`, `narration`, `singing`
54
+ - Other: `ad-libs`, `call-and-response`, `harmonized`
55
+
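As a minimal sketch of how the descriptors above might be combined, the snippet below joins genre tags, `vocal_timbre` values, and `techniques` values into one comma-separated tag string. It assumes the descriptors are supplied as plain text tags in the prompt; the helper `build_prompt` is hypothetical and is not part of the ACE-Step API.

```python
from typing import Iterable


def build_prompt(
    genre_tags: Iterable[str],
    vocal_timbre: Iterable[str],
    techniques: Iterable[str],
) -> str:
    """Join descriptors into one comma-separated tag string.

    Hypothetical helper: it only formats text. Duplicates are dropped
    (case-insensitively) and the original order is kept.
    """
    seen = set()
    parts = []
    for tag in [*genre_tags, *vocal_timbre, *techniques]:
        key = tag.strip().lower()
        if key and key not in seen:
            seen.add(key)
            parts.append(key)
    return ", ".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        genre_tags=["hip-hop", "trap"],
        vocal_timbre=["gritty", "husky"],
        techniques=["chopper rap", "ad-libs", "auto-tune"],
    )
    print(prompt)
```

Keeping the tags in a fixed order (genre, then timbre, then techniques) makes it easier to compare outputs when only one descriptor changes between runs.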
56
+ ## Community Note
57
+
58
+ While a Chinese rap LoRA might seem niche for non-Chinese communities, projects like this one consistently demonstrate that ACE-Step, as a music generation foundation model, holds boundless potential. It doesn't just improve pronunciation in one language; it also spawns new styles.
59
+
60
+ The universal human appreciation of music is a precious asset. Like abstract LEGO blocks, these elements will eventually combine in more organic ways. May our open-source contributions propel the evolution of musical history forward.
61
+
62
+ ---
63
+
64
+ # ACE-Step: A Step Towards Music Generation Foundation Model
65
+
66
+ ![ACE-Step Framework](https://github.com/ACE-Step/ACE-Step/raw/main/assets/ACE-Step_framework.png)
67
+
68
+ ## Model Description
69
+
70
+ ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holistic architectural design. It integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, achieving state-of-the-art performance in generation speed, musical coherence, and controllability.
71
+
72
+ **Key Features:**
73
+ - 15× faster than LLM-based baselines (20s for 4-minute music on A100)
74
+ - Superior musical coherence across melody, harmony, and rhythm
75
+ - Full-song generation with duration control and natural language descriptions as input
76
+
77
+ ## Uses
78
+
79
+ ### Direct Use
80
+ ACE-Step can be used for:
81
+ - Generating original music from text descriptions
82
+ - Music remixing and style transfer
83
+ - Editing song lyrics
84
+
85
+ ### Downstream Use
86
+ The model serves as a foundation for:
87
+ - Voice cloning applications
88
+ - Specialized music generation (rap, jazz, etc.)
89
+ - Music production tools
90
+ - Creative AI assistants
91
+
92
+ ### Out-of-Scope Use
93
+ The model should not be used for:
94
+ - Generating copyrighted content without permission
95
+ - Creating harmful or offensive content
96
+ - Misrepresenting AI-generated music as human-created
97
+
98
+ ## How to Get Started
99
+
100
+ See the GitHub repository: https://github.com/ace-step/ACE-Step
101
+
102
+ ## Hardware Performance
103
+
104
+ | Device | 27 Steps | 60 Steps |
105
+ |---------------|----------|----------|
106
+ | NVIDIA A100 | 27.27x | 12.27x |
107
+ | RTX 4090 | 34.48x | 15.63x |
108
+ | RTX 3090 | 12.76x | 6.48x |
109
+ | M2 Max | 2.27x | 1.03x |
110
+
111
+ *Values are RTF (Real-Time Factor); higher values indicate faster generation. For example, 27.27x means 1 minute of music takes about 2.2 seconds to generate (60 / 27.27).*
112
+
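The table can be turned into rough generation-time estimates. The sketch below assumes RTF means seconds of audio produced per second of compute, so the estimated time is the audio duration divided by the RTF. The function name `estimated_seconds` is illustrative only, and real timings will vary with hardware and settings.

```python
# RTF values copied from the Hardware Performance table,
# keyed by device and then by number of diffusion steps.
RTF = {
    "NVIDIA A100": {27: 27.27, 60: 12.27},
    "RTX 4090": {27: 34.48, 60: 15.63},
    "RTX 3090": {27: 12.76, 60: 6.48},
    "M2 Max": {27: 2.27, 60: 1.03},
}


def estimated_seconds(device: str, steps: int, audio_seconds: float) -> float:
    """Estimate generation time as audio duration divided by RTF.

    Assumes RTF = seconds of audio generated per second of compute.
    """
    return audio_seconds / RTF[device][steps]


if __name__ == "__main__":
    for device in RTF:
        t27 = estimated_seconds(device, 27, 60.0)
        t60 = estimated_seconds(device, 60, 60.0)
        print(f"{device}: {t27:.1f}s (27 steps), {t60:.1f}s (60 steps) per minute of audio")
```

Under this reading, the 60-step A100 figure gives roughly 20 seconds for a 4-minute track (240 / 12.27), which matches the figure quoted under Key Features.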
113
+
114
+ ## Limitations
115
+
116
+ - Performance varies by language (top 10 languages perform best)
117
+ - Longer generations (>5 minutes) may lose structural coherence
118
+ - Rare instruments may not render perfectly
119
+ - Output Inconsistency: Highly sensitive to random seeds and input duration, leading to varied "gacha-style" results.
120
+ - Style-specific Weaknesses: Underperforms on certain genres (e.g., Chinese rap/zh_rap), with limited style adherence and a musicality ceiling
121
+ - Continuity Artifacts: Unnatural transitions in repainting/extend operations
122
+ - Vocal Quality: Coarse vocal synthesis lacking nuance
123
+ - Control Granularity: Needs finer-grained musical parameter control
124
+
125
+ ## Ethical Considerations
126
+
127
+ Users should:
128
+ - Verify originality of generated works
129
+ - Disclose AI involvement
130
+ - Respect cultural elements and copyrights
131
+ - Avoid harmful content generation
132
+
133
+
134
+ ## Model Details
135
+
136
+ **Developed by:** ACE Studio and StepFun
137
+ **Model type:** Diffusion-based music generation with transformer conditioning
138
+ **License:** Apache 2.0
139
+ **Resources:**
140
+ - [Project Page](https://ace-step.github.io/)
141
+ - [Demo Space](https://huggingface.co/spaces/ACE-Step/ACE-Step)
142
+ - [GitHub Repository](https://github.com/ACE-Step/ACE-Step)
143
+
144
+
145
+ ## Citation
146
+
147
+ ```bibtex
148
+ @misc{gong2025acestep,
149
+ title={ACE-Step: A Step Towards Music Generation Foundation Model},
150
+ author={Junmin Gong and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
151
+ howpublished={\url{https://github.com/ace-step/ACE-Step}},
152
+ year={2025},
153
+ note={GitHub repository}
154
+ }
155
+ ```
156
+
157
+ ## Acknowledgements
158
+ This project is co-led by ACE Studio and StepFun.
159
+