This is the collection of COIG-P's models

Multimodal Art Projection
community
Organization Card
Multimodal Art Projection (M-A-P) is an open-source AI research community.
The community members work on a wide range of research topics, including but not limited to pre-training paradigms for foundation models, large-scale data collection and processing, and derived applications in coding, reasoning, and music creativity.
The community is open to researchers keen on any relevant topic. Welcome to join us!
- Discord Channel
- Our Full Paper List
- mail: [email protected]
The development log of our Multimodal Art Projection (m-a-p) model family:
- 🔥28/01/2025: We release YuE (乐), the most powerful open-source foundation models for music generation, specifically for transforming lyrics into full songs (lyrics2song), like Suno.ai. See demos.
- 🔥08/05/2024: We release the fully transparent large language model MAP-Neo, a series of models for scaling-law exploration and post-training alignment, along with the training corpus Matrix.
- 🔥11/04/2024: MuPT paper and demo are out. HF collection.
- 🔥08/04/2024: Chinese Tiny LLM is out. HF collection.
- 🔥28/02/2024: The release of ChatMusician's demo, code, model, data, and benchmark.
- 🔥23/02/2024: The release of OpenCodeInterpreter, which beats the GPT-4 code interpreter on HumanEval.
- 23/01/2024: We release CMMMU for better evaluation of Chinese LMMs.
- 13/01/2024: We release a series of Music Pretrained Transformer (MuPT) checkpoints, with sizes up to 1.3B and 8192 context length. Our models are LLaMA2-based, pre-trained on the world's largest symbolic music dataset (10B tokens, ABC notation format). We currently support the Megatron-LM format and will release Hugging Face checkpoints soon.
- 02/06/2023: We officially release the MERT pre-print paper and training code.
- 17/03/2023: We release two advanced music understanding models, MERT-v1-95M and MERT-v1-330M, trained with a new paradigm and dataset. They outperform the previous models and generalize better to more tasks.
- 14/03/2023: We retrained the MERT-v0 model with an open-source-only music dataset, released as MERT-v0-public.
- 29/12/2022: We release MERT-v0, a music understanding model trained with the MLM paradigm, which performs better on downstream tasks.
- 29/10/2022: We release music2vec, a pre-trained MIR model trained with the BYOL paradigm.
Collections
16
This is the collection of COIG-P's datasets
- m-a-p/COIG-P • Viewer • Updated • 1.01M • 714 • 17
- m-a-p/COIG-P-CRM • Viewer • Updated • 484k • 232 • 3
- m-a-p/COIG-CRBench • Viewer • Updated • 1.04k • 107
- COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values • Paper • 2504.05535 • Published • 44
spaces
4
models
120

- m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P • Text Generation • Updated • 6
- m-a-p/Qwen2-Instruct-7B-COIG-P • Text Generation • Updated • 5
- m-a-p/Qwen2.5-Instruct-7B-COIG-P • Text Generation • Updated • 9
- m-a-p/CRM_llama3 • Text Classification • Updated • 11
- m-a-p/Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P • Text Generation • Updated • 5
- m-a-p/Infinity-Instruct-3M-0625-Mistral-7B-COIG-P • Text Generation • Updated • 8
- m-a-p/openl2s_sep_ckpts • Updated
- m-a-p/YuE-s1-0.5B • Text Generation • Updated • 16 • 3
- m-a-p/YuE-upsampler • Text Generation • Updated • 3k • 20
- m-a-p/YuE-s2-1B-general • Text Generation • Updated • 18k • 50
datasets
44
- m-a-p/PIN-100M • Viewer • Updated • 68.1k • 72.9k • 10
- m-a-p/SuperGPQA • Viewer • Updated • 26.5k • 1.56k • 63
- m-a-p/SimpleVQA • Viewer • Updated • 5.22k • 241 • 1
- m-a-p/IV-Bench • Viewer • Updated • 747 • 104
- m-a-p/COIG-P • Viewer • Updated • 1.01M • 714 • 17
- m-a-p/COIG-CRBench • Viewer • Updated • 1.04k • 107
- m-a-p/COIG-P-CRM • Viewer • Updated • 484k • 232 • 3
- m-a-p/SuperGPQA-Records • Viewer • Updated • 911k • 508
- m-a-p/OmniInstruct_v1 • Viewer • Updated • 96.1k • 559 • 2
- m-a-p/CodeCriticBench • Preview • Updated • 35 • 3