RS2002 committed on
Commit 0597d3d · verified · 1 Parent(s): e715770

Update README.md

Files changed (1)
  1. README.md +94 -9
README.md CHANGED
@@ -1,9 +1,94 @@
- ---
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin
- ---
-
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: [More Information Needed]
- - Docs: [More Information Needed]
+ # PianoBART
+
+ This description was generated by Grok 3.
+
+ ## Model Details
+
+ - **Model Name**: PianoBART
+
+ - **Model Type**: Transformer-based model (BART architecture) for symbolic piano music generation and understanding
+
+ - **Version**: 1.0
+
+ - **Release Date**: August 2025
+
+ - **Developers**: Zijian Zhao, Weichao Zeng, Yutong He, Fupeng He, Yiyi Wang
+
+ - **Organization**: SYSU
+
+ - **License**: Apache License 2.0
+
+ - **Paper**: [PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training](https://ieeexplore.ieee.org/document/10688332), ICME 2024
+
+ - **Citation**:
+
+ ```bibtex
+ @INPROCEEDINGS{10688332,
+   author={Liang, Xiao and Zhao, Zijian and Zeng, Weichao and He, Yutong and He, Fupeng and Wang, Yiyi and Gao, Chengying},
+   booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
+   title={PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training},
+   year={2024},
+   pages={1-6},
+   doi={10.1109/ICME57554.2024.10688332}
+ }
+ ```
+
+ - **Contact**: [email protected]
+
+ - **Repository**: https://github.com/RS2002/PianoBart
+
+ ## Model Description
+
+ PianoBART is a transformer-based model built on the Bidirectional and Auto-Regressive Transformers (BART) architecture, designed for symbolic piano music generation and understanding. It leverages large-scale pre-training to perform tasks such as music generation, composer classification, emotion classification, velocity prediction, and melody prediction. The model processes symbolic music in an octuple token format and is inspired by frameworks such as [MusicBERT](https://github.com/microsoft/muzic/tree/main/musicbert) and [MidiBERT-Piano](https://github.com/wazenmai/MIDI-BERT).
+
+ - **Architecture**: BART (encoder-decoder transformer)
+ - **Input Format**: Octuple representation of symbolic music, shaped [batch_size, sequence_length, 8], for both encoder and decoder (see the sketch after this list)
+ - **Output Format**: Hidden states of dimension [batch_size, sequence_length, 1024]
+ - **Hidden Size**: 1024
+ - **Training Objective**: Pre-training on large-scale datasets followed by task-specific fine-tuning
+ - **Tasks Supported**: Music generation, composer classification, emotion classification, velocity prediction, melody prediction
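+
+ For intuition, here is a minimal sketch of a batch of octuple tokens. The eight field names below follow the MusicBERT octuple convention (bar, position, instrument, pitch, duration, velocity, time signature, tempo); PianoBART's actual field order and vocabularies are defined in its repository, so treat the names here as illustrative assumptions.
+
+ ```python
+ import torch
+
+ # Assumed field order, per the MusicBERT octuple convention (illustrative):
+ # 0: bar, 1: position, 2: instrument, 3: pitch,
+ # 4: duration, 5: velocity, 6: time signature, 7: tempo
+ NUM_FIELDS = 8
+
+ # A dummy batch of token ids: [batch_size=2, sequence_length=1024, 8 fields].
+ # Real inputs come from tokenizing MIDI; random ids are for shape checks only.
+ tokens = torch.randint(1, 10, (2, 1024, NUM_FIELDS))
+ print(tokens.shape)  # torch.Size([2, 1024, 8])
+ ```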
+
+ ## Training Data
+
+ The model was pre-trained and fine-tuned on the following datasets:
+
+ - **Pre-training**: POP1K7, ASAP, POP909, Pianist8, EMOPIA
+ - **Generation**: Maestro, GiantMIDI
+ - **Composer Classification**: ASAP, Pianist8
+ - **Emotion Classification**: EMOPIA
+ - **Velocity Prediction**: GiantMIDI
+ - **Melody Prediction**: POP909
+
+ For dataset preprocessing and organization, refer to the [MusicBERT](https://github.com/microsoft/muzic/tree/main/musicbert) and [MidiBERT-Piano](https://github.com/wazenmai/MIDI-BERT) repositories.
+
+ ## Usage
+
+ ### Installation
+
+ ```shell
+ git clone https://huggingface.co/RS2002/PianoBART
+ ```
+
+ Please ensure that the `model.py` and `Octuple.pkl` files are located in the same folder.
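+
+ Alternatively, the repository can be fetched from Python with the `huggingface_hub` client; this is a convenience sketch, not the authors' documented install path.
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the repo contents (model.py, Octuple.pkl, weights) into the local
+ # Hugging Face cache and return the directory path.
+ local_dir = snapshot_download(repo_id="RS2002/PianoBART")
+ print(local_dir)
+ ```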
+
+ ### Example Code
+
+ ```python
+ import torch
+ from model import PianoBART
+
+ # Load the pre-trained model from the Hub
+ model = PianoBART.from_pretrained("RS2002/PianoBART")
+
+ # Dummy octuple inputs: [batch_size=2, sequence_length=1024, 8 token fields]
+ input_ids_encoder = torch.randint(1, 10, (2, 1024, 8))
+ input_ids_decoder = torch.randint(1, 10, (2, 1024, 8))
+ # Attention masks of shape [batch_size, sequence_length]
+ encoder_attention_mask = torch.zeros((2, 1024))
+ decoder_attention_mask = torch.zeros((2, 1024))
+
+ # Forward pass through the encoder-decoder
+ output = model(input_ids_encoder, input_ids_decoder, encoder_attention_mask, decoder_attention_mask)
+ print(output.last_hidden_state.size())  # torch.Size([2, 1024, 1024])
+ ```
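+
+ As a hypothetical follow-up (the repository defines its own fine-tuning heads; this is only a sketch), the hidden states could feed a small task head. For example, for composer classification one might mean-pool the sequence output and apply a linear layer:
+
+ ```python
+ import torch.nn as nn
+
+ # Illustrative classification head, not part of the released model.
+ num_composers = 8  # e.g., Pianist8 covers eight pianists
+ head = nn.Linear(1024, num_composers)  # hidden size 1024 -> class logits
+
+ pooled = output.last_hidden_state.mean(dim=1)  # [batch_size, 1024]
+ logits = head(pooled)                          # [batch_size, num_composers]
+ ```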