---
license: apache-2.0
datasets:
- loubb/aria-midi
language:
- en
tags:
- music
- MIDI
- piano
---
# Model

**Aria** is a pretrained autoregressive generative model for symbolic music, based on the LLaMA 3.2 (1B) architecture. It was trained on ~60k hours of MIDI transcriptions of expressive solo-piano recordings, and has been finetuned to produce realistic continuations of solo-piano compositions as well as general-purpose contrastive MIDI embeddings.
This HuggingFace page contains weights and usage instructions for the pretrained base model. For the generative model, see `aria-medium-gen`, and for the embedding model, see `aria-medium-embedding`.
- Read our [paper](https://arxiv.org/abs/2506.23869)
- Check out the real-time demo in the official GitHub repository
- Get access to our training dataset, [Aria-MIDI](https://huggingface.co/datasets/loubb/aria-midi), to train your own models
## Usage Guidelines

### Intended Use
The model is most naturally suited for generating continuations of existing MIDI files rather than generating music from scratch (unconditioned). While the tokenizer and model checkpoint technically support multi-track tokenization and generation, multi-track music comprises a minority of the training data, so we highly recommend performing inference with single-track piano MIDI files for optimal results.
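If you are unsure whether a prompt file is single-track, a quick track-count check before inference can flag multi-track files. Below is a minimal sketch using the third-party `mido` package (an assumption; any MIDI parser works, and `is_single_track_piano` is a hypothetical helper, not part of aria-utils):

```python
import mido

def is_single_track_piano(path: str) -> bool:
    """Heuristic: at most one track in the file actually contains note events."""
    midi = mido.MidiFile(path)
    # Type-1 MIDI files often keep tempo/meta events in a separate track,
    # so only count tracks that carry note_on messages.
    note_tracks = [
        track for track in midi.tracks
        if any(msg.type == "note_on" for msg in track)
    ]
    return len(note_tracks) <= 1

print(is_single_track_piano("mydir/prompt.midi"))
```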
### Data Memorization Considerations
Due to the overrepresentation of performances of popular compositions (e.g., those by well-known classical composers such as Chopin) and the difficulty of completely deduplicating the training data, the model has memorized some of these compositions. We suggest performing inference with lesser-known compositions or your own music for more original results.
### Input Quality Considerations
Since the model has not been post-trained with any instruction tuning or RLHF (similar to pre-instruct GPT models), it is very sensitive to input quality and performs best when prompted with well-played music. To get sample MIDI files, see the `example-prompts/` directory or explore the Aria-MIDI dataset.
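If the example prompts ship with this HuggingFace repository (an assumption; they may live only in the GitHub repo), you can discover and fetch them with `huggingface_hub`:

```python
from huggingface_hub import hf_hub_download, list_repo_files

# List bundled example prompts (assumes example-prompts/ is part of this HF repo)
prompt_files = [
    f for f in list_repo_files("loubb/aria-medium-base")
    if f.startswith("example-prompts/")
]
print(prompt_files)

# Download the first one to the local HF cache and get its path
if prompt_files:
    local_path = hf_hub_download(
        repo_id="loubb/aria-medium-base", filename=prompt_files[0]
    )
```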
## Quickstart
All of our models were trained using the MIDI tooling and tokenizer available in the aria-utils repository. Install the aria-utils package with pip:
```bash
pip install git+https://github.com/EleutherAI/aria-utils.git
```
You can then generate a continuation of a truncated (piano) MIDI file using the transformers library:

```bash
pip install transformers torch
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

PROMPT_MIDI_LOAD_PATH = "mydir/prompt.midi"
CONTINUATION_MIDI_SAVE_PATH = "mydir/continuation.midi"

# Load the model and tokenizer (both ship custom code, hence trust_remote_code)
model = AutoModelForCausalLM.from_pretrained(
    "loubb/aria-medium-base",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "loubb/aria-medium-base",
    trust_remote_code=True,
)

# Tokenize the prompt MIDI file
prompt = tokenizer.encode_from_file(
    PROMPT_MIDI_LOAD_PATH, return_tensors="pt"
)

# Condition on the first 512 prompt tokens and sample a continuation
continuation = model.generate(
    prompt.input_ids[..., :512],
    max_length=2048,
    do_sample=True,
    temperature=0.97,
    top_p=0.95,
    use_cache=True,
)

# Detokenize and save the result as a MIDI file
midi_dict = tokenizer.decode(continuation[0].tolist())
midi_dict.to_midi().save(CONTINUATION_MIDI_SAVE_PATH)
```
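Since generation is stochastic, you may want several candidate continuations of the same prompt. The sketch below uses standard transformers generation arguments (assuming the remote-code model honors `num_return_sequences`; this is not confirmed by the card):

```python
import torch

torch.manual_seed(0)  # fix the sampling seed for reproducibility

continuations = model.generate(
    prompt.input_ids[..., :512],
    max_length=2048,
    do_sample=True,
    temperature=0.97,
    top_p=0.95,
    num_return_sequences=4,  # sample four candidates in one call
    use_cache=True,
)
for i, seq in enumerate(continuations):
    tokenizer.decode(seq.tolist()).to_midi().save(f"mydir/continuation_{i}.midi")
```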
## License and Attribution
The Aria project has been kindly supported by EleutherAI and Stability AI, as well as by a compute grant from the Ministry of Science and ICT of Korea. Our models and MIDI tooling are released under the Apache-2.0 license. If you use the models or tooling in follow-up work, please cite the paper in which they were introduced:
```bibtex
@inproceedings{bradshawscaling,
  title={Scaling Self-Supervised Representation Learning for Symbolic Piano Performance},
  author={Bradshaw, Louis and Fan, Honglu and Spangher, Alexander and Biderman, Stella and Colton, Simon},
  booktitle={arXiv preprint},
  year={2025},
  url={https://arxiv.org/abs/2506.23869}
}
```