OnT: Language Models as Ontology Encoders
This is an OnT (Ontology Transformer) model trained on the GALEN dataset, based on sentence-transformers/all-mpnet-base-v2. OnT is a language-model-based framework for ontology embedding: it represents concepts as points in hyperbolic space and captures axioms as hierarchical relationships between those concepts.
Model Details
Model Description
- Model Type: Ontology Transformer (OnT)
- Base model: sentence-transformers/all-mpnet-base-v2
- Training Dataset: GALEN
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Embedding Space: Hyperbolic Space
- Key Features:
  - Hyperbolic embeddings for ontology concept encoding
  - Modeling of hierarchical relationships between concepts
  - Role embeddings realized as rotations over hyperbolic space (illustrated by the sketch after this list)
  - Representation of concept rotation, translation, and existential quantifiers
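As a toy illustration of why rotations are a natural choice for roles (a sketch only, not OnT's internal code): a rotation about the origin is an isometry of the Poincaré ball, so applying one to a concept embedding changes the concept's direction while preserving its hyperbolic norm.

```python
import math
import torch

# Toy 2-D example: rotations about the origin preserve the norm of a point,
# so a point inside the unit Poincare ball stays in the ball at the same
# hyperbolic distance from the origin.
theta = 0.5
R = torch.tensor([[math.cos(theta), -math.sin(theta)],
                  [math.sin(theta),  math.cos(theta)]])

x = torch.tensor([0.3, 0.1])  # a point inside the unit ball
y = R @ x                     # the rotated point
print(x.norm().item(), y.norm().item())  # equal: the rotation preserves the norm
```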
Model Sources
- Paper: [Language Models as Ontology Encoders](https://arxiv.org/abs/2507.14334)
Available Versions
This model is available in four versions (Git branches) to suit different use cases:

| Branch | Training Type | Role Embedding | Use Case |
|---|---|---|---|
| `main` (default) | Prediction dataset | ✅ With role embedding | Default version: trained on the prediction dataset, with role embedding support |
| `role-free` | Prediction dataset | ❌ Without role embedding | Trained on the prediction dataset, without role embeddings |
| `inference-default` | Inference dataset | ✅ With role embedding | Trained on the inference dataset, with role embedding support |
| `inference-role-free` | Inference dataset | ❌ Without role embedding | Trained on the inference dataset, without role embeddings |
How to use different versions:

```python
from OnT import OntologyTransformer

# Default version (main branch): prediction dataset, with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen")

# Prediction dataset, without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="role-free")

# Inference dataset, with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="inference-default")

# Inference dataset, without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="inference-role-free")
```
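The branch names can also be enumerated programmatically with the standard huggingface_hub client (an optional convenience, unrelated to the OnT API itself):

```python
from huggingface_hub import list_repo_refs

# List the Git branches of the model repository on the Hugging Face Hub
refs = list_repo_refs("Hui97/OnT-MPNet-galen")
print([branch.name for branch in refs.branches])
# expected: main, role-free, inference-default, inference-role-free
```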
Full Model Architecture
```
OntologyTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
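The Pooling module above uses mean pooling over token embeddings (`pooling_mode_mean_tokens: True`). A minimal sketch of what that computes, following standard sentence-transformers behavior; the helper below is illustrative, not part of the OnT API:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average the token embeddings, ignoring padding positions
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # (batch, dim)
    counts = mask.sum(dim=1).clamp_min(1e-9)       # (batch, 1)
    return summed / counts

tokens = torch.randn(2, 384, 768)     # dummy token embeddings
mask = torch.ones(2, 384)             # no padding in this toy batch
print(mean_pool(tokens, mask).shape)  # torch.Size([2, 768])
```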
Usage
Installation
First, install the required dependencies:
```bash
pip install sentence-transformers==3.4.0.dev0
```
You also need to install HierarchyTransformers by following the instructions in its repository.
Direct Usage
Load the model and use it for ontology concept encoding:
```python
from OnT import OntologyTransformer

# Load the default (main) revision from the Hugging Face Hub
path = "Hui97/OnT-MPNet-galen"
ont = OntologyTransformer.from_pretrained(path)

# Encode concept names as points in hyperbolic space
entity_names = [
    "alveolar atrium",
    "organ part",
    "superior recess of lesser sac",
]
entity_embeddings = ont.encode_concept(entity_names)
print(entity_embeddings.shape)  # (3, 768): three concepts, 768 dimensions

# Encode role names; each role is represented as a rotation
# and a scaling over the hyperbolic space
role_sentences = [
    "application attribute",
    "attribute",
    "chemical modifier",
]
role_rotations, role_scalings = ont.encode_roles(role_sentences)
```
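Because the embeddings live in hyperbolic space, Euclidean cosine similarity is not the natural way to compare them. Below is a minimal sketch that measures closeness with the standard Poincaré-ball distance (curvature -1); the curvature and the use of this distance here are assumptions for illustration, so consult the paper for OnT's actual scoring function.

```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    # Geodesic distance in the unit Poincare ball (assumes ||u||, ||v|| < 1):
    # d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    sq_dist = (u - v).pow(2).sum(dim=-1)
    denom = (1 - u.pow(2).sum(dim=-1)).clamp_min(eps) * (1 - v.pow(2).sum(dim=-1)).clamp_min(eps)
    return torch.acosh(1 + 2 * sq_dist / denom)

emb = torch.as_tensor(entity_embeddings)
# e.g. how close 'alveolar atrium' is to 'organ part' in hyperbolic space
print(poincare_distance(emb[0], emb[1]).item())
```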
Citation
BibTeX
If you use this model, please cite:
```bibtex
@article{yang2025language,
  title={Language Models as Ontology Encoders},
  author={Yang, Hui and Chen, Jiaoyan and He, Yuan and Gao, Yongsheng and Horrocks, Ian},
  journal={arXiv preprint arXiv:2507.14334},
  year={2025}
}
```