GLiNER-relex: Generalist and Lightweight Model for Joint Zero-Shot NER and Relation Extraction
GLiNER-relex is a unified model for zero-shot Named Entity Recognition (NER) and Relation Extraction (RE) that performs both tasks simultaneously in a single forward pass. Built on the GLiNER architecture, it extends the span-based approach to jointly identify entities and extract relationships between them.
Key Features
- Joint Extraction: Simultaneously extracts entities and relations in one forward pass
- Zero-Shot: No fine-tuning required - specify entity types and relation types at inference time
- Efficient: Single encoder architecture processes both tasks together
- Flexible: Supports custom entity and relation schemas per inference call
- Production-Ready: ONNX export support for deployment
Installation
First, install the GLiNER library:
pip install gliner -U
Quick Start
Basic Usage
from gliner import GLiNER
# Load the model
model = GLiNER.from_pretrained("knowledgator/gliner-relex-large-v0.5")
# Define your entity types and relation types
entity_labels = ["location", "person", "date", "structure"]
relation_labels = ["located in", "designed by", "completed in"]
# Input text
text = "The Eiffel Tower, located in Paris, France, was designed by engineer Gustave Eiffel and completed in 1889."
# Run inference - returns both entities and relations
entities, relations = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    adjacency_threshold=0.55,
    relation_threshold=0.8,
    return_relations=True,
    flat_ner=False
)
# Print entities
print("Entities:")
for entity in entities[0]:
    print(f" {entity['text']} -> {entity['label']} (score: {entity['score']:.3f})")

# Print relations
print("\nRelations:")
for relation in relations[0]:
    head = relation['head']['text']
    tail = relation['tail']['text']
    rel_type = relation['relation']
    score = relation['score']
    print(f" {head} --[{rel_type}]--> {tail} (score: {score:.3f})")
Expected Output:
Entities:
Eiffel Tower -> structure (score: 0.912)
Paris -> location (score: 0.934)
France -> location (score: 0.891)
Gustave Eiffel -> person (score: 0.923)
1889 -> date (score: 0.856)
Relations:
Eiffel Tower --[located in]--> Paris (score: 0.823)
Eiffel Tower --[designed by]--> Gustave Eiffel (score: 0.847)
Eiffel Tower --[completed in]--> 1889 (score: 0.789)
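Downstream code often wants plain (head, relation, tail) triples rather than the nested relation dicts. A minimal post-processing sketch, operating on sample dicts shaped like the relation output printed above (the values here are illustrative, not real model predictions):

```python
# Convert relation dicts into (head, relation, tail) triples,
# keeping only predictions above a minimum confidence.
# Sample data shaped like the model's relation output (illustrative values).
sample_relations = [
    {"head": {"text": "Eiffel Tower"}, "tail": {"text": "Paris"},
     "relation": "located in", "score": 0.823},
    {"head": {"text": "Eiffel Tower"}, "tail": {"text": "1889"},
     "relation": "completed in", "score": 0.789},
]

def to_triples(relations, min_score=0.0):
    return [
        (r["head"]["text"], r["relation"], r["tail"]["text"])
        for r in relations
        if r["score"] >= min_score
    ]

print(to_triples(sample_relations, min_score=0.8))
```

Raising min_score trades recall for precision; with 0.8 here, only the first triple survives.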
Batch Processing
texts = [
    "Elon Musk founded SpaceX in Hawthorne, California.",
    "Microsoft, led by Satya Nadella, acquired GitHub in 2018.",
    "The Louvre Museum in Paris houses the Mona Lisa."
]

entity_labels = ["person", "organization", "location", "artwork"]
relation_labels = ["founder of", "CEO of", "located in", "acquired", "houses"]

entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    relation_threshold=0.5,
    batch_size=8,
    return_relations=True,
    flat_ner=False
)

for i, (text_entities, text_relations) in enumerate(zip(entities, relations)):
    print(f"\nText {i + 1}:")
    print(f"  Entities: {[e['text'] for e in text_entities]}")
    print(f"  Relations: {[(r['head']['text'], r['relation'], r['tail']['text']) for r in text_relations]}")
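Because the model returns one entity list per input text, it can be convenient to flatten a batch into a single list of records, each tagged with the index of the text it came from. A small sketch using toy per-text entity lists shaped like the model output (values are illustrative):

```python
# Flatten batched outputs into one list of records, tagging each
# entity with the index of the source text.
# Toy per-text entity lists shaped like the model output (illustrative).
batch_entities = [
    [{"text": "Elon Musk", "label": "person", "score": 0.95}],
    [{"text": "Microsoft", "label": "organization", "score": 0.93},
     {"text": "GitHub", "label": "organization", "score": 0.91}],
]

def flatten_batch(per_text_entities):
    records = []
    for text_idx, ents in enumerate(per_text_entities):
        for ent in ents:
            records.append({"text_idx": text_idx, **ent})
    return records

rows = flatten_batch(batch_entities)
print(len(rows))
```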
Entity-Only Extraction
If you only need entities without relations:
entities = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=[],            # Empty list for relations
    threshold=0.5,
    return_relations=False,  # Skip relation extraction
    flat_ner=False
)
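A common next step after entity-only extraction is grouping the spans by type, e.g. to collect all locations at once. A minimal sketch on toy entities shaped like the documented output (values are illustrative):

```python
from collections import defaultdict

# Group extracted entities by label, e.g. to list all locations together.
# Toy entities shaped like the model's entity output (illustrative values).
entities = [
    {"text": "Paris", "label": "location", "score": 0.93},
    {"text": "France", "label": "location", "score": 0.89},
    {"text": "Gustave Eiffel", "label": "person", "score": 0.92},
]

def group_by_label(ents):
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)

print(group_by_label(entities))
```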
Advanced Configuration
Adjusting Thresholds
You can fine-tune extraction sensitivity with separate thresholds:
entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,            # Entity confidence threshold
    adjacency_threshold=0.6,  # Threshold for entity pair candidates
    relation_threshold=0.7,   # Relation classification threshold
    flat_ner=True,            # Enforce non-overlapping entities
    multi_label=False,        # Single label per entity span
    return_relations=True
)
We recommend keeping threshold (the entity extraction threshold) relatively low, in the 0.3–0.5 range. For adjacency_threshold, the model produces good results in the 0.5–0.65 range. For relation_threshold, use higher values, around 0.7–0.9. Adjust all of these to the requirements of your project.
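The interaction between the two relation-side thresholds can be illustrated without running the model: adjacency_threshold first gates which entity pairs are considered at all, and relation_threshold then filters the classified relations. A toy sketch with made-up scores (not actual model internals):

```python
# Toy candidate pairs with made-up adjacency and relation scores,
# illustrating the two-stage filtering (not actual model internals).
candidates = [
    {"pair": ("Eiffel Tower", "Paris"), "adj_score": 0.70, "rel_score": 0.82},
    {"pair": ("Eiffel Tower", "France"), "adj_score": 0.40, "rel_score": 0.90},
    {"pair": ("Paris", "1889"), "adj_score": 0.60, "rel_score": 0.30},
]

def filter_pairs(cands, adjacency_threshold=0.55, relation_threshold=0.8):
    # Stage 1: keep only pairs whose adjacency score passes the gate.
    gated = [c for c in cands if c["adj_score"] >= adjacency_threshold]
    # Stage 2: of those, keep pairs whose relation score also passes.
    return [c["pair"] for c in gated if c["rel_score"] >= relation_threshold]

print(filter_pairs(candidates))
```

Note that the second pair is dropped despite its high relation score: a pair that fails the adjacency gate never reaches relation classification, which is why adjacency_threshold should stay moderate while relation_threshold can be strict.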
Output Format
Entity Format
{
    "start": int,   # Start character position
    "end": int,     # End character position
    "text": str,    # Entity text span
    "label": str,   # Entity type
    "score": float  # Confidence score (0-1)
}
Relation Format
{
    "head": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int  # Index in the entities list
    },
    "tail": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int
    },
    "relation": str,  # Relation type
    "score": float    # Confidence score (0-1)
}
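The entity_idx fields tie a relation's head and tail back to the corresponding entries in the entities list, so you can recover the full entity records (with labels and scores) from a relation. A small sketch, assuming the formats above (the concrete values are illustrative):

```python
# Resolve a relation's head/tail back to full entity records via entity_idx.
# Sample data shaped like the documented formats (illustrative values).
entities = [
    {"start": 4, "end": 16, "text": "Eiffel Tower", "label": "structure", "score": 0.91},
    {"start": 29, "end": 34, "text": "Paris", "label": "location", "score": 0.93},
]
relation = {
    "head": {"start": 4, "end": 16, "text": "Eiffel Tower", "type": "structure", "entity_idx": 0},
    "tail": {"start": 29, "end": 34, "text": "Paris", "type": "location", "entity_idx": 1},
    "relation": "located in",
    "score": 0.82,
}

def resolve(relation, entities):
    # Look up the full entity records referenced by entity_idx.
    head = entities[relation["head"]["entity_idx"]]
    tail = entities[relation["tail"]["entity_idx"]]
    return head["text"], relation["relation"], tail["text"]

print(resolve(relation, entities))
```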
Architecture
GLiNER-relex uses a unified encoder architecture that:
- Encodes text and labels jointly using a transformer backbone.
- Identifies entity spans using span-based classification.
- Constructs an adjacency matrix to identify potential entity pairs using graph convolutional networks.
- Classifies relations between selected entity pairs.
This joint approach allows the model to leverage entity information when extracting relations, leading to more coherent predictions.
Use Cases
- Knowledge Graph Construction: Extract structured facts from unstructured text
- Information Extraction Pipelines: Build end-to-end IE systems
- Document Understanding: Extract entities and their relationships from documents
- Question Answering: Power QA systems with structured knowledge
- Data Enrichment: Automatically annotate text corpora
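For the knowledge graph use case, the extracted triples map directly onto a graph structure. A minimal stdlib-only sketch that builds an adjacency dict from triples shaped like the Quick Start predictions (the triples here are illustrative):

```python
from collections import defaultdict

# Build a simple knowledge graph (adjacency dict) from extracted triples.
# Triples are illustrative, shaped like the Quick Start predictions.
triples = [
    ("Eiffel Tower", "located in", "Paris"),
    ("Eiffel Tower", "designed by", "Gustave Eiffel"),
    ("Paris", "located in", "France"),
]

def build_graph(triples):
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[head].append((rel, tail))
    return dict(graph)

graph = build_graph(triples)
print(graph["Eiffel Tower"])
```

In practice you might feed the same triples into a dedicated graph library or a triple store instead; the dict form is just the smallest self-contained illustration.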