GLiNER-relex: Generalist and Lightweight Model for Joint Zero-Shot NER and Relation Extraction
GLiNER-relex is a unified model for zero-shot Named Entity Recognition (NER) and Relation Extraction (RE) that performs both tasks simultaneously in a single forward pass. Built on the GLiNER architecture, it extends the span-based approach to jointly identify entities and extract relationships between them.
Key Features
- Joint Extraction: Simultaneously extracts entities and relations in one forward pass
- Zero-Shot: No fine-tuning required - specify entity types and relation types at inference time
- Efficient: Single encoder architecture processes both tasks together
- Flexible: Supports custom entity and relation schemas per inference call
- Production-Ready: ONNX export support for deployment
Installation
First, install the GLiNER library:
pip install gliner -U
Quick Start
Basic Usage
from gliner import GLiNER
# Load the model
model = GLiNER.from_pretrained("knowledgator/gliner-relex-large-v0.5")
# Define your entity types and relation types
entity_labels = ["location", "person", "date", "structure"]
relation_labels = ["located in", "designed by", "completed in"]
# Input text
text = "The Eiffel Tower, located in Paris, France, was designed by engineer Gustave Eiffel and completed in 1889."
# Run inference - returns both entities and relations
entities, relations = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    adjacency_threshold=0.55,
    relation_threshold=0.8,
    return_relations=True,
    flat_ner=False
)
# Print entities
print("Entities:")
for entity in entities[0]:
    print(f" {entity['text']} -> {entity['label']} (score: {entity['score']:.3f})")

# Print relations
print("\nRelations:")
for relation in relations[0]:
    head = relation['head']['text']
    tail = relation['tail']['text']
    rel_type = relation['relation']
    score = relation['score']
    print(f" {head} --[{rel_type}]--> {tail} (score: {score:.3f})")
Expected Output:
Entities:
Eiffel Tower -> structure (score: 0.912)
Paris -> location (score: 0.934)
France -> location (score: 0.891)
Gustave Eiffel -> person (score: 0.923)
1889 -> date (score: 0.856)
Relations:
Eiffel Tower --[located in]--> Paris (score: 0.823)
Eiffel Tower --[designed by]--> Gustave Eiffel (score: 0.847)
Eiffel Tower --[completed in]--> 1889 (score: 0.789)
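Downstream code often wants plain (head, relation, tail) triples rather than the nested relation dicts. A minimal post-processing sketch, operating on sample dicts shaped like the relation output printed above (the values here are illustrative, not real model predictions):

```python
# Convert relation dicts into (head, relation, tail) triples,
# keeping only predictions above a minimum confidence.
# Sample data shaped like the model's relation output (illustrative values).
sample_relations = [
    {"head": {"text": "Eiffel Tower"}, "tail": {"text": "Paris"},
     "relation": "located in", "score": 0.823},
    {"head": {"text": "Eiffel Tower"}, "tail": {"text": "1889"},
     "relation": "completed in", "score": 0.789},
]

def to_triples(relations, min_score=0.0):
    return [
        (r["head"]["text"], r["relation"], r["tail"]["text"])
        for r in relations
        if r["score"] >= min_score
    ]

print(to_triples(sample_relations, min_score=0.8))
```

Raising min_score trades recall for precision; with 0.8 here, only the first triple survives.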
Batch Processing
texts = [
    "Elon Musk founded SpaceX in Hawthorne, California.",
    "Microsoft, led by Satya Nadella, acquired GitHub in 2018.",
    "The Louvre Museum in Paris houses the Mona Lisa."
]

entity_labels = ["person", "organization", "location", "artwork"]
relation_labels = ["founder of", "CEO of", "located in", "acquired", "houses"]

entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    relation_threshold=0.5,
    batch_size=8,
    return_relations=True,
    flat_ner=False
)

for i, (text_entities, text_relations) in enumerate(zip(entities, relations)):
    print(f"\nText {i + 1}:")
    print(f"  Entities: {[e['text'] for e in text_entities]}")
    print(f"  Relations: {[(r['head']['text'], r['relation'], r['tail']['text']) for r in text_relations]}")
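Because the model returns one entity list per input text, it can be convenient to flatten a batch into a single list of records, each tagged with the index of the text it came from. A small sketch using toy per-text entity lists shaped like the model output (values are illustrative):

```python
# Flatten batched outputs into one list of records, tagging each
# entity with the index of the source text.
# Toy per-text entity lists shaped like the model output (illustrative).
batch_entities = [
    [{"text": "Elon Musk", "label": "person", "score": 0.95}],
    [{"text": "Microsoft", "label": "organization", "score": 0.93},
     {"text": "GitHub", "label": "organization", "score": 0.91}],
]

def flatten_batch(per_text_entities):
    records = []
    for text_idx, ents in enumerate(per_text_entities):
        for ent in ents:
            records.append({"text_idx": text_idx, **ent})
    return records

rows = flatten_batch(batch_entities)
print(len(rows))
```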
Entity-Only Extraction
If you only need entities without relations:
entities = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=[],            # Empty list for relations
    threshold=0.5,
    return_relations=False,  # Skip relation extraction
    flat_ner=False
)
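A common next step after entity-only extraction is grouping the spans by type, e.g. to collect all locations at once. A minimal sketch on toy entities shaped like the documented output (values are illustrative):

```python
from collections import defaultdict

# Group extracted entities by label, e.g. to list all locations together.
# Toy entities shaped like the model's entity output (illustrative values).
entities = [
    {"text": "Paris", "label": "location", "score": 0.93},
    {"text": "France", "label": "location", "score": 0.89},
    {"text": "Gustave Eiffel", "label": "person", "score": 0.92},
]

def group_by_label(ents):
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)

print(group_by_label(entities))
```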
Advanced Configuration
Adjusting Thresholds
You can fine-tune extraction sensitivity with separate thresholds:
entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,            # Entity confidence threshold
    adjacency_threshold=0.6,  # Threshold for entity pair candidates
    relation_threshold=0.7,   # Relation classification threshold
    flat_ner=True,            # Enforce non-overlapping entities
    multi_label=False,        # Single label per entity span
    return_relations=True
)
We recommend keeping threshold (the entity extraction threshold) relatively low, in the 0.3–0.5 range. For adjacency_threshold, the model produces good results in the 0.5–0.65 range. For relation_threshold, use higher values, around 0.7–0.9. Adjust all of these to the requirements of your project.
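The interaction between the two relation-side thresholds can be illustrated without running the model: adjacency_threshold first gates which entity pairs are considered at all, and relation_threshold then filters the classified relations. A toy sketch with made-up scores (not actual model internals):

```python
# Toy candidate pairs with made-up adjacency and relation scores,
# illustrating the two-stage filtering (not actual model internals).
candidates = [
    {"pair": ("Eiffel Tower", "Paris"), "adj_score": 0.70, "rel_score": 0.82},
    {"pair": ("Eiffel Tower", "France"), "adj_score": 0.40, "rel_score": 0.90},
    {"pair": ("Paris", "1889"), "adj_score": 0.60, "rel_score": 0.30},
]

def filter_pairs(cands, adjacency_threshold=0.55, relation_threshold=0.8):
    # Stage 1: keep only pairs whose adjacency score passes the gate.
    gated = [c for c in cands if c["adj_score"] >= adjacency_threshold]
    # Stage 2: of those, keep pairs whose relation score also passes.
    return [c["pair"] for c in gated if c["rel_score"] >= relation_threshold]

print(filter_pairs(candidates))
```

Note that the second pair is dropped despite its high relation score: a pair that fails the adjacency gate never reaches relation classification, which is why adjacency_threshold should stay moderate while relation_threshold can be strict.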
Output Format
Entity Format
{
    "start": int,   # Start character position
    "end": int,     # End character position
    "text": str,    # Entity text span
    "label": str,   # Entity type
    "score": float  # Confidence score (0-1)
}
Relation Format
{
    "head": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int  # Index in the entities list
    },
    "tail": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int
    },
    "relation": str,  # Relation type
    "score": float    # Confidence score (0-1)
}
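The entity_idx fields tie a relation's head and tail back to the corresponding entries in the entities list, so you can recover the full entity records (with labels and scores) from a relation. A small sketch, assuming the formats above (the concrete values are illustrative):

```python
# Resolve a relation's head/tail back to full entity records via entity_idx.
# Sample data shaped like the documented formats (illustrative values).
entities = [
    {"start": 4, "end": 16, "text": "Eiffel Tower", "label": "structure", "score": 0.91},
    {"start": 29, "end": 34, "text": "Paris", "label": "location", "score": 0.93},
]
relation = {
    "head": {"start": 4, "end": 16, "text": "Eiffel Tower", "type": "structure", "entity_idx": 0},
    "tail": {"start": 29, "end": 34, "text": "Paris", "type": "location", "entity_idx": 1},
    "relation": "located in",
    "score": 0.82,
}

def resolve(relation, entities):
    # Look up the full entity records referenced by entity_idx.
    head = entities[relation["head"]["entity_idx"]]
    tail = entities[relation["tail"]["entity_idx"]]
    return head["text"], relation["relation"], tail["text"]

print(resolve(relation, entities))
```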
Architecture
GLiNER-relex uses a unified encoder architecture that:
- Encodes text and labels jointly using a transformer backbone.
- Identifies entity spans using span-based classification.
- Constructs an adjacency matrix to identify potential entity pairs using graph convolutional networks.
- Classifies relations between selected entity pairs.
This joint approach allows the model to leverage entity information when extracting relations, leading to more coherent predictions.
Use Cases
- Knowledge Graph Construction: Extract structured facts from unstructured text
- Information Extraction Pipelines: Build end-to-end IE systems
- Document Understanding: Extract entities and their relationships from documents
- Question Answering: Power QA systems with structured knowledge
- Data Enrichment: Automatically annotate text corpora
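For the knowledge graph use case, the extracted triples map directly onto a graph structure. A minimal stdlib-only sketch that builds an adjacency dict from triples shaped like the Quick Start predictions (the triples here are illustrative):

```python
from collections import defaultdict

# Build a simple knowledge graph (adjacency dict) from extracted triples.
# Triples are illustrative, shaped like the Quick Start predictions.
triples = [
    ("Eiffel Tower", "located in", "Paris"),
    ("Eiffel Tower", "designed by", "Gustave Eiffel"),
    ("Paris", "located in", "France"),
]

def build_graph(triples):
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[head].append((rel, tail))
    return dict(graph)

graph = build_graph(triples)
print(graph["Eiffel Tower"])
```

In practice you might feed the same triples into a dedicated graph library or a triple store instead; the dict form is just the smallest self-contained illustration.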