ArticleAgent: Constraint-Driven Qwen2.5-1.5B for Academic Concept Path Extraction

This repository hosts ArticleAgent, a fine-tuned Qwen2.5-1.5B-Instruct model designed to extract structured concept paths from academic paper abstracts. The model is part of the research presented in:

Constraint-Driven Small Language Models Based on Agent and OpenAlex Knowledge Graph: Mining Conceptual Pathways and Discovering Innovation Points in Academic Papers
Ziye Xia, Sergei S. Ospichev (2025)

The system leverages a four-stage agent framework grounded in the OpenAlex knowledge graph, combining prompt engineering, knowledge constraints, and human-in-the-loop validation to achieve high-precision concept extraction and novelty detection.

🔍 Key Features

  • Extracts structured concept paths (e.g., Physics → Condensed Matter → Superconductivity)
  • Identifies innovation points based on rare structural combinations of mainstream concepts
  • Integrates the OpenAlex concept taxonomy as an external knowledge constraint
  • Trained on 7,960 papers from Novosibirsk State University (NSU)
  • Achieves 97.24% precision and 91.46% F1-score in end-to-end concept path extraction
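
To make the first two features concrete, here is a minimal sketch of how a concept path string could be parsed and how rarely seen parent-to-child links could be flagged. It is an illustration only, not the method from the paper: the function names (`parse_concept_path`, `path_edges`, `rare_edges`), the `max_count` threshold, and every toy path except the superconductivity example are assumptions introduced here.

```python
from collections import Counter


def parse_concept_path(path: str, sep: str = "→") -> list[str]:
    """Split a path such as 'A → B → C' into an ordered list of concept names."""
    return [part.strip() for part in path.split(sep) if part.strip()]


def path_edges(concepts: list[str]) -> list[tuple[str, str]]:
    """Return the adjacent (parent, child) pairs along one path."""
    return list(zip(concepts, concepts[1:]))


def rare_edges(paths: list[list[str]], max_count: int = 1) -> set[tuple[str, str]]:
    """Collect edges that occur at most `max_count` times across all paths.

    Hypothetical stand-in for a novelty signal; the real criterion may differ.
    """
    counts = Counter(edge for p in paths for edge in path_edges(p))
    return {edge for edge, n in counts.items() if n <= max_count}


# Toy corpus: only the first path comes from the model card, the rest are made up.
corpus = [
    parse_concept_path("Physics → Condensed Matter → Superconductivity"),
    parse_concept_path("Physics → Condensed Matter → Magnetism"),
    parse_concept_path("Physics → Optics"),
]
print(rare_edges(corpus))
```

Counting adjacent pairs rather than whole paths keeps the check structural: a link between two common concepts can still be rare even when each concept appears often on its own.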

🚀 Usage

You can load the model directly using Hugging Face Transformers:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Hengzongshu/ArticleAgent"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="bfloat16",
    trust_remote_code=True
)

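# Example input (Stage 2: Concept Pair Extraction)
input_text = """<research_methods>... your abstract segment ...</research_methods>"""
# Tokenize the prompt and move the tensors to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate up to 256 new tokens; the decoded text contains the prompt followed by the model output
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))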
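
Because the decoded string from the snippet above starts with the prompt, a small pair of helpers can wrap an abstract segment in the `<research_methods>` tag and recover only the generated part. This is a sketch under assumptions: `build_stage2_input` and `continuation` are hypothetical names, and it is assumed (not confirmed by this model card) that the Stage 2 prompt consists of nothing but the tagged segment.

```python
def build_stage2_input(segment: str, tag: str = "research_methods") -> str:
    """Wrap an abstract segment in the tag shown in the usage example."""
    return f"<{tag}>{segment.strip()}</{tag}>"


def continuation(decoded: str, prompt: str) -> str:
    """Return the text that follows the prompt in a decoded generation.

    Falls back to the full decoded text when it does not start with the
    prompt, which can happen if tokenization does not round-trip exactly.
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()


# Stand-in strings; in practice `decoded` would come from tokenizer.decode(...).
prompt = build_stage2_input("  We apply density functional theory.  ")
decoded = prompt + " placeholder model output"
print(continuation(decoded, prompt))
```

An alternative that avoids string matching is to slice the generated token ids by the prompt length before decoding, for example `outputs[0][inputs["input_ids"].shape[1]:]`.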