Uploaded model

  • Developed by: Marcos Gôlo
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen2.5-32B-Instruct-bnb-4bit

This qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

📄 Model Card: aksw/text2sparql-L

🧠 Model Overview

text2sparql-L is a large fine-tuned language model that translates natural language questions into SPARQL queries, specifically targeting the DBpedia knowledge graph (2014 version). It is intended for knowledge-based QA systems and symbolic reasoning agents.


πŸ” Intended Use

  • Input: Natural language questions (e.g., "Which actors were born in Germany?")
  • Output: A single string containing the corresponding SPARQL query.

🧩 Applications

  • Question Answering systems over open knowledge bases (DBpedia)
  • Semantic conversational agents
  • Knowledge graph exploration tools
  • Autonomous agents with symbolic reasoning capabilities

βš™οΈ Model Details

  • Base model: Qwen2.5 32B (via Unsloth)
  • Training: Dataset with 15,000 question-query examples built by joining 4 datasets:
    • QALD-1
    • LCQUAD-1
    • ParaQA
    • Question-Sparql
  • Target Ontology: DBpedia Ontology (2014)
  • Frameworks: Unsloth, HuggingFace, Transformers
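
The card does not describe how the four datasets were joined. As a minimal sketch, assuming each source is available as a list of (question, query) pairs, one plausible approach is to concatenate the lists and drop repeated questions. The function name `merge_pairs` and the toy data are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (natural-language question, SPARQL query)


def merge_pairs(sources: Iterable[List[Pair]]) -> List[Pair]:
    """Concatenate several question-query lists, keeping the first copy of each question."""
    seen = set()
    merged: List[Pair] = []
    for pairs in sources:
        for question, query in pairs:
            # Normalize case and whitespace so trivially different duplicates collapse.
            key = " ".join(question.lower().split())
            if key not in seen:
                seen.add(key)
                merged.append((question, query))
    return merged


# Toy data, invented for illustration only.
source_a = [("Who wrote Dune?", "SELECT ?a WHERE { res:Dune dbo:author ?a }")]
source_b = [
    ("who wrote  dune?", "SELECT ?x WHERE { res:Dune dbo:author ?x }"),
    ("Where is Leipzig?", "SELECT ?c WHERE { res:Leipzig dbo:country ?c }"),
]
merged = merge_pairs([source_a, source_b])
print(len(merged))
```

Keeping the first copy of a duplicated question is an arbitrary choice here; a real merge might instead prefer the source with the more reliable queries.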

📦 Installation

Make sure to install unsloth, torch and CUDA dependencies:

pip install unsloth torch
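
Because the inference code moves tensors to "cuda", it can help to confirm the setup before loading a 32B model. The following is a small sketch of such a check; the helper name `check_environment` is hypothetical.

```python
import importlib.util


def check_environment() -> dict:
    """Report whether unsloth and torch are importable and whether CUDA is visible."""
    status = {
        "unsloth": importlib.util.find_spec("unsloth") is not None,
        "torch": importlib.util.find_spec("torch") is not None,
        "cuda": False,
    }
    if status["torch"]:
        # Imported lazily so the check itself does not fail when torch is missing.
        import torch
        status["cuda"] = torch.cuda.is_available()
    return status


print(check_environment())
```

If "cuda" is reported as False, the `.to("cuda")` call in the inference example is expected to fail.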

🧪 Example: Inference Code

from unsloth import FastLanguageModel
import torch

class SPARQLQueryGenerator:
    def __init__(self, model_name: str, max_seq_length: int = 2048, load_in_4bit: bool = True):
        self.model, self.tokenizer = FastLanguageModel.from_pretrained(
            model_name=model_name,
            max_seq_length=max_seq_length,
            load_in_4bit=load_in_4bit
        )
        _ = FastLanguageModel.for_inference(self.model)

    def build_prompt(self, question: str) -> list:
        return [
            {"role": "system", "content": (
                "You are an expert data analyst with deep knowledge of SPARQL and the DBpedia ontology.\n"
                "Your task is to convert a given natural language question into a syntactically correct DBpedia SPARQL query "
                "that accurately retrieves the answer.\n"
                "Your output must be a single string containing only the SPARQL query\u2014no additional text, explanation, or commentary.\n"
                "Ensure that you use the appropriate DBpedia prefixes and follow standard SPARQL syntax."
            )},
            {"role": "user", "content": question}
        ]

    def generate_query(self, question: str, temperature: float = 0.01, max_new_tokens: int = 1024) -> str:
        messages = self.build_prompt(question)
        inputs = self.tokenizer.apply_chat_template(
            messages,
            tokenize=True,
            add_generation_prompt=True,
            return_tensors="pt"
        ).to("cuda")

        outputs = self.model.generate(
            input_ids=inputs,
            max_new_tokens=max_new_tokens,
            use_cache=True,
            temperature=temperature,
            min_p=0.1
        )

        decoded = self.tokenizer.batch_decode(outputs)[0]
        return self._extract_sparql(decoded)

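    def _extract_sparql(self, decoded_text: str) -> str:
        start_token = "<|im_start|>assistant\n"
        end_token = "<|im_end|>"
        # Keep the text after the assistant header; fall back to the full text if it is missing.
        start_index = decoded_text.find(start_token)
        sparql = decoded_text[start_index + len(start_token):] if start_index != -1 else decoded_text
        # Cut at the end-of-turn marker. str.rstrip(end_token) would strip a set of
        # characters rather than the suffix and could eat the tail of the query.
        return sparql.split(end_token, 1)[0].strip()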

# --- Using the model ---
if __name__ == "__main__":
    generator = SPARQLQueryGenerator(model_name="aksw/text2sparql-L")
    question = "Which actors were born in Germany?"
    query = generator.generate_query(question)
    print(query)
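
The generated text is not guaranteed to be valid SPARQL, so a cheap sanity check before using it can be useful. The sketch below is a heuristic and not a parser: it only looks for a query form keyword and balanced braces. The name `looks_like_sparql` is hypothetical.

```python
import re

QUERY_FORMS = re.compile(r"\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\b", re.IGNORECASE)


def looks_like_sparql(text: str) -> bool:
    """Cheap heuristic: a query form keyword is present and braces are balanced."""
    if not QUERY_FORMS.search(text):
        return False
    depth = 0
    for char in text:
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth < 0:  # closing brace before its opening brace
                return False
    return depth == 0


print(looks_like_sparql("SELECT ?s WHERE { ?s ?p ?o }"))
print(looks_like_sparql("SELECT ?s WHERE { ?s ?p ?o"))
```

Braces inside string literals would confuse this check; a full SPARQL parser is the safer choice when correctness matters.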

🧠 Example Input / Output

Input:

Which actors were born in Germany?

Output:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX res: <http://dbpedia.org/resource/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?uri WHERE {
    ?uri rdf:type dbo:Actor .
    ?uri dbo:birthPlace res:Germany .
}
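
To get answers, the generated query still has to be sent to a SPARQL endpoint. The following sketch uses only the Python standard library and the SPARQL Protocol over HTTP GET. The endpoint URL is an assumption: the public DBpedia endpoint serves a current release, not the 2014 snapshot this model targets, so results may differ. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia endpoint, which does not serve the 2014 snapshot.
ENDPOINT = "https://dbpedia.org/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})


def run_select(query: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return the list of result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

With network access, `run_select(query)` returns one dictionary per result row, keyed by variable name (here `uri`).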

🧪 Evaluation

The model was evaluated using the F1-score on a hand-crafted dataset for the First Text2SPARQL Challenge, co-located with Text2KG at ESWC 2025.
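
The card does not spell out the scoring procedure. As an illustration only, a common set-based F1 for a single question compares the answers returned by the predicted query with the gold answers. The function name and the empty-set convention below are assumptions, not the challenge's official scorer.

```python
def precision_recall_f1(predicted: set, gold: set) -> tuple:
    """Set-based precision, recall, and F1 for one question."""
    if not predicted and not gold:
        # Convention assumed here: two empty answer sets count as a perfect match.
        return 1.0, 1.0, 1.0
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Two answers returned, one of them correct, one gold answer in total.
print(precision_recall_f1({"res:A", "res:B"}, {"res:A"}))
```

A benchmark-level score would then typically average the per-question values, though the averaging scheme used by the challenge is not stated here.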


📚 Citation

If you use this model in your work, please cite it as:

@misc{text2sparql2025,
  author = {Marcos Gôlo and Paulo do Carmo and Edgard Marx and Ricardo Marcacini},
  title = {text2SPARQL-L: Natural Language Text to SPARQL for DBpedia},
  year = {2025},
  howpublished = {\url{https://huggingface.co/aksw/text2sparql-L}},
}