Llama3-8B-Instruct Fine-tuned for QED (Question-Explanation-Data)
Model Description
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct, adapted for the QED (Question-Explanation-Data) task. It is trained to provide structured explanations for question answering by generating three components in a single structured output: the direct answer, the supporting sentence, and referential entity mappings between the question and that sentence.
Task Overview
The QED task, introduced in "QED: A Framework and Dataset for Explanations in Question Answering" (Lamm et al., 2020), requires models to perform three subtasks (a small worked example follows this list):
- Answer Extraction: Identify the shortest span from a passage that directly answers a given question
- Evidence Selection: Select the single sentence from the passage that best entails or implies the answer
- Referential Mapping: Establish connections between entities mentioned in the question and their corresponding references in the selected sentence
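As a small worked example (constructed here for illustration, not taken from the QED dataset), suppose the question is "who founded microsoft" and the passage contains the sentence "Microsoft was founded by Bill Gates and Paul Allen in 1975." A QED-style explanation would then be:

```json
{
  "answer": "Bill Gates and Paul Allen",
  "selected_sentence": "Microsoft was founded by Bill Gates and Paul Allen in 1975.",
  "referential_equalities": [
    {
      "question_reference": "microsoft",
      "sentence_reference": "Microsoft",
      "bridge": false
    }
  ]
}
```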
Fine-tuning Details
- Base Model: meta-llama/Meta-Llama-3-8B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation) with rank=16, alpha=32 (see the configuration sketch after this list)
- Quantization: 4-bit quantization for memory efficiency
- Training Strategy: Few-shot learning with "random_two" example prompting
- Training Data: Curated subset of QED training examples
- Output Format: Structured JSON containing answer, selected_sentence, and referential_equalities
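A minimal sketch of this LoRA + 4-bit setup, assuming the Hugging Face transformers, peft, and bitsandbytes libraries. The target modules, LoRA dropout, and NF4 quantization type are assumptions; only rank=16, alpha=32, and 4-bit quantization are stated above:

```python
# Sketch of the LoRA + 4-bit configuration described above (peft + bitsandbytes).
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization of the frozen base weights for memory efficiency.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumption: NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute dtype
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters with the stated rank/alpha; target modules and dropout are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```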
Performance Improvements
The fine-tuned model shows significant improvements over the base model on the QED evaluation metrics:
| Metric | Base Model (Zero-shot) | Fine-tuned Model | Improvement (pts) |
|---|---|---|---|
| Exact Match Accuracy | 0.9% | 11.8% | +10.9 |
| Answer Accuracy | 82.0% | 86.4% | +4.4 |
| All Mention F1 | 5.5% | 38.4% | +32.9 |
| Question Mention F1 | 6.0% | 47.6% | +41.6 |
| Context Mention F1 | 5.0% | 29.2% | +24.2 |

All results are reported at a 0.5 F1 overlap threshold with non-strict matching.
Training Code & Methodology
This model was trained using our comprehensive QED fine-tuning framework available on GitHub:
QED Fine-Tuning Framework
Usage
The model expects input in a specific format and outputs structured JSON:
```python
# Input format
prompt = """
Title: [Document Title]
Question: [Your Question]
Passage: [Context Passage]
You are an expert at extracting answers and structured explanations from text.
Your response MUST be **valid JSON only** (no extra commentary).
Task
====
Given:
• a **title** for the passage,
• a **question** about the passage, and
• the **context passage** itself,
produce an explanation object with three parts:
1. "answer" – the **shortest span** from the passage that fully answers the question.
2. "selected_sentence" – the **single sentence** in the passage that entails or implies the answer.
3. "referential_equalities" – a list of mappings between phrases in the question and phrases in the selected sentence
that refer to the **same real-world entity/event**.
• Each mapping has two keys:
- "question_reference": the exact phrase from the question (**must be a contiguous substring from the question,
not from the context or title**).
- "sentence_reference": the exact phrase from the selected sentence (**must be a contiguous substring from the selected sentence,
not from the question or title**), or "" (empty string if the entire sentence is the referent).
▸ Use **""** for "sentence_reference" when the entity/event is not named by any specific phrase in the sentence –
i.e. the entire sentence acts as the referent (a *bridge* to the whole sentence).
This corresponds to the (start = end = -1) convention in the QED dataset.
Output format
=============
Return **only** JSON in this exact schema:
{
"answer": "<string from passage>",
"selected_sentence": "<string from passage>",
"referential_equalities": [
{
"question_reference": "<string from question only>",
"sentence_reference": "<string from selected_sentence only, or "">",
"bridge": "<false if not a bridge; otherwise, a string explaining the bridge connection, e.g., 'in', 'for', 'of', 'at', 'on'>"
}
...
]
}
"""
Expected output format:

```json
{
"answer": "<shortest span from passage>",
"selected_sentence": "<sentence that entails the answer>",
"referential_equalities": [
{
"question_reference": "<entity from question>",
"sentence_reference": "<corresponding entity from sentence>",
"bridge": false
}
]
}
```
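Putting the two formats together, a minimal inference sketch might look like the following. The checkpoint id DenisRz/llama3_8b_instruct_qed is taken from the model tree at the end of this card; the generation settings are assumptions, and if the checkpoint is published as a LoRA adapter rather than merged weights, load it with peft's AutoPeftModelForCausalLM instead:

```python
# Minimal inference sketch: load the fine-tuned model, send the prompt built
# above, and parse the JSON explanation object out of the response.
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DenisRz/llama3_8b_instruct_qed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Wrap the prompt in the Llama-3 chat template (the model is instruction-tuned).
messages = [{"role": "user", "content": prompt}]  # `prompt` as built above
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)

explanation = json.loads(response)  # {"answer": ..., "selected_sentence": ..., "referential_equalities": [...]}
print(explanation["answer"])
```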
Evaluation
Evaluated on the QED development set with official metrics across multiple overlap thresholds (0.5-0.9). The model shows consistent improvements in all measured aspects of the QED task, particularly excelling at entity reference mapping and answer extraction.
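The exact scoring is performed by the official QED evaluation code; purely to illustrate what an "overlap threshold" means here, the sketch below applies a token-level F1 threshold to a predicted versus gold mention span (the function names are mine, not the evaluation script's):

```python
# Illustrative only: token-overlap F1 between a predicted and a gold span,
# thresholded the way the table above describes (e.g. >= 0.5 counts as a match).
from collections import Counter

def overlap_f1(pred_span: str, gold_span: str) -> float:
    pred_tokens = pred_span.lower().split()
    gold_tokens = gold_span.lower().split()
    if not pred_tokens or not gold_tokens:
        return 0.0
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def is_match(pred_span: str, gold_span: str, threshold: float = 0.5) -> bool:
    # Non-strict matching: any overlap F1 at or above the threshold counts.
    return overlap_f1(pred_span, gold_span) >= threshold

print(is_match("the Eiffel Tower", "Eiffel Tower"))  # True at the 0.5 threshold
```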
Training Details
- Dataset: QED training subset with careful example curation
- Learning Rate: 5e-6 with warmup ratio of 0.2
- Batch Size: Effective batch size of 16 through gradient accumulation
- Optimizer: Paged AdamW 8-bit for memory efficiency
- Evaluation: Multi-threshold validation (0.5-0.9 F1 overlap)
- Epochs: 3 (a matching TrainingArguments sketch follows this list)
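A TrainingArguments configuration consistent with these values might look as follows. This is a sketch, not the exact training script; the per-device batch size and gradient accumulation split is an assumption (only the effective batch size of 16 is stated above):

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3_8b_instruct_qed",
    learning_rate=5e-6,
    warmup_ratio=0.2,
    per_device_train_batch_size=2,   # assumption: 2 x 8 accumulation = effective batch of 16
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    optim="paged_adamw_8bit",        # paged AdamW 8-bit for memory efficiency
    logging_steps=10,
    save_strategy="epoch",
)
```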
Applications
This model is particularly suitable for:
- Educational question answering systems requiring explanations
- Research applications needing interpretable QA
- Systems where answer provenance and entity tracking are important
- Building more transparent and accountable AI assistants
Citation
Please cite the original QED work when using this model:
```bibtex
@article{lamm2020qed,
  title={QED: A Framework and Dataset for Explanations in Question Answering},
  author={Lamm, Matthew and Palomaki, Jennimaria and Alberti, Chris and Andor, Daniel and Choi, Eunsol and Baldini Soares, Livio and Collins, Michael},
  journal={arXiv preprint arXiv:2010.13806},
  year={2020}
}
```
Model tree for DenisRz/llama3_8b_instruct_qed
- Base model: meta-llama/Meta-Llama-3-8B-Instruct