Llama3-8B-Instruct Fine-tuned for QED (Question-Explanation-Data)

Model Description

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct specifically adapted for the QED (Question-Explanation-Data) task. The model has been trained to provide structured explanations for question answering by generating three key components simultaneously: direct answers, supporting sentences, and referential entity mappings.

Task Overview

The QED task, introduced in "QED: A Framework and Dataset for Explanations in Question Answering" (Lamm et al., 2020), requires models to:

  1. Answer Extraction: Identify the shortest span from a passage that directly answers a given question
  2. Evidence Selection: Select the single sentence from the passage that best entails or implies the answer
  3. Referential Mapping: Establish connections between entities mentioned in the question and their corresponding references in the selected sentence
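
For concreteness, here is a hypothetical instance, invented for illustration and not drawn from the QED dataset, written in the output schema described under Usage below:

# Hypothetical QED example (illustrative only; not from the dataset)
# Question: "when was the eiffel tower built"
{
  "answer": "between 1887 and 1889",
  "selected_sentence": "The tower was constructed between 1887 and 1889 as the entrance to the 1889 World's Fair.",
  "referential_equalities": [
    {"question_reference": "the eiffel tower", "sentence_reference": "The tower", "bridge": false}
  ]
}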

Fine-tuning Details

  • Base Model: meta-llama/Meta-Llama-3-8B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation) with rank=16, alpha=32
  • Quantization: 4-bit quantization for memory efficiency
  • Training Strategy: Few-shot learning with "random_two" example prompting
  • Training Data: Curated subset of QED training examples
  • Output Format: Structured JSON containing answer, selected_sentence, and referential_equalities
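
As a sketch, the setup above maps onto Hugging Face transformers and peft roughly as follows. Rank, alpha, and 4-bit quantization come from this card; the target modules, dropout, and NF4/bfloat16 settings are assumptions typical of QLoRA-style recipes, not confirmed details of this run.

# Minimal QLoRA-style sketch; see assumptions noted above
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantization (from the card)
    bnb_4bit_quant_type="nf4",              # assumption: common QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,             # rank (from the card)
    lora_alpha=32,    # alpha (from the card)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.05,                                        # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()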

Performance Improvements

Significant improvements over the base model on QED evaluation metrics:

Metric                 Base Model (Zero-shot)   Fine-tuned Model   Improvement
---------------------  -----------------------  -----------------  ------------
Exact Match Accuracy   0.9%                     11.8%              +10.9 pts
Answer Accuracy        82.0%                    86.4%              +4.4 pts
All Mention F1         5.5%                     38.4%              +32.9 pts
Question Mention F1    6.0%                     47.6%              +41.6 pts
Context Mention F1     5.0%                     29.2%              +24.2 pts

All results use the 0.5 F1 overlap threshold with non-strict matching.

Training Code & Methodology

This model was trained with our QED fine-tuning framework, available on GitHub:

πŸ”— QED Fine-Tuning Framework

Usage

The model expects input in a specific format and outputs structured JSON:

# Input format
prompt = """
Title: [Document Title]
Question: [Your Question]
Passage: [Context Passage]

You are an expert at extracting answers and structured explanations from text.
Your response MUST be **valid JSON only** (no extra commentary).

Task
====
Given:
β€’ a **title** for the passage,
β€’ a **question** about the passage, and
β€’ the **context passage** itself,

produce an explanation object with three parts:

1. "answer" – the **shortest span** from the passage that fully answers the question.
2. "selected_sentence" – the **single sentence** in the passage that entails or implies the answer.
3. "referential_equalities" – a list of mappings between phrases in the question and phrases in the selected sentence
   that refer to the **same real-world entity/event**.

   β€’ Each mapping has two keys:
       - "question_reference": the exact phrase from the question (**must be a contiguous substring from the question,
          not from the context or title**).
       - "sentence_reference": the exact phrase from the selected sentence (**must be a contiguous substring from the selected sentence,
          not from the question or title**), or "" (empty string if the entire sentence is the referent).

     β–Έ Use **""** for "sentence_reference" when the entity/event is not named by any specific phrase in the sentence –
       i.e. the entire sentence acts as the referent (a *bridge* to the whole sentence).  
       This corresponds to the (start = end = -1) convention in the QED dataset.

Output format
=============
Return **only** JSON in this exact schema:

{
  "answer": "<string from passage>",
  "selected_sentence": "<string from passage>",
  "referential_equalities": [
    {
      "question_reference": "<string from question only>",
      "sentence_reference": "<string from selected_sentence only, or "">",
      "bridge": "<false if not a bridge; otherwise, a string explaining the bridge connection, e.g., 'in', 'for', 'of', 'at', 'on'>"
    }
    ...
  ]
}
"""

# Expected output format
{
  "answer": "<shortest span from passage>",
  "selected_sentence": "<sentence that entails the answer>",
  "referential_equalities": [
    {
      "question_reference": "<entity from question>",
      "sentence_reference": "<corresponding entity from sentence>",
      "bridge": false
    }
  ]
}
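
A minimal inference sketch follows, assuming this repository is a PEFT adapter applied on top of the base model; the generation settings are illustrative, and the JSON parse assumes the model honors the "valid JSON only" instruction in the prompt.

# Hedged inference sketch; assumptions noted above
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, "DenisRz/llama3_8b_instruct_qed")

prompt = "..."  # the QED prompt shown above, with title/question/passage filled in
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
completion = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

qed = json.loads(completion)  # will raise if the output is not pure JSON
print(qed["answer"], "|", qed["selected_sentence"])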

Evaluation

Evaluated on the QED development set with the official metrics across overlap thresholds from 0.5 to 0.9. The model improves on every measured aspect of the task; as the table above shows, the largest gains are in referential mention F1, with a smaller but consistent gain in answer accuracy.
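
To make the thresholds concrete, the sketch below shows one simplified reading of token-level F1 overlap between a predicted and a gold mention; it is illustrative, not a copy of the official QED scorer.

# Simplified illustration of F1-overlap mention matching (not the official scorer)
def token_f1(pred_tokens, gold_tokens):
    """Bag-of-tokens F1 between a predicted and a gold mention."""
    common = set(pred_tokens) & set(gold_tokens)
    if not common:
        return 0.0
    precision = len(common) / len(pred_tokens)
    recall = len(common) / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def mention_match(pred, gold, threshold=0.5):
    """Non-strict matching: a prediction counts if token F1 meets the threshold."""
    return token_f1(pred.split(), gold.split()) >= threshold

print(mention_match("the eiffel tower", "eiffel tower"))  # True: F1 = 0.8 >= 0.5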

Training Details

  • Dataset: QED training subset with careful example curation
  • Learning Rate: 5e-6 with warmup ratio of 0.2
  • Batch Size: Effective batch size of 16 through gradient accumulation
  • Optimizer: Paged AdamW 8-bit for memory efficiency
  • Evaluation: Multi-threshold validation (0.5-0.9 F1 overlap)
  • Epochs: 3
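
These hyperparameters translate into a transformers TrainingArguments sketch roughly as below; the per-device batch size / gradient-accumulation split and bf16 are assumptions, and only the values listed above come from this card.

# Hypothetical TrainingArguments matching the stated hyperparameters
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-8b-qed-lora",  # hypothetical output path
    learning_rate=5e-6,               # from the card
    warmup_ratio=0.2,                 # from the card
    num_train_epochs=3,               # from the card
    per_device_train_batch_size=2,    # assumption: one split that...
    gradient_accumulation_steps=8,    # ...yields an effective batch of 16
    optim="paged_adamw_8bit",         # paged AdamW 8-bit (from the card)
    bf16=True,                        # assumption
    logging_steps=10,
)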

Applications

This model is particularly suitable for:

  • Educational question answering systems requiring explanations
  • Research applications needing interpretable QA
  • Systems where answer provenance and entity tracking are important
  • Building more transparent and accountable AI assistants

Citation

Please cite the original QED work when using this model:

@article{lamm2020qed,
  title={QED: A Framework and Dataset for Explanations in Question Answering},
  author={Lamm, Matthew and Palomaki, Jennimaria and Alberti, Chris and Andor, Daniel and Choi, Eunsol and Baldini Soares, Livio and Collins, Michael},
  journal={arXiv preprint arXiv:2009.06354},
  year={2020}
}
