# Model Card for yaser-j/llama3.2-3b-unsloth-bnb-ie-adapter

A LoRA adapter built for the ORKG Ask information-extraction use case. It expects a list of properties and a source document to extract them from.

The model supports 13 languages and 4 different language tones.
## Model Details
Fine-tuned using unsloth on a dataset created with larger LLMs such as GPT-4o.
Languages supported are:
- English
- Spanish
- German
- Dutch
- French
- Italian
- Portuguese
- Russian
- Chinese
- Japanese
- Korean
- Arabic
- Farsi
Language tones supported:
- Researcher
- Adult
- Teenager
- Child
## Model Description
The language tones are described as follows:
Child (10–11 years old):
- Simple, short sentences and basic accurate explanations.
- No advanced jargon.
- Everyday examples that tie into the research findings.
Teenager:
- Casual, engaging manner; relevant slang used in moderation.
- Emphasis on the interesting and emotionally engaging aspects of the research findings.
- Relatable explanations, referencing everyday scenarios or pop culture where applicable.
Adult:
- Concise detail with a polished, clear tone.
- Moderate, non-technical vocabulary where possible.
- Essential context and logical flow, focusing on practical applications of research.
Researcher:
- Formal, precise language with clear references to methodologies or data.
- Discipline-specific terminology as needed.
- Balanced, objective presentation of research complexities.
The system prompt of the model is:
```
You will get as an input: a research paper's content and a set of properties/criteria to extract.
You will extract the values corresponding to the list of provided predicates.
You should stick to the schema and description of the properties (if available).
Use the ORKG Ask structured information extraction XML output format.
The extractions must be in the "{language}" language and the complexity of the language should be for a "{tone}".
```
The user prompt should look like this:
```
# Specifications of what to extract:
{properties}
# Extraction source:
{source}
```
Properties look like this:
```json
[
  {"label": "Methods", "desc": "The methods used in the study", "schema": "multiple values of type string"},
  {"label": "TL;DR", "desc": null, "schema": null}
]
```
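For illustration, here is a minimal sketch of how a request could be assembled. `build_messages` is a hypothetical helper (not part of the model's API), and serializing the properties with `json.dumps` is an assumption about the expected rendering:

```python
import json

def build_messages(properties, source, language="English", tone="Adult"):
    """Assemble the chat messages for one extraction request.
    Hypothetical helper; json.dumps(properties) is an assumed serialization."""
    system_prompt = (
        "You will get as an input: a research paper's content and a set of "
        "properties/criteria to extract.\n"
        "You will extract the values corresponding to the list of provided predicates.\n"
        "You should stick to the schema and description of the properties (if available).\n"
        "Use the ORKG Ask structured information extraction XML output format.\n"
        f'The extractions must be in the "{language}" language and the '
        f'complexity of the language should be for a "{tone}".'
    )
    user_prompt = (
        "# Specifications of what to extract:\n"
        f"{json.dumps(properties)}\n"
        "# Extraction source:\n"
        f"{source}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
```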
The model produces its output in the following XML format:
```xml
<extractions>
  <extraction property="Methods">
    <values>
      <value>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</value>
      <value>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</value>
    </values>
    <sources>
      <source>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</source>
      <source>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</source>
    </sources>
  </extraction>
  <extraction property="Conclusions">
    <values>
      <value>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</value>
      <value>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</value>
    </values>
    <sources>
      <source>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</source>
      <source>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</source>
    </sources>
  </extraction>
  <extraction property="Limitations">
    <values></values>
    <sources></sources>
  </extraction>
  <extraction property="TL;DR">
    <values>
      <value>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</value>
    </values>
    <sources></sources>
  </extraction>
</extractions>
```
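Since the output is plain XML, it can be parsed with the standard library. A minimal sketch (`parse_extractions` is a hypothetical helper, and it assumes the model emitted well-formed XML):

```python
import xml.etree.ElementTree as ET

def parse_extractions(xml_text):
    """Parse the ORKG Ask XML output into a dict keyed by property label.
    Assumes well-formed XML; in practice, guard ET.fromstring with try/except."""
    root = ET.fromstring(xml_text)  # root element is <extractions>
    result = {}
    for extraction in root.findall("extraction"):
        result[extraction.get("property")] = {
            "values": [v.text or "" for v in extraction.findall("values/value")],
            "sources": [s.text or "" for s in extraction.findall("sources/source")],
        }
    return result
```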
The model should be used in chat mode; alternatively, apply the chat template (see the tokenizer) and feed the formatted prompt to a standard generation endpoint, as sketched below.
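A minimal generation sketch with transformers and PEFT, assuming the adapter is applied on top of the bf16 instruct base model (an assumption, since the adapter was trained against a bnb-quantized base); `paper_text` is a placeholder and `build_messages` is the hypothetical helper from above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "yaser-j/llama3.2-3b-unsloth-bnb-ie-adapter"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

messages = build_messages(
    properties=[{"label": "TL;DR", "desc": None, "schema": None}],
    source=paper_text,  # placeholder: the paper content to extract from
    language="English",
    tone="Researcher",
)
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```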
## Training Details

### LoRA details
- r=16
- finetune_vision_layers=False
- finetune_language_layers=True
- finetune_attention_modules=True
- finetune_mlp_modules=True
- lora_alpha=32
- lora_dropout=0
- seed=42
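These flags match unsloth's `get_peft_model` signature. A minimal sketch of how they could be wired up, assuming unsloth's `FastModel` API and a 4-bit base checkpoint (the checkpoint name, `max_seq_length`, and the mapping of the card's `seed` to `random_state` are assumptions):

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit",  # assumed base checkpoint
    max_seq_length=4096,  # assumed; not stated in the card
    load_in_4bit=True,
)
model = FastModel.get_peft_model(
    model,
    r=16,
    finetune_vision_layers=False,
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    lora_alpha=32,
    lora_dropout=0,
    random_state=42,  # assumed mapping for the card's seed=42
)
```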
### SFT details
- per_device_train_batch_size=4
- gradient_accumulation_steps=16
- warmup_steps=3
- num_train_epochs=2
- learning_rate=2e-4
- bf16=True
- optim="adamw_8bit"
- weight_decay=0.01
- lr_scheduler_type="linear"
- seed=42
Trained on responses only: the loss is computed on assistant responses, not on the prompts.
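A minimal sketch of the SFT run with trl's `SFTTrainer`, assuming unsloth's `train_on_responses_only` helper was used for the responses-only masking; `dataset` is a placeholder for the chat-formatted training data, and the marker strings assume the Llama 3 chat template:

```python
from trl import SFTConfig, SFTTrainer
from unsloth.chat_templates import train_on_responses_only

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # placeholder: chat-formatted training data
    args=SFTConfig(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=16,
        warmup_steps=3,
        num_train_epochs=2,
        learning_rate=2e-4,
        bf16=True,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=42,
    ),
)
# Mask prompt tokens so the loss covers assistant responses only.
trainer = train_on_responses_only(
    trainer,
    instruction_part="<|start_header_id|>user<|end_header_id|>\n\n",
    response_part="<|start_header_id|>assistant<|end_header_id|>\n\n",
)
trainer.train()
```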
## Model Card Contact
ORKG Ask Team - [email protected]
## Framework versions
- PEFT 0.14.0
## Base model
- meta-llama/Llama-3.2-3B-Instruct