Update README.md

README.md

---

# Model Description
This is a fine-tuned version of the Minerva model, trained on the [Medical Meadow Flashcard Dataset](https://huggingface.co/datasets/medalpaca/medical_meadow_medical_flashcards) for question answering. The base model was developed by the Sapienza NLP Team in collaboration with Future Artificial Intelligence Research (FAIR) and CINECA. Specifically, I used the version with 350 million parameters due to computational limits, though versions with 1 billion and 3 billion parameters also exist. For more details, please refer to their repositories: [Sapienza NLP on Hugging Face](https://huggingface.co/sapienzanlp) and [Minerva LLMs](https://nlp.uniroma1.it/minerva/).

<br>

# Issues and Possible Solutions
- In the original fine-tuned version, the model tended to generate answers that continued unnecessarily, leading to repeated sentences and degrading quality over time. Parameters like *max_length* or *max_new_tokens* were ineffective, as they merely truncated the generation at a specified point without properly concluding the sentence. To address this, I redefined the stopping criteria to terminate generation at the first period ('.'), as demonstrated in the code below:

```python
# ...
inputText = tokenizer.decode(inputEncoding.input_ids[0], skip_special_tokens=True)
answer = outputText[len(inputText):].strip()
```
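
For reference, one way to express such a first-period stopping rule is through the `StoppingCriteria` API in `transformers`. The sketch below is a minimal illustration that reuses the `tokenizer`, `model`, and `inputEncoding` names from the snippet above; the class name `StopOnPeriod` and the parameter values are assumptions, not necessarily the exact code used here:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnPeriod(StoppingCriteria):
    """Stop generation as soon as the newly generated text contains a period."""

    def __init__(self, tokenizer, promptLength):
        self.tokenizer = tokenizer
        self.promptLength = promptLength  # number of prompt tokens to skip when decoding

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode only the tokens generated after the prompt.
        newText = self.tokenizer.decode(
            input_ids[0, self.promptLength:], skip_special_tokens=True
        )
        return "." in newText

# Hypothetical wiring into generate():
# stoppingCriteria = StoppingCriteriaList(
#     [StopOnPeriod(tokenizer, inputEncoding.input_ids.shape[1])]
# )
# outputIds = model.generate(**inputEncoding, max_new_tokens=128,
#                            stopping_criteria=stoppingCriteria)
```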

<br>

# Use Example

```python
# ...
# Generated Answer: Wernicke encephalopathy is caused by a defect in the Wern-Herxheimer reaction, which leads to an accumulation of acid and alkaline phosphatase activity.
# Effective Answer: The underlying pathophysiologic cause of Wernicke encephalopathy is thiamine (B1) deficiency.
```
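
For context, the overall shape of a generation call that produces an answer like the ones above is sketched below. The checkpoint id, question, and generation parameters are placeholders rather than the exact values used in this example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint id; substitute the actual repository path.
modelId = "your-username/minerva-350m-medical-qa"

tokenizer = AutoTokenizer.from_pretrained(modelId)
model = AutoModelForCausalLM.from_pretrained(modelId)

question = "What is the underlying pathophysiologic cause of Wernicke encephalopathy?"
inputEncoding = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputIds = model.generate(
        **inputEncoding,
        max_new_tokens=128,  # placeholder value
        # stopping_criteria=stoppingCriteria,  # the period-based rule from above
    )

# Strip the prompt from the decoded output to keep only the answer.
outputText = tokenizer.decode(outputIds[0], skip_special_tokens=True)
inputText = tokenizer.decode(inputEncoding.input_ids[0], skip_special_tokens=True)
answer = outputText[len(inputText):].strip()
print(answer)
```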

<br>

# Training Information
The model was fine-tuned for 3 epochs using the parameters specified in its original repository:
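
The authoritative values are the ones listed in the Minerva repository; purely as an illustration of the overall setup, a Hugging Face `Trainer` run for 3 epochs looks roughly like the sketch below, where every value marked as a placeholder is an assumption rather than the configuration actually used:

```python
from transformers import Trainer, TrainingArguments

# Illustrative values only; the actual hyperparameters are the ones
# specified in the original Minerva repository.
trainingArgs = TrainingArguments(
    output_dir="./minerva-medical-qa",  # placeholder
    num_train_epochs=3,                 # the 3 epochs mentioned above
    per_device_train_batch_size=8,      # placeholder
    learning_rate=2e-5,                 # placeholder
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,                 # the Minerva checkpoint loaded earlier (assumed)
    args=trainingArgs,
    train_dataset=trainDataset,  # tokenized Medical Meadow flashcards (assumed prepared)
)
trainer.train()
```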