PEFT
Safetensors
cguna committed · verified · Commit 17e2ad8 · 1 Parent(s): 07dcfdf

Fixes to Hallucination detection readme

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -25,7 +25,7 @@ This is a RAG-specific LoRA adapter for [ibm-granite/granite-3.2-8b-instruct](ht
 This is a LoRA adapter that gives the ability to identify hallucination risks for the sentences in the last assistant response in a multi-turn RAG conversation based on a set of provided documents/passages.
 
 > [!TIP]
-> Note: While you can invoke the LoRA adapter directly, as outlined below, we highly recommend calling it through granite-io, which wraps it with a tailored I/O processor. The I/O processor provides a friendlier interface, as it takes care of various data transformations and validation tasks. This includes among others, splitting the input documents and assistant response into sentences before calling the adapter, as well as validating the adapters output and transforming the sentence IDs returned by the adapter into appropriate spans over the documents and the response.
+> Note: While you can invoke the LoRA adapter directly, as outlined below, we highly recommend calling it through granite-io, which wraps it with a tailored I/O processor. The I/O processor provides a friendlier interface, as it takes care of various data transformations and validation tasks. This includes among others, splitting the assistant response into sentences before calling the adapter, as well as validating the adapters output and transforming the sentence IDs returned by the adapter into appropriate spans over the the response.
 
 However, if you prefer to invoke the LoRA adapter directly, the expected input/output is described below.
 
@@ -34,7 +34,7 @@ However, if you prefer to invoke the LoRA adapter directly, the expected input/o
 - **conversation**: A list of conversational turns between the user and the assistant, where each item in the list is a dictionary with fields `role` and `content`. The `role` equals to either `user` or `assistant`, denoting user and assistant turns, respectively, while the `content` field contains the corresponding user/assistant utterance. The conversation should end with an assistant turn and the `text` field of that turn should contain the assistant utterance with each sentence prefixed with a response id of the form `<rI>`, where `I` is an integer. The numbering should start from 0 (for the first sentence) and be incremented by one for each subsequent sentence in the last assistant turn.
 - **documents**: A list of documents, where each item in the list is a dictionary with fields `doc_id` and `text`. The `text` field contains the text of the corresponding document.
 
-Additionally this LoRA adapter is trained with a task instruction, which is encoded as a dictionary with fields `role` and `content`, where `role` equals to `system` and `content` equals to the following string describing the citation generation task: `Split the last assistant response into individual sentences. For each sentence in the last assistant response, identify the faithfulness score range. Ensure that your output includes all response sentence IDs, and for each response sentence ID, provide the corresponding faithfulness score range. The output must be a json structure.`
+Additionally this LoRA adapter is trained with a task instruction, which is encoded as a dictionary with fields `role` and `content`, where `role` equals to `system` and `content` equals to the following string describing the hallucination detection task: `Split the last assistant response into individual sentences. For each sentence in the last assistant response, identify the faithfulness score range. Ensure that your output includes all response sentence IDs, and for each response sentence ID, provide the corresponding faithfulness score range. The output must be a json structure.`
 
 To prompt the LoRA adapter, we combine the above components as follows: We first append the **instruction** to the end of the **conversation** to generate an **input_conversation** list. Then we invoke the `apply_chat_template` function with parameters: conversation = **augmented_conversation** and documents = **documents**.
 
@@ -45,7 +45,7 @@ To prompt the LoRA adapter, we combine the above components as follows: We first
 
 As explained above, it is highly recommended to use the LoRA adapter through granite-io [ADD LINK].
 
-However, if you prefer to invoke the LoRA adapter directly, you can use the following code. Note that the code assumes that the documents and the last assistant response have been already split into sentences.
+However, if you prefer to invoke the LoRA adapter directly, you can use the following code.
 
 ```
 import torch
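
The prompting flow described in the README (append the task instruction to the conversation, then pass the result to `apply_chat_template`) can be sketched as below. This is a minimal illustration only: the example conversation, document text, and `<rI>` sentence prefixes are made-up placeholders, and the final tokenizer call is left as a comment since it requires loading the actual model.

```python
# Sketch of building the adapter input as described in the README diff.
# The user/assistant/document texts here are hypothetical placeholders.

instruction = {
    "role": "system",
    "content": (
        "Split the last assistant response into individual sentences. "
        "For each sentence in the last assistant response, identify the "
        "faithfulness score range. Ensure that your output includes all "
        "response sentence IDs, and for each response sentence ID, provide "
        "the corresponding faithfulness score range. The output must be a "
        "json structure."
    ),
}

conversation = [
    {"role": "user", "content": "What does the passage say about topic X?"},
    # Each sentence of the last assistant turn carries a <rI> id,
    # numbered from 0.
    {"role": "assistant", "content": "<r0> Sentence one. <r1> Sentence two."},
]

documents = [
    {"doc_id": "0", "text": "Placeholder passage used for grounding."},
]

# Append the instruction to the end of the conversation to form the
# input conversation list.
input_conversation = conversation + [instruction]

# Next step (requires the model's tokenizer; not run here):
# prompt = tokenizer.apply_chat_template(
#     conversation=input_conversation, documents=documents,
#     add_generation_prompt=True, tokenize=False)
```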