Update model name

Signed-off-by: Prasoon Varshney <[email protected]>

README.md CHANGED
@@ -15,13 +15,13 @@ library_name: peft
# Model Overview
## Description:

-**Llama-3.1-NemoGuard-8B-Topic-Control** can be used for topical and dialogue moderation of user prompts in human-assistant interactions; it is designed for task-oriented dialogue agents and custom policy-based moderation.
+**Llama Nemotron Topic Guard V1**, formerly known as **Llama-3.1-NemoGuard-8B-Topic-Control**, can be used for topical and dialogue moderation of user prompts in human-assistant interactions; it is designed for task-oriented dialogue agents and custom policy-based moderation.

-Try out the model here: [Llama-3.1-NemoGuard-8B-Topic-Control](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control)
+Try out the model here: [Llama-3.1-Nemotron-Topic-Guard-8B-v1](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control)

Given a system instruction (also called a topical instruction, i.e. one specifying which topics are allowed and disallowed) and a conversation history ending with the last user prompt, the model returns a binary response that flags whether the user message respects the system instruction (i.e. whether the message is on-topic or a distractor/off-topic).

-The base large language model (LLM) is the multilingual [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model from Meta. Llama-3.1-NemoGuard-8B-Topic-Control is LoRA-tuned on a topic-following dataset generated synthetically with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
+The base large language model (LLM) is the multilingual [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model from Meta. Llama Nemotron Topic Guard V1 is LoRA-tuned on a topic-following dataset generated synthetically with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
This model is ready for commercial use. <br>

### License/Terms of Use:
@@ -44,7 +44,7 @@ Related paper:

## Using the Model

-Llama-3.1-NemoGuard-8B-Topic-Control performs input moderation, such as ensuring that the user prompt is consistent with rules specified as part of the system prompt.
+Llama-3.1-Nemotron-Topic-Guard-8B-v1 performs input moderation, such as ensuring that the user prompt is consistent with rules specified as part of the system prompt.

The prompt template consists of two key sections: a system instruction and a conversation history that includes a sequence of user prompts and LLM responses. Typically, the prompt concludes with the current user query.
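For illustration, a minimal sketch of this two-section layout in the standard OpenAI chat message format (the topical instruction text and the conversation turns are hypothetical examples, not taken from the model card):

```python
# Hypothetical example of the prompt layout: a topical system
# instruction followed by the conversation history.
messages = [
    {
        "role": "system",
        "content": (
            "You are a customer support assistant for a telecom company. "
            "Only discuss billing and plan upgrades; do not give legal or medical advice."
        ),
    },
    {"role": "user", "content": "Can you explain the charges on my last bill?"},
    {"role": "assistant", "content": "Your bill includes the monthly plan fee and a one-time activation charge."},
    # The final user turn is the message checked for topical compliance.
    {"role": "user", "content": "By the way, should I sue my landlord over a deposit dispute?"},
]
```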
@@ -103,9 +103,9 @@ The topic control model responds to the final user prompt with a response like `

## Integrating with NeMo Guardrails

-To integrate the topic control model with NeMo Guardrails, you need access to the NVIDIA NIM container for llama-3.1-nemoguard-8b-topic-control. More information about the NIM container can be found [here](https://docs.nvidia.com/nim/llama-3-1-nemoguard-8b-topiccontrol/latest/getting-started.html).
+To integrate the topic control model with NeMo Guardrails, you need access to the NVIDIA NIM container for llama-3.1-nemotron-topic-guard-8b-v1. More information about the NIM container can be found [here](https://docs.nvidia.com/nim/llama-3-1-nemoguard-8b-topiccontrol/latest/getting-started.html).

-NeMo Guardrails uses the LangChain ChatNVIDIA connector to connect to a locally running NIM microservice like llama-3.1-nemoguard-8b-topic-control.
+NeMo Guardrails uses the LangChain ChatNVIDIA connector to connect to a locally running NIM microservice like llama-3.1-nemotron-topic-guard-8b-v1.
The topic control microservice exposes the standard OpenAI interface on the `v1/completions` and `v1/chat/completions` endpoints.

NeMo Guardrails handles building the prompt template and parsing the topic control model's responses, and provides a programmable way to build a chatbot with content safety rails.
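Because the microservice speaks the standard OpenAI API, it can also be queried directly. A minimal sketch, assuming the NIM is serving locally on port 8000 under the model name from this change; the instruction text is hypothetical, and the exact label format of the reply follows the on-topic/off-topic responses described in the card:

```python
from openai import OpenAI

# The NIM microservice exposes an OpenAI-compatible endpoint locally;
# the API key is not checked for a local deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

completion = client.chat.completions.create(
    model="llama-3.1-nemotron-topic-guard-8b-v1",
    messages=[
        # Hypothetical topical instruction and final user turn.
        {"role": "system", "content": "Only discuss billing and plan upgrades."},
        {"role": "user", "content": "Should I sue my landlord over a deposit dispute?"},
    ],
)
# A short label such as "off-topic" is expected (assumed format).
print(completion.choices[0].message.content)
```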
@@ -123,7 +123,7 @@ models:
    engine: nim
    parameters:
      base_url: "http://localhost:8000/v1"
-      model_name: "llama-3.1-nemoguard-8b-topic-control"
+      model_name: "llama-3.1-nemotron-topic-guard-8b-v1"

rails:
  input:
@@ -133,7 +133,7 @@ rails:

- Field `engine` specifies `nim`.
- Field `parameters.base_url` specifies the IP address and port of the ${__product_long_name} host.
-- Field `parameters.model_name` in the Guardrails configuration must match the model name served by the …
+- Field `parameters.model_name` in the Guardrails configuration must match the model name served by the RESTful endpoint.
- The rails definition specifies `topic_control` as the model.

Refer to the [NVIDIA NeMo Guardrails](https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/overview.html) documentation for more information about the configuration file.
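Once the configuration is in place, the rails can be exercised from Python. A minimal sketch using the public NeMo Guardrails API; the `./config` directory path is a placeholder for wherever the config file above lives:

```python
from nemoguardrails import LLMRails, RailsConfig

# Load the Guardrails configuration shown above (path is a placeholder).
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# The input rail checks the user message against the topical instruction
# before the main application LLM ever sees it.
response = rails.generate(messages=[
    {"role": "user", "content": "Should I sue my landlord over a deposit dispute?"}
])
print(response["content"])
```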
@@ -150,7 +150,7 @@ We perform Parameter Efficient FineTuning (PEFT) over the base model using the f

**Training Method:**

-The training method for **Llama-3.1-NemoGuard-8B-Topic-Control** involves the following concepts:
+The training method for **Llama-3.1-Nemotron-Topic-Guard-8B-v1** involves the following concepts:
- A system instruction that acts as a topical instruction, with rules defining the context of the user-assistant interaction, i.e. the topics allowed or disallowed by the current task-oriented scenario, the conversation style and tone, and the conversation flows.
- Any user message in the conversation that respects the topical instruction is considered on-topic, while a user message that contradicts at least one of the rules is a distractor, or off-topic.
- A synthetically generated dataset, called CantTalkAboutThis-Mixtral-1.0, of approximately 1,000 multi-turn conversations is used to instruction-tune the base model. Each conversation has a specific topical instruction from one of several broad domains (e.g. customer support, travel, legal) and contains an entire on-topic conversation, together with several distractor user messages that replace some of the on-topic ones at specific key points in the conversation.
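As a rough illustration of the LoRA-over-PEFT setup described above, a minimal sketch with the Hugging Face peft library; the rank, alpha, dropout, and target modules are hypothetical, since the card's actual hyperparameters are not shown in this diff:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model the adapter is trained on top of.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Hypothetical LoRA hyperparameters; actual values are not given here.
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```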
@@ -228,7 +228,7 @@ off-topic
**Preferred/Supported Operating System(s):** Linux (Ubuntu) <br>

## Model Version(s):
-Llama-3.1-NemoGuard-8B-Topic-Control <br>
+Llama-3.1-Nemotron-Topic-Guard-8B-v1 <br>

# Training, Testing, and Evaluation Datasets: