# Model Card for Sadaf114/InshaSadaf-mental-health-chatbot
This model card provides detailed information about Sadaf114/InshaSadaf-mental-health-chatbot, a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Llama-8B for medical conversational AI tasks.
## Model Details
### Model Description
This model is a conversational AI model created by fine-tuning deepseek-ai/DeepSeek-R1-Distill-Llama-8B on the thu-coai/esconv dataset to generate responses for medical conversational tasks. It is optimized for processing and generating human-like dialogue in a medical context. Fine-tuning uses LoRA (Low-Rank Adaptation), which trains a small set of low-rank adapter weights on top of the frozen base model, keeping training efficient without sacrificing response quality (a sketch of a typical LoRA setup follows the list below).
- Developed by: Insha Sadaf
- Model type: Conversational AI Model for medical applications
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B (a distillation of DeepSeek-R1 into the Llama-8B architecture)
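For context, here is a minimal sketch of how a LoRA fine-tune of this base model is typically configured with the `peft` library. The rank, alpha, and target modules below are illustrative assumptions; the values actually used for this model are not documented in this card.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Hypothetical LoRA configuration: low-rank adapters on the attention
# projections, with the base model weights left frozen.
config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # adapted layers (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```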
## Uses
### Direct Use
The model can be used directly to generate responses in medical conversations: providing information on medical topics, surfacing possible issues raised in a conversation, or aiding decision support in medical fields (see Out-of-Scope Use below for the limits of these uses).
### Downstream Use
The model is also suitable for downstream applications where it is integrated into larger healthcare AI systems, such as virtual medical assistants, patient query responders, or knowledge-based chatbot systems; a minimal integration sketch is shown below.
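This sketch assumes the checkpoint loads as a standalone causal LM via the `transformers` pipeline API; the prompt format shown is an illustrative assumption, not a documented convention of this model.

```python
from transformers import pipeline

# Wrap the model in a text-generation pipeline so a larger system
# (e.g. a virtual assistant backend) can call it as a single function.
generator = pipeline(
    "text-generation",
    model="Sadaf114/InshaSadaf-mental-health-chatbot",
)

reply = generator(
    "Patient: What are the symptoms of diabetes?\nAssistant:",
    max_new_tokens=128,
)[0]["generated_text"]
print(reply)
```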
### Out-of-Scope Use
This model is not designed for medical advice, diagnostic decisions, or critical health interventions. It should not be used as a substitute for professional medical consultation or treatment. Additionally, it may not work well for non-medical conversations or tasks outside of the trained scope.
## Bias, Risks, and Limitations
The model has been fine-tuned with medical conversational data but still carries potential risks, such as the generation of incorrect or misleading medical information. Users should be aware of its limitations and ensure that it is used in an advisory capacity rather than for making medical decisions.
### Recommendations
Users should apply human oversight when using the model to generate medical information, especially in contexts where accuracy and professionalism are paramount.
## How to Get Started with the Model
To get started, load the fine-tuned checkpoint with the `transformers` library. (If the repository ships only LoRA adapter weights, load the base model first and attach the adapter with `peft` instead.)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model = AutoModelForCausalLM.from_pretrained("Sadaf114/InshaSadaf-mental-health-chatbot")
tokenizer = AutoTokenizer.from_pretrained("Sadaf114/InshaSadaf-mental-health-chatbot")

# Generate a response (example)
input_text = "What are the symptoms of diabetes?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)  # pass attention_mask along with input_ids
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
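Because the model is tuned for multi-turn dialogue, inputs are usually best formatted with the tokenizer's chat template rather than passed as raw text. Here is a minimal sketch, assuming the checkpoint ships a chat template (as DeepSeek-R1 distills generally do), reusing `model` and `tokenizer` from above:

```python
messages = [
    {"role": "user", "content": "What are the symptoms of diabetes?"},
]

# Wrap the turn in the model's expected prompt format and generate
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```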