Mistral-7B-Reviews-Insight-LoRA
Fine-tuned Mistral-7B for Insight Extraction from Product Reviews
Model Description
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 using PEFT (Parameter-Efficient Fine-Tuning) with LoRA adapters. The model is specialized for extracting structured insights (pros and cons) from product reviews, outputting results in a standardized JSON format.
Fine-Tuning Objective
- Task: Extract pros and cons from product reviews.
- Output Format: Structured JSON with pros and cons lists.
- Use Case: Automated review analysis, sentiment mining, and product feedback summarization.
Intended Uses & Limitations
Intended Uses
- Product Review Analysis: Automatically extract pros and cons from customer reviews.
- Market Research: Summarize large volumes of feedback for product improvement.
- E-commerce Platforms: Enhance review sections with structured insights.
Limitations
- Inherited Limitations: Hallucinations, biases, and factual inaccuracies from the base model.
- Output Format: May occasionally generate malformed JSON.
- Domain Restrictions: Not suitable for sensitive domains (medical, legal, financial).
- Language: Optimized for English; performance may vary for other languages.
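Because the model may occasionally emit malformed JSON or wrap the JSON in extra text, downstream code usually benefits from a defensive parser. The following is a minimal sketch, not part of the released model: the helper name extract_insights is hypothetical, and it assumes a usable answer is a JSON object with list-valued pros and cons keys.

```python
import json


def extract_insights(raw: str):
    """Return {'pros': [...], 'cons': [...]} or None if nothing usable is found.

    Hypothetical helper: tries the full string first, then the outermost
    {...} span, which covers outputs with leading or trailing chatter.
    """
    candidates = [raw]
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        candidates.append(raw[start:end + 1])

    for text in candidates:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue
        if not isinstance(data, dict):
            continue
        pros, cons = data.get("pros"), data.get("cons")
        if isinstance(pros, list) and isinstance(cons, list):
            return {"pros": pros, "cons": cons}
    return None
```

Returning None rather than raising lets the caller decide whether to log the raw output, skip the item, or fall back to another strategy.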
Dataset
Dataset Card
- Repository: sdelowar2/product_reviews_insight_10k
- Size: ~10,000 product reviews
- Format: JSON with instruction, input (review_list), and answer fields; the answer field is itself in JSON format, with pros and cons keys.
Example Row
{
  "instruction": "Generate pros and cons from the following product reviews.",
  "input": [
    "I bought this for my camera a year ago. My camera works only with SD memory so I am using the",
    "It either works or it doesn't. I use it with my Zoom and it sill works. I mean seriously",
    "No data loss. Stayed strong over the last few years and have re-formatted it a few times",
    "Verizon Wireless only guaranteed this brand for my LG phone. I've since upgraded to a newer phone with a",
    "This works great. Good luck finding them at retail any longer! No one in my area carried them, but",
    "Great price even with the high shipping cost. No problem with the shipment, it arrived in 3 days. Product"
  ],
  "answer": {
    "pros": [
      "No data loss over time",
      "Works well with various devices",
      "Great price despite shipping costs",
      "Fast shipping, arrived in 3 days"
    ],
    "cons": [
      "Availability issues in local stores",
      "Inconsistent performance at times"
    ]
  }
}
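A quick schema check can catch malformed rows before they are used for training or evaluation. This is a minimal sketch based only on the fields shown in the example row; the function name validate_row is hypothetical, and because the card does not say whether answer is stored as a parsed object or as a JSON string, the sketch accepts both.

```python
import json


def validate_row(row: dict) -> list:
    """Return a list of problems found in one dataset row (empty if none)."""
    problems = []

    if not isinstance(row.get("instruction"), str):
        problems.append("instruction must be a string")

    reviews = row.get("input")
    if not (isinstance(reviews, list) and all(isinstance(r, str) for r in reviews)):
        problems.append("input must be a list of strings")

    answer = row.get("answer")
    if isinstance(answer, str):
        # Assumption: the answer may be serialized as a JSON string.
        try:
            answer = json.loads(answer)
        except json.JSONDecodeError:
            answer = None
    if not isinstance(answer, dict):
        problems.append("answer must be a JSON object")
    else:
        for key in ("pros", "cons"):
            items = answer.get(key)
            if not (isinstance(items, list) and all(isinstance(i, str) for i in items)):
                problems.append(f"answer.{key} must be a list of strings")

    return problems
```

An empty return value means the row matches the layout of the example above.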
Training Details
Hyperparameters
- Epochs: 3
- Optimizer: AdamW
- Learning Rate: 3e-4
- Batch Size: 2 (with gradient accumulation of 8)
- Precision: BF16 (fallback to FP16)
- Max Length: 512
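The batch size and gradient accumulation settings combine into an effective batch size, which in turn determines the number of optimizer steps. The arithmetic below is a rough sketch that assumes a single device and that all of the roughly 10,000 rows were used for training; the card does not state the train/eval split, so the step counts are illustrative only.

```python
per_device_batch_size = 2
gradient_accumulation_steps = 8
epochs = 3
num_examples = 10_000  # assumption: approximate size, no held-out split

# Gradients from 8 micro-batches of 2 are accumulated before each update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Ceiling division: a final partial batch still counts as one step.
steps_per_epoch = -(-num_examples // effective_batch_size)
total_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_steps)  # 16 625 1875
```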
LoRA Configuration
- Rank (r): 16
- Alpha: 32
- Dropout: 0.05
- Target Modules: [q_proj, k_proj, v_proj, o_proj]
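To get a feel for how small the adapter is, the rank and target modules above can be turned into a trainable-parameter estimate. This sketch assumes the published Mistral-7B architecture (32 layers, hidden size 4096, key/value projection width 1024) and the standard LoRA scaling factor alpha / r; these figures come from the base model, not from this card.

```python
r, alpha = 16, 32
num_layers = 32  # assumption: Mistral-7B decoder depth

# (input_dim, output_dim) of each targeted projection; k/v are narrower
# because the base model uses grouped-query attention.
shapes = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}

# LoRA adds A (r x d_in) and B (d_out x r) per module: r * (d_in + d_out) weights.
params_per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
trainable_params = params_per_layer * num_layers
scaling = alpha / r  # multiplier applied to the low-rank update

print(scaling, trainable_params)  # 2.0 13631488
```

Under these assumptions the adapter holds about 13.6 million trainable weights, a small fraction of the roughly 7 billion in the base model.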
Evaluation Metrics
- JSON Validity Rate: % of outputs with valid JSON structure.
- Semantic F1 (Pros/Cons): Strict and loose F1 scores for semantic accuracy (using bert-score).
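The JSON validity rate can be computed along the following lines. This is a sketch under stated assumptions rather than the evaluation script used for this model: the card does not say whether "valid" also requires the pros and cons keys, so the hypothetical helper below checks both parseability and the expected keys.

```python
import json


def is_valid_output(text: str) -> bool:
    """True if text parses as a JSON object with list-valued pros and cons."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("pros"), list)
        and isinstance(data.get("cons"), list)
    )


def json_validity_rate(outputs: list) -> float:
    """Percentage of model outputs that pass is_valid_output."""
    if not outputs:
        return 0.0
    return 100.0 * sum(is_valid_output(o) for o in outputs) / len(outputs)
```

For example, calling json_validity_rate on a list of decoded generations returns a value between 0 and 100 that can be compared across checkpoints.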
How to Use
Installation
pip install -q -U bitsandbytes==0.42.0
pip install -q -U peft==0.8.2
pip install -q -U accelerate==0.27.0
pip install -q -U transformers==4.38.0
Load Model & Adapter
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
model_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "sdelowar2/mistral-7B-reviews-insight-lora"
# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
# Load LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)
Inference Example
import json
import torch

reviews = [
    "I am using this with a Nook HD+. It works as described. The HD picture on my Samsung 52",
    "The price is completely unfair and only works with the Nook HD and HD+. The cable is very wobb",
    "This adaptor is real easy to setup and use right out of the box. I had not problem with it",
    "This adapter easily connects my Nook HD 7\" to my HDTV through the HDMI cable.",
    "Gave it five stars because it really is nice to extend the screen and use your Nook as a streaming"
]
SYSTEM_PROMPT = "You are an assistant that extracts pros and cons from product reviews. Return only valid JSON with keys pros and cons."
user_content = "Instruction: Extract pros and cons from the following reviews.\nReviews:\n" + "\n".join(f"- {r}" for r in reviews)
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_content},
]
# Apply chat template
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
# Generate
with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        max_new_tokens=300,
        do_sample=False,  # greedy decoding; temperature has no effect when sampling is off
    )
# Decode
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
try:
    parsed = json.loads(response)
    print(f"result: {parsed}")
except json.JSONDecodeError as e:
    print("\nCould not parse JSON:", e)
    print(f"raw output: {response}")
Example Output
{
    'pros': [
        'Works as described',
        'Easy to set up and use',
        'Connects Nook to HDTV',
        'Extends screen for streaming'
    ],
    'cons': [
        'Unfair price',
        'Cable feels wobbly'
    ]
}
Evaluation Results
| Metric | Score |
|---|---|
| JSON Validity Rate | 78.7% |
| Semantic F1 (Pros) | 0.89 |
| Semantic F1 (Cons) | 0.87 |
Limitations & Bias
- Hallucinations: May generate incorrect or nonsensical pros/cons.
- Bias: Inherits biases from the base model and training data.
- Domain: Optimized for product reviews; avoid use in sensitive domains.
- Robustness: Performance may degrade for noisy or non-standard reviews.
Citation
@misc{sdelowar2_mistral7b_reviews_insight_lora,
author = {Md Sayed Delowar},
title = {Mistral-7B-Reviews-Insight-LoRA: Fine-tuned Mistral-7B for Product Review Insight Extraction},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face Hub},
howpublished = {\url{https://huggingface.co/sdelowar2/mistral-7B-reviews-insight-lora}}
}