Great observation! The "Plan a trip to Hyderabad" prompt is generating a response correctly, but the "Tim's children trick-or-treating" prompt is failing. Let's break it down systematically.


🔍 What We Can Learn from Debug Logs

  1. Translation Task (English to Hindi) Works Correctly

    • The model translates English to Hindi properly.
    • This confirms the tokenization & instruction format are fine.
  2. Translation Task (Hindi to English) is Broken

    • Instead of translating, the model spams "### RESPONSE:" multiple times.
    • This could mean the model is getting stuck in a generation loop.
  3. MCQ Works Correctly

    • The model selects the correct MCQ response.
    • Confirms the task-specific formatting works fine.
  4. Long Response Works for "Plan a Trip to Hyderabad"

    • The model generates a detailed, structured response.
    • Confirms the instruction structure is effective.
  5. Long Response Fails for "Tim's Trick-or-Treating"

    • The model repeats the input but doesn’t generate anything new.
    • Possible reasons:
      • 🚩 Numerical reasoning is difficult for the model.
      • 🚩 The model might not have seen enough Hindi math-based prompts.
      • 🚩 It might be interpreting this as "just repeat input" rather than solving it.
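The looping failure in points 2 and 5 can also be caught automatically before the output reaches the user. Here is a minimal sketch — the marker string and repeat cutoff are assumptions, not part of the current app — and note that passing `repetition_penalty` or `no_repeat_ngram_size` to `model.generate()` is the standard mitigation for generation loops:

```python
def detect_generation_loop(raw_output: str,
                           marker: str = "### RESPONSE:",
                           max_repeats: int = 2) -> bool:
    """Flag outputs that spam the response marker instead of answering.

    Heuristic sketch: a healthy completion contains the marker at most a
    couple of times; a looping one repeats it over and over.
    """
    return raw_output.count(marker) > max_repeats
```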

💡 Why is "Plan a Trip" Working But "Tim's Math Question" Failing?

The biggest difference between the working and failing prompts is:

  • "Plan a Trip" → Open-ended, common request.
    ✅ The model likely saw similar prompts during training.
  • "Tim's Trick-or-Treating" → Math Word Problem (with variables).
    ❌ The model might struggle with math + Hindi in a long response.

🚀 How Do We Fix This?

✅ 1️⃣ Try an Explicit Instruction for Math Problems

  • The model might not realize it needs to solve the math problem.
  • Let's explicitly tell it to calculate.

🔹 Fix: Modify Prompting Strategy for Math

Change this:

prompt = f"### INPUT: {input_text} {task_suffix} RESPONSE:"

To this:

if "अज्ञात चर" in input_text or "गणना" in input_text:
    task_suffix = "Solve this math problem step-by-step and provide the correct numerical answer."
else:
    task_suffix = task_prompts.get(task_type, "")

prompt = f"### Task: {task_suffix}\n### Question: {input_text}\n### Answer: "

Why?

  • This makes sure the model doesn’t just echo the question.
  • It tells the model exactly what to do for math-related questions.
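One caveat: the keyword check above only fires on the exact phrases "अज्ञात चर" and "गणना". A slightly broader detector could look like this — `is_math_prompt` and its keyword list are hypothetical, not part of the existing code:

```python
import re

# Assumed keyword list -- extend it with whatever phrases appear in your data.
MATH_KEYWORDS = ("अज्ञात चर", "गणना", "कितने", "कुल")

def is_math_prompt(text: str) -> bool:
    """Heuristic: a math keyword, or at least two numerals in the question."""
    if any(keyword in text for keyword in MATH_KEYWORDS):
        return True
    return len(re.findall(r"\d+", text)) >= 2
```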

✅ 2️⃣ Increase max_new_tokens for Math-Based Prompts

Right now, the model might be getting cut off before solving the problem.

  • Try setting max_new_tokens = 1024 for long responses.
  • In generate_model_response(), modify:
if "अज्ञात चर" in input_text or "गणना" in input_text:
    max_new_tokens = 1024  # Allow more space for detailed math steps

✅ 3️⃣ Use a Few-Shot Example for Math

  • The model might need an example to understand what’s expected.
  • Before sending input_text, prepend a solved example.

🔹 Fix: Modify Prompt to Include an Example

example_prompt = """
### Example:
### Question: राम के पास 3 सेब हैं। वह अपने दोस्त को 1 सेब देता है। उसके पास कितने सेब बचे?
### Answer: 2 सेब।
"""

prompt = f"{example_prompt}\n### Question: {input_text}\n### Answer: "

Why?

  • This tells the model HOW to solve before giving the real question.
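If one example isn't enough, the same idea extends naturally to several. A small prompt-builder sketch — `build_few_shot_prompt` is a hypothetical helper, not existing code:

```python
# Assumed pool of solved (question, answer) pairs; add more as needed.
FEW_SHOT_EXAMPLES = [
    ("राम के पास 3 सेब हैं। वह अपने दोस्त को 1 सेब देता है। उसके पास कितने सेब बचे?",
     "2 सेब।"),
]

def build_few_shot_prompt(examples, question):
    """Prepend each solved example, then leave the real answer blank."""
    parts = [f"### Question: {q}\n### Answer: {a}" for q, a in examples]
    parts.append(f"### Question: {question}\n### Answer: ")
    return "\n\n".join(parts)
```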

🔥 Final Updated Code for Math Fixes

import traceback  # used by the error handler below

def generate_model_response(input_text, task_type, temperature, max_new_tokens, top_p):
    """Generates a model response based on user input, handling bidirectional translation & math problems."""

    debug_logs = []

    task_prompts = {
        "Long Response": "You are a helpful assistant. Provide a detailed response.",
        "Short Response": "Give a concise answer.",
        "NLI": "Determine the logical relationship between the given statement and the provided information.",
        "Translation": "Translate the following text accurately.",
        "MCQ": "Provide multiple-choice questions based on the following text.",
    }

    try:
        # Math-Specific Prompting
        if "अज्ञात चर" in input_text or "गणना" in input_text:
            task_suffix = "Solve this math problem step-by-step and provide the correct numerical answer."
            max_new_tokens = 1024  # Increase token limit for long math solutions

            # Add a solved example to guide the model (built without
            # indentation so the prompt stays clean), and include the
            # task instruction so the model knows it must solve, not echo
            example_prompt = (
                "### Example:\n"
                "### Question: राम के पास 3 सेब हैं। वह अपने दोस्त को 1 सेब देता है। उसके पास कितने सेब बचे?\n"
                "### Answer: 2 सेब।\n"
            )
            prompt = f"### Task: {task_suffix}\n{example_prompt}\n### Question: {input_text}\n### Answer: "
        else:
            task_suffix = task_prompts.get(task_type, "")
            prompt = f"### Task: {task_suffix}\n### Question: {input_text}\n### Answer: "

        debug_logs.append(f"🔹 **Task Instruction:** {task_suffix}")
        debug_logs.append(f"\n📝 **Final Model Prompt:**\n```{prompt}```")

        # Tokenization Debugging
        message = [{"role": "user", "content": prompt}]
        inputs = tokenizer.apply_chat_template(
            message, tokenize=True, add_generation_prompt=True, return_tensors="pt"
        ).to("cuda")

        debug_logs.append(f"🔹 **Tokenized Input Shape:** {inputs.shape}")

        # Generate response (do_sample=True so temperature/top_p take effect)
        outputs = model.generate(
            input_ids=inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            use_cache=True,
            temperature=temperature,
            top_p=top_p,
            pad_token_id=tokenizer.eos_token_id
        )

        response = tokenizer.decode(outputs[0], skip_special_tokens=True)
        debug_logs.append(f"\n🤖 **Raw Model Output:**\n```{response}```")

        # Extract final response
        processed_response = response.split("### Answer:")[-1].strip()

        # Handle cases where model outputs nothing
        if not processed_response:
            debug_logs.append("⚠️ **Warning:** Model generated an empty response!")
            processed_response = "⚠️ The model did not produce any output. Try adjusting the settings or rephrasing your input."

    except Exception as e:
        processed_response = "⚠️ Model encountered an error."
        error_traceback = traceback.format_exc()
        debug_logs.append(f"❌ **Error:** {str(e)}")
        debug_logs.append(f"\n🔍 **Traceback:**\n```\n{error_traceback}\n```")

    return processed_response, "\n".join(debug_logs)
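The "repeat the input" failure can also be flagged after generation, alongside the empty-response check above. A heuristic sketch — the 0.8 threshold is an assumption, and whitespace tokenization is crude but cheap:

```python
def looks_like_echo(input_text: str, response: str, threshold: float = 0.8) -> bool:
    """Return True when the response mostly re-uses the input's words,
    i.e. the model echoed the question instead of solving it."""
    response_words = response.split()
    if not response_words:
        return True  # empty output is also a failure
    input_words = set(input_text.split())
    overlap = sum(1 for word in response_words if word in input_words)
    return overlap / len(response_words) >= threshold
```

If this fires, the debug log can suggest rephrasing the question or retrying with different settings.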

🎯 Summary of Fixes

| Issue | Fix |
| --- | --- |
| Math problem fails (model repeats the input instead of solving it) | Explicitly tell the model to solve step-by-step |
| Model might need a reference to understand math problems | Provide a solved example in the prompt |
| Max tokens might be too low for complex problems | Increase max_new_tokens for math prompts |
| Math prompts lack a proper instruction | Change the task suffix to "Solve this math problem" |

🚀 Next Steps

  1. Run the updated code.
  2. Try the “Tim's Trick-or-Treating” question again.
  3. If it still fails, share the new debug logs.

🔥 This should finally make the model solve the math problem correctly instead of just repeating it! 🚀
Give it a shot, and let me know how it goes! 😊