Finetuning script for Molmo
#14
by
2U1
- opened
I made a code for fine-tuning Molmo series.
https://github.com/2U1/Molmo-Finetune
However, the model is sort of a preview and has some limitations. It will be updated soon.
For now you can use
- LoRA/QLoRA
- Deepspeed
- Full-finetuning
- Flexibly select module to train
PRs and feedbacks are always welcome!
2U1
changed discussion title from
Molmo-Finetuning script
to Finetuning script for Molmo
Great work, started training to return point locations for given image and lang prompt; just wanted to check something though:
How to format point data?
Based on paper: https://arxiv.org/pdf/2409.17146
user_prompt = (
f"What to do to pick the object at: "
f'<point x="{q_point.point.x}" y="{q_point.point.y}" alt="{q_point.subtask}">{q_point.subtask}</point>?'
)
LLaVa format followed
is this correct?
Well based on the paper, the datas only show the answer that are formatted as the point so, I'm not sure but, the format for the point looks right.