Finetuning script for Molmo

#14

by 2U1 - opened Sep 30, 2024

2U1

Sep 30, 2024

I made a code for fine-tuning Molmo series.
https://github.com/2U1/Molmo-Finetune
However, the model is sort of a preview and has some limitations. It will be updated soon.

For now you can use

LoRA/QLoRA
Deepspeed
Full-finetuning
Flexibly select module to train

PRs and feedbacks are always welcome!

2U1 changed discussion title from Molmo-Finetuning script to Finetuning script for Molmo Sep 30, 2024

jtattershall

22 days ago

Great work, started training to return point locations for given image and lang prompt; just wanted to check something though:

How to format point data?
Based on paper: https://arxiv.org/pdf/2409.17146

user_prompt = (
    f"What to do to pick the object at: "
    f'<point x="{q_point.point.x}" y="{q_point.point.y}" alt="{q_point.subtask}">{q_point.subtask}</point>?'
)

LLaVa format followed

is this correct?

2U1

21 days ago

Well based on the paper, the datas only show the answer that are formatted as the point so, I'm not sure but, the format for the point looks right.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

Your need to confirm your account before you can post a new comment.

· Sign up or log in to comment