Questions on Replicating the Work

#2
by hycanon - opened

Thank you for all the work you've done and for sharing the dataset!
I am trying to replicate your work by training Qwen2.5-7B-Instruct with the training configurations you provided. However, I’ve run into a problem: the trained model does not reach the performance of Diabetica-7B and shows only marginal improvement.
Additionally, I noticed a discrepancy between the number of training samples mentioned in your paper and the number present in the Diabetica-sft dataset.
I would appreciate any help in clarifying this discrepancy and explaining why the sample counts differ.
Thank you again for your support!
