docs: README updated for optimized usage with the transformers library

#60
opened by sayed99

The Python code example for transformers usage has been updated to use flash-attn as the attention implementation, to boost performance and reduce memory usage.
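For reference, a minimal sketch of such a loading call, assuming a placeholder checkpoint id ("model-id") and an AutoModelForCausalLM-style model (adjust the Auto class to the actual model type), with the flash-attn package installed:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "model-id",  # placeholder: substitute the actual checkpoint from this repo
    torch_dtype=torch.bfloat16,  # flash-attn requires fp16 or bf16 weights
    attn_implementation="flash_attention_2",  # requires the flash-attn package
)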

PaddlePaddle org

@sayed99 Great work on this, and thank you for your contribution!

To ensure a smooth out-of-the-box experience for all users, we think it’s better to make flash-attn optional instead of the default. To save you some time, I’ll go ahead and push a small commit to this PR to make that change.
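One possible way to make it optional (a sketch, using the same placeholder checkpoint id) is to fall back to PyTorch's built-in scaled-dot-product attention when the flash-attn package is not installed:

import importlib.util

import torch
from transformers import AutoModelForCausalLM

# Use flash-attn only when the package is available; otherwise fall back to "sdpa".
attn = "flash_attention_2" if importlib.util.find_spec("flash_attn") else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    "model-id",  # placeholder checkpoint id
    torch_dtype=torch.bfloat16,
    attn_implementation=attn,
)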

Also, I suggest removing these two steps. They don’t seem necessary for a code example, and dropping them would simplify it (a simpler local alternative is sketched after the snippet below).

from google.colab import files
...
# 2. Upload image (drag & drop any PNG/JPG)
...
# 3. Resize max-2048 preserving aspect ratio
...
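If those steps are removed, a simpler local alternative could look like the following (a sketch, assuming PIL and a hypothetical local file name):

from PIL import Image

# Hypothetical local file; replaces the Colab-specific upload step.
image = Image.open("example.png").convert("RGB")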

Thanks again for the excellent work!

@xiaohei66
Thank you for the suggestions and for helping improve the PR! I agree that making flash-attn optional and removing the extra Colab steps will simplify the example and make it more user-friendly. I appreciate your help in pushing the small commit and look forward to reviewing the changes.

@xiaohei66
Hello, thanks for your efforts!
Will this PR be merged into the main model card automatically soon?

Ready to merge
This branch is ready to get merged automatically.
