How to eval the video/image sequences?
#69
by
lkllkl
- opened
Since the model can eval the single image perfectly, I wonder how to eval the video/image sequences. I change the message where the type is video and its not working.
Good day. As a starting point:
- Split the video into frames (e.g. jpg pictures) in an appropriate format.
- Pass each frame sequentially to the model with the corresponding prompt (iteratively). Save the result (can be in a list or dataframe).
Great discussion, once you have the video into frames, is there a way to process a batch of images together OR we can only process 1 image at a time?
@vibhu
There's a pretty good example in this discussion. https://huggingface.co/google/gemma-3-27b-it/discussions/73
Examine the link to Google Collab