PHI4

by mans0987 - opened about 17 hours ago

Discussion

mans0987

about 17 hours ago

How does this model compare with PHI4 multi-modal?

jw2yang

Microsoft org about 17 hours ago

Interesting question. They were developed with different objectives, so it is hard to make an apples-to-apples comparison.

mans0987

about 16 hours ago

Can you please elaborate? When should I use Phi4 and when should I use this model (assuming that I am only interested in text and vision, and not audio, which Phi4 has but this model doesn't)?

jw2yang

Microsoft org about 16 hours ago

If you are only interested in text and vision, Magma is good at spatial understanding and reasoning over multimodal inputs, while Phi4 is better at reading text in images, based on my rough look.

mans0987

about 16 hours ago

I am looking to detect whether an image of a house shows the entrance to that house. What is your suggestion?

jw2yang

Microsoft org about 7 hours ago

I think you can try both for this task.
