flexicious (Shon Fernandez)

From developing LLM applications over the past couple of years, I've realized that regardless of the hype, nothing beats testing LLMs on your own specific use cases with your own evaluation metrics. For example, I compared O3-mini, R1, and Gemini Flash Thinking (https://www.youtube.com/watch?v=iBS_FsLcSN0) and found that for certain use cases they are no better than regular non-reasoning models. I am very curious to learn what people are using reasoning models for and how they are evaluating them!
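
To make "your own use cases, your own metrics" concrete, here is a minimal sketch of a comparison harness. Everything in it is an assumption for illustration: the test cases, the `contains_match` metric, and the two stub functions standing in for real model calls are hypothetical, and this is not the setup used in the video.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical test cases: (prompt, expected answer) pairs from your own use case.
CASES: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
]


def contains_match(output: str, expected: str) -> float:
    """Score 1.0 if the expected answer appears in the output (case-insensitive)."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def evaluate(
    models: Dict[str, Callable[[str], str]],
    cases: List[Tuple[str, str]],
    metric: Callable[[str, str], float],
) -> Dict[str, float]:
    """Run every model on every case and return the mean metric score per model."""
    scores: Dict[str, float] = {}
    for name, generate in models.items():
        total = sum(metric(generate(prompt), expected) for prompt, expected in cases)
        scores[name] = total / len(cases)
    return scores


# Stubs standing in for real API calls; replace with your own client code.
def stub_reasoning_model(prompt: str) -> str:
    return "The answer is 4." if "2 + 2" in prompt else "Paris"


def stub_baseline_model(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "I am not sure."


if __name__ == "__main__":
    results = evaluate(
        {"reasoning": stub_reasoning_model, "baseline": stub_baseline_model},
        CASES,
        contains_match,
    )
    for name, score in results.items():
        print(f"{name}: {score:.2f}")
```

Swapping the stubs for real model clients and the metric for one that reflects what matters in your application is the whole point; the harness itself stays the same.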
