10 63 154

Simeon Emanuilov PRO

s-emanuilov

https://unfoldai.com/

AI & ML interests

Software Engineer, PhD | Building production ML/DL systems and AI tools

Recent Activity

liked a model about 2 months ago

microsoft/Fara-7B

posted an update about 2 months ago

Converted PaddleOCR models to ONNX for easier deployment and faster inference. These have been working well in production at Monkt.com, so figured I'd share them with the community. Just straight conversions of the original models—might save you some time if you're building OCR pipelines. https://huggingface.co/monkt/paddleocr-onnx

commented on a paper about 2 months ago

Stemming Hallucination in Language Models Using a Licensing Oracle

View all activity

Organizations

Posts 7

Post

322

Converted PaddleOCR models to ONNX for easier deployment and faster inference.

These have been working well in production at Monkt.com, so figured I'd share them with the community.

Just straight conversions of the original models—might save you some time if you're building OCR pipelines.

monkt/paddleocr-onnx

Post

321

Ran MTEB evaluation on Bulgarian tasks comparing EmbeddingGemma-300M ( google/embeddinggemma-300m) vs Multilingual-E5-Large ( intfloat/multilingual-e5-large)

EmbeddingGemma-300M scored 71.6% average while E5-Large got 75.9%. Pretty solid results for EmbeddingGemma considering it's half the size and uses way less resources.

EmbeddingGemma actually beats E5-Large on sentiment analysis and natural language inference. E5-Large wins on retrieval and bitext mining tasks.

The 300M model has 4x longer context window (2048 vs 512 tokens) and lower carbon footprint which is good.

Both models work great for Bulgarian but have different strengths depending what you need.

Blog article about the usage: https://huggingface.co/blog/embeddinggemma

PS: Don't forget to use the recommended libraries versions :D

pip install git+https://github.com/huggingface/transformers@v4.56.0-Embedding-Gemma-preview
pip install sentence-transformers>=5.0.0

View all Posts

Collections 9

View 9 collections

Papers 6

models 11

datasets 6

s-emanuilov/rivers-knowledge-base

Viewer • Updated Nov 8, 2025 • 17.7k • 3

s-emanuilov/rivers-knowledge-graph

Updated Nov 8, 2025 • 16

s-emanuilov/rivers-qa

Viewer • Updated Nov 6, 2025 • 17.7k • 24

s-emanuilov/rivers-evaluation-results

Viewer • Updated Nov 6, 2025 • 17.7k • 5

s-emanuilov/query-expansion

Viewer • Updated Dec 25, 2024 • 5.73k • 104 • 3

s-emanuilov/coco-clip-vit-l-14

Viewer • Updated Mar 29, 2024 • 123k • 20 • 2