Alexey Dertev

d8rt8v

AI & ML interests

None yet

Organizations

None yet

upvoted a collection 5 days ago

GLM-4.6

Collection

5 items • Updated 11 days ago • 21

liked a model 23 days ago

google/embeddinggemma-300m

reacted to prithivMLmods's post with ❤️ 26 days ago

Post

7169

Introducing Gliese-OCR-7B-Post1.0, a document content-structure retrieval VLM designed for content extraction (OCR) and summarization. This is the third model in the Camel Doc OCR VLM series, following Camel-Doc-OCR-062825. The new version fixes formal table reconstruction issues in both En and Zh and is tuned for long-context inference. It also shows significant improvements in LaTeX and Markdown rendering for OCR tasks.

🤗 Gliese-OCR-7B-Post1.0 : https://huggingface.co/prithivMLmods/Gliese-OCR-7B-Post1.0
📌 Gliese-Post1.0 Collection : https://huggingface.co/collections/prithivMLmods/gliese-post10-68c52c4a6ca4935f5259a6d7
⬅️ Previous Versions : https://huggingface.co/prithivMLmods/Camel-Doc-OCR-062825
🧨 Gliese-OCR-7B-Post1.0 (4-bit) Notebook Demo on T4 : https://huggingface.co/prithivMLmods/Gliese-OCR-7B-Post1.0/blob/main/Gliese-OCR-7B-Post1.0(4-bit)-reportlab/Gliese_OCR_7B_Post1_0(4_bit)_reportlab.ipynb
📖 GitHub [Gliese-OCR-7B-Post1.0(4-bit)-reportlab] : https://tinyurl.com/ys7zuerc

Other Collections:

➔ Multimodal Implementations : https://huggingface.co/collections/prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
➔ Multimodal VLMs - Aug'25 : https://huggingface.co/collections/prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
➔ Multimodal VLMs - July'25 : https://huggingface.co/collections/prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027

.
.
.
To learn more, visit the app page or the respective model page.

2 replies

New activity in cpatonn/Qwen3-Next-80B-A3B-Instruct-AWQ-4bit 29 days ago

Error when running in VLLM

👍 2

#1 opened 29 days ago by d8rt8v

liked a model 29 days ago

Qwen/Qwen3-Next-80B-A3B-Instruct

Text Generation • 81B • Updated 24 days ago • 5.12M • 810

liked 2 models about 2 months ago

nvidia/parakeet-tdt-0.6b-v3

Automatic Speech Recognition • Updated 23 days ago • 197k • 341

janhq/Jan-v1-4B

Text Generation • 4B • Updated Aug 23 • 3.81k • 342

liked a Space 6 months ago

788

Qwen3 Demo

📊

Generate responses to text prompts in a chat interface

liked 2 models 6 months ago

OuteAI/Llama-OuteTTS-1.0-1B-GGUF

Text-to-Speech • 1B • Updated Jul 10 • 30.3k • 83

bartowski/deepcogito_cogito-v1-preview-qwen-32B-GGUF

Text Generation • 33B • Updated Apr 9 • 5.74k • 13

reacted to onekq's post with 😎 6 months ago

Post

2589

We desperately need GPUs for model inference. A CPU can't replace a GPU.

I will start with the basics. A GPU is designed to serve predictable workloads made up of many parallel units (pixels, tensors, tokens). So a GPU allocates as much of its transistor budget as possible to building thousands of compute units (CUDA cores in NVIDIA GPUs or execution units in Apple Silicon), each capable of running a thread.

A CPU, by contrast, is designed to handle all kinds of workloads. CPU cores are much larger (and hence far fewer), with branch prediction and other complex machinery. In addition, more and more transistors go to building larger caches (~50% now) to accommodate unpredictable workloads, which eats into the compute budget.

Generalists can't beat specialists.
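
The "predictable workloads with many parallel units" idea in the post can be illustrated with a small sketch. It is not from the post and it is not GPU code: it only shows, using Python's standard library, that when each element's result depends on that element alone, the work can be cut into chunks and handed to independent workers without changing the answer. The function names (`scale_and_shift`, `run_serial`, `run_chunked`) and the worker count are hypothetical, chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_shift(x: float) -> float:
    """Per-element work: the result depends only on this element's input."""
    return 2.0 * x + 1.0


def run_serial(values):
    """Process every element one after another, the way a single core would."""
    return [scale_and_shift(v) for v in values]


def run_chunked(values, workers=4):
    """Split the input into contiguous chunks and process each chunk in its own worker.

    Because no element depends on any other, the chunks can run in any order
    and the concatenated result matches the serial one.
    """
    chunk = max(1, (len(values) + workers - 1) // workers)  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the input chunks.
        results = list(pool.map(run_serial, chunks))
    return [y for part in results for y in part]


values = [float(i) for i in range(10)]
assert run_chunked(values) == run_serial(values)
```

Python threads will not make this particular loop faster, so the sketch says nothing about performance. It only demonstrates the independence property that lets a GPU spread the same operation across thousands of compute units.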