Optimizing Input for Faster Rendering with NanoNet

#34
by FastaLaPasta - opened

Hi, I'm wondering if there are ways to optimize the input I send to the model in order to achieve faster rendering times. Are there any recommended preprocessing steps, input formats, or parameters (like resolution, token length, or batch size) that could help improve inference speed?
thanks :)

Nanonets org

Hello! I'm assuming you are using vLLM for deployment.

  1. Generally, increasing the resolution increases both processing time and accuracy, but I have seen accuracy drop once the width goes above 3000 pixels. You can also decrease the resolution if your document is not complex or does not contain dense text (see the resizing sketch after this list).
  2. Increasing the batch size improves the overall throughput of the model but does not reduce the processing time of a single request. If you are using a large GPU, you should try to use as large a batch size as possible (see the vLLM sketch after this list).
  3. By token length I assume you mean generation length? Dense text takes more time because these models are auto-regressive (they generate one token at a time). If you want to convert your document fully to markdown, there is not much you can do here: dense documents will simply take longer than documents with less text.
  4. Using a better GPU will make the computation faster. Or you can try https://docstrange.nanonets.com; we give free processing of 10k docs per month.
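
For point 1, here is a minimal preprocessing sketch, assuming your inputs are page images and that capping the width somewhere below 3000 px (the exact value depends on how dense your documents are) is acceptable for your accuracy needs:

```python
from PIL import Image

def cap_width(path: str, max_width: int = 2000) -> Image.Image:
    """Downscale a page image so its width never exceeds max_width.

    Smaller inputs mean fewer vision tokens and faster prefill; pick the
    largest width at which accuracy is still acceptable for your documents.
    """
    img = Image.open(path).convert("RGB")
    if img.width > max_width:
        new_height = int(img.height * max_width / img.width)
        img = img.resize((max_width, new_height), Image.LANCZOS)
    return img
```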
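
For points 2 and 3, a rough sketch using vLLM's offline API: passing many page requests to a single `generate()` call lets vLLM batch them internally for throughput, and `max_tokens` caps the generation length. The model id and prompt string below are placeholders, so swap in the checkpoint you actually deploy and its expected chat/image-placeholder format:

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Placeholder model id; replace with the checkpoint you actually serve.
llm = LLM(model="nanonets/your-ocr-model", max_model_len=8192)

# Cap generation length: output tokens dominate latency on dense pages.
params = SamplingParams(temperature=0.0, max_tokens=4096)

# One request per page image; vLLM batches these internally for throughput.
pages = ["page1.png", "page2.png", "page3.png"]
requests = [
    {
        "prompt": "Convert this page to markdown:",  # placeholder prompt format
        "multi_modal_data": {"image": Image.open(p).convert("RGB")},
    }
    for p in pages
]

outputs = llm.generate(requests, params)
for out in outputs:
    print(out.outputs[0].text)
```

Note that batching this way improves pages-per-second, not the latency of any single page, which matches point 2 above.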
