Using Jina models in a RAG stack
Hi all.
I'm very interested in the Jina Embeddings v4 model because my company needs to develop RAG applications over our own knowledge base, which consists of text documents and "scanned" PDFs.
Today our stack is unstructured.io (ingestion + chunking + embedding) -> Astra DB (vector DB) -> Langflow (retrieval) -> Python chatbot.
I understand that Jina Embeddings is just an embedding model, but what else would I need in order to produce the chunks stored in the vector DB? Those chunks seem useful (almost mandatory) in the retrieval phase (well, not retrieval exactly, but for answering the user prompt with the aid of an LLM).
If I need an external OCR tool to produce the chunks anyway, I can't see the added value of Jina Embeddings beyond something close to very generic document search (like the file search in MS Windows, I mean).
Can you help me better understand, please?
Thank you
Hi @giovanni-elcam ,
One unique feature of jina-embeddings-v4 is that it can search through document pages directly, without the need for OCR. You can encode a screenshot of each document page (this returns one embedding per page) and then find the most relevant page by comparing the page embeddings to a query embedding. This is useful when you want to exploit visual features of the document, such as plots and illustrations. Afterwards, if your RAG system uses a vision-capable LLM (VLM), you can pass the relevant screenshot to it directly to get an answer; otherwise, you can run OCR on the relevant page only and pass it in textual form. If this approach doesn't fit your use case, you can also use jina-embeddings-v4 as a standard text embedding model, just as you do with other models.
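To make the retrieval step concrete: once you have one embedding per page screenshot and one for the query, finding the most relevant page is just a cosine-similarity nearest-neighbour search. A minimal sketch of that step, where the random vectors are placeholders for real jina-embeddings-v4 outputs (the function name and the assumed 2048-dim embedding size are illustrative, not part of any official API):

```python
import numpy as np

def top_page(query_emb: np.ndarray, page_embs: np.ndarray) -> int:
    """Return the index of the page whose embedding is most similar to the query."""
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    p = page_embs / np.linalg.norm(page_embs, axis=1, keepdims=True)
    return int(np.argmax(p @ q))

# Placeholder vectors standing in for real jina-embeddings-v4 embeddings:
# in practice, encode a screenshot of each PDF page and encode the text query.
rng = np.random.default_rng(0)
page_embs = rng.normal(size=(5, 2048))                   # 5 pages, assumed 2048-dim
query_emb = page_embs[3] + 0.1 * rng.normal(size=2048)   # query "close to" page 3

best = top_page(query_emb, page_embs)  # index of the most relevant page
```

In a real pipeline you would then either pass the screenshot of page `best` to a VLM, or OCR just that page and pass the text to a regular LLM.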
Hope this helps.