fdurant/colbert-xm-for-inference-api



fdurant committed on Jun 13, 2024

Commit acc9003 · 1 Parent(s): 4c96de6

add ADDITIONAL_README.md

Files changed (1)

ADDITIONAL_README.md +43 -0


	@@ -0,0 +1,43 @@

+# Multilingual ColBERT embeddings as a service
+## Goal
+- Deploy [Antoine Louis](https://huggingface.co/antoinelouis)' [colbert-xm](https://huggingface.co/antoinelouis/colbert-xm) as an inference service: text(s) in, vector(s) out
+## Motivation
+- Use the service in a broader RAG solution
+## Steps followed
+- Clone the original repo following [this procedure](https://huggingface.co/docs/hub/repositories-next-steps#how-to-duplicate-or-fork-a-repo-including-lfs-pointers)
+- Add a custom handler script as described [here](https://huggingface.co/docs/inference-endpoints/guides/custom_handler)
+## Local development and testing
+### Build and start the Docker container `hf_endpoints_emulator`
+See [hf_endpoints_emulator](https://pypi.org/project/hf-endpoints-emulator/)
+```bash
+docker-compose up -d --build
+```
+Startup can take a few moments because the model is large (> 3 GB).
+### How to test locally
+```bash
+./embed_single_query.sh
+./embed_two_chunks.sh
+```
+Run the test suite inside the running container:
+```bash
+docker-compose exec hf_endpoints_emulator pytest
+```
+### Check output
+```bash
+docker-compose logs --follow hf_endpoints_emulator
+```
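
For readers who have not written a custom handler before, here is a minimal Python sketch of the shape that the custom handler guide linked under "Steps followed" expects: a class named `EndpointHandler` with `__init__(self, path)` and `__call__(self, data)`. It is not the handler committed in this repository. `_placeholder_encode` is a hypothetical stand-in for the real colbert-xm encoder, and its output (one short dummy vector per text) is an assumption that does not reflect the real model's output shape or dimensions.

```python
from typing import Any, Dict, List


def _placeholder_encode(texts: List[str], dim: int = 4) -> List[List[float]]:
    """Hypothetical stand-in for the real encoder (assumption, not colbert-xm).

    Returns one dummy vector per text, filled with the text's length.
    """
    return [[float(len(text))] * dim for text in texts]


class EndpointHandler:
    def __init__(self, path: str = "") -> None:
        # `path` is the directory holding the model files; a real handler
        # would load the model from it here. This sketch only stores it.
        self.path = path
        self.encode = _placeholder_encode

    def __call__(self, data: Dict[str, Any]) -> List[List[float]]:
        # Requests carry their payload under "inputs": a single string
        # or a list of strings ("text(s) in").
        inputs = data.get("inputs", data)
        if isinstance(inputs, str):
            inputs = [inputs]
        # One vector per input text ("vector(s) out").
        return self.encode(inputs)


if __name__ == "__main__":
    handler = EndpointHandler()
    print(handler({"inputs": ["hola", "bonjour"]}))
```

Accepting both a string and a list keeps a single code path for the single-query and multi-chunk cases that the two test scripts exercise.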