tasal9 committed · Commit 955f0af · verified · 1 Parent(s): da34a96

Update README.md

Files changed (1)
  1. README.md +45 -1
README.md CHANGED
@@ -29,4 +29,48 @@ This directory contains tools and utilities for working with multilingual embedd
  - **Vector Database**: ChromaDB
  - **Integration Framework**: LlamaIndex
 
- ## Directory Structure
+ ## Directory Structure
+
+ embeddings/
+ ├── setup.py # Setup script for the embeddings model and vector store
+ ├── demo.py # Demo application with Gradio web UI
+ ├── indexer.py # Utility for indexing new documents
+ ├── requirements.txt # Dependencies for the embeddings components
+ └── chroma_db/ # Directory for the vector database (created on first run)
+
+ Getting Started
+
+ 1. Install the dependencies:
+
+ pip install -r models/embeddings/requirements.txt
+
+
+ 2. Add documents to index:
+
+ # Place your text files in the data/text_corpus directory
+ python models/embeddings/indexer.py --corpus data/text_corpus/
+
+
+ 3. Run the demo application:
+
+ python models/embeddings/demo.py
+
+
+
+ Using the Embeddings in Your Code
61
+
62
+ from models.embeddings.setup import setup_embedding_model
63
+
64
+ # Initialize the model and related components
65
+ embedding_components = setup_embedding_model()
66
+
67
+ # Get the query engine
68
+ query_engine = embedding_components["query_engine"]
69
+
70
+ # Query in any language
71
+ result = query_engine.query("What is the capital of Afghanistan?")
72
+ # Or in Pashto
73
+ result = query_engine.query("د افغانستان پلازمېنه څه ده؟")
74
+
75
+ print(result)
76
+
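
Note on the usage snippet added above: it overwrites `result` with the second query, so only the Pashto answer is printed. The sketch below shows one way to keep every answer. It is not part of this commit. `run_queries` and `EchoEngine` are hypothetical names introduced here for illustration, and the only assumption about the real engine is what the snippet already shows: a `query(str)` method whose result can be printed.

```python
from typing import Any, Dict, Iterable


def run_queries(query_engine: Any, questions: Iterable[str]) -> Dict[str, str]:
    """Send each question to the engine and collect the answers as text.

    Assumes only that `query_engine` has a `query(str)` method and that
    the returned object can be converted with `str()`.
    """
    answers: Dict[str, str] = {}
    for question in questions:
        result = query_engine.query(question)
        answers[question] = str(result)
    return answers


class EchoEngine:
    """Hypothetical stand-in for the real query engine, for dry runs only."""

    def query(self, question: str) -> str:
        return f"(no index loaded) {question}"


if __name__ == "__main__":
    questions = [
        "What is the capital of Afghanistan?",
        "د افغانستان پلازمېنه څه ده؟",
    ]
    # In the repository, pass embedding_components["query_engine"] here
    # instead of the EchoEngine placeholder.
    for question, answer in run_queries(EchoEngine(), questions).items():
        print(question, "->", answer)
```

Keeping the answers keyed by question makes it easier to compare how the same question is answered across languages.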